{"date_created":"2024-01-05T12:38:42Z","external_id":{"arxiv":["2309.17207"]},"_id":"50221","status":"public","year":"2023","author":[{"last_name":"Pleines","full_name":"Pleines, Marco","first_name":"Marco"},{"last_name":"Pallasch","full_name":"Pallasch, Matthias","first_name":"Matthias"},{"full_name":"Zimmer, Frank","last_name":"Zimmer","first_name":"Frank"},{"first_name":"Mike","last_name":"Preuss","full_name":"Preuss, Mike"}],"user_id":"67287","type":"preprint","citation":{"mla":"Pleines, Marco, et al. “Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents.” ArXiv:2309.17207, 2023.","chicago":"Pleines, Marco, Matthias Pallasch, Frank Zimmer, and Mike Preuss. “Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents.” ArXiv:2309.17207, 2023.","apa":"Pleines, M., Pallasch, M., Zimmer, F., & Preuss, M. (2023). Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents. In arXiv:2309.17207.","short":"M. Pleines, M. Pallasch, F. Zimmer, M. Preuss, ArXiv:2309.17207 (2023).","bibtex":"@article{Pleines_Pallasch_Zimmer_Preuss_2023, title={Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents}, journal={arXiv:2309.17207}, author={Pleines, Marco and Pallasch, Matthias and Zimmer, Frank and Preuss, Mike}, year={2023} }","ieee":"M. Pleines, M. Pallasch, F. Zimmer, and M. Preuss, “Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents,” arXiv:2309.17207. 2023.","ama":"Pleines M, Pallasch M, Zimmer F, Preuss M. Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents. arXiv:230917207. Published online 2023."},"project":[{"name":"PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing","_id":"52"}],"department":[{"_id":"27"}],"language":[{"iso":"eng"}],"date_updated":"2024-01-05T12:39:50Z","title":"Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents","publication":"arXiv:2309.17207","abstract":[{"lang":"eng","text":"Memory Gym presents a suite of 2D partially observable environments, namely Mortar Mayhem, Mystery Path, and Searing Spotlights, designed to benchmark memory capabilities in decision-making agents. These environments, originally with finite tasks, are expanded into innovative, endless formats, mirroring the escalating challenges of cumulative memory games such as “I packed my bag”. This progression in task design shifts the focus from merely assessing sample efficiency to also probing the levels of memory effectiveness in dynamic, prolonged scenarios. To address the gap in available memory-based Deep Reinforcement Learning baselines, we introduce an implementation that integrates Transformer-XL (TrXL) with Proximal Policy Optimization. This approach utilizes TrXL as a form of episodic memory, employing a sliding window technique. Our comparative study between the Gated Recurrent Unit (GRU) and TrXL reveals varied performances across different settings. TrXL, on the finite environments, demonstrates superior sample efficiency in Mystery Path and outperforms in Mortar Mayhem. However, GRU is more efficient on Searing Spotlights. Most notably, in all endless tasks, GRU makes a remarkable resurgence, consistently outperforming TrXL by significant margins. Website and Source Code: https://github.com/MarcoMeter/endless-memory-gym/"}]}