Remember what you did so you know what to do next

Manuel Ciosici, Alex Hedges, Yash Kankanampati, Justin Martin, Marjorie Freedman, Ralph Weischedel

Abstract
We explore using the 6B-parameter GPT-J language model to create a plan for a simulated robot to achieve 30 classes of goals in ScienceWorld, a text game simulator for elementary science experiments for which previously published empirical work has shown large language models (LLMs) to be a poor fit (Wang et al., 2022). Using the Markov assumption, the LLM outperforms the reinforcement-learning-based state of the art by a factor of 1.4. When we fill the LLM's input buffer with as many prior steps as will fit, the improvement rises to 3.3x. Even when trained on only 6.5% of the training data, we observe a 2.3x improvement over the state of the art. Our experiments show that performance varies widely across the 30 classes of actions, indicating that averaging over tasks can hide significant performance issues.
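The mechanism behind the 3.3x result, filling the model's fixed-size input buffer with as many prior steps as will fit, can be illustrated with a short sketch. What follows is a minimal illustration under our own assumptions, not the authors' implementation: the textual step format and the build_prompt helper are hypothetical, and only GPT-J's 2048-token context window and the Hugging Face tokenizer identifier are taken as known.

# Minimal sketch (not the authors' code) of history-filled prompting:
# keep as many recent (action, observation) steps as fit in GPT-J's
# context window, dropping the oldest steps first.
from transformers import AutoTokenizer

CONTEXT_LIMIT = 2048      # GPT-J's maximum input length in tokens
GENERATION_BUDGET = 32    # tokens reserved for the generated action

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

def build_prompt(task, history, observation):
    """Fill the input buffer with as many prior steps as will fit.

    history is a list of (action, observation) pairs, oldest first.
    The step format used here is hypothetical.
    """
    header = f"Task: {task}\n"
    suffix = f"Observation: {observation}\nAction:"
    budget = CONTEXT_LIMIT - GENERATION_BUDGET
    budget -= len(tokenizer(header + suffix)["input_ids"])

    kept = []
    # Walk the history newest-first so the most recent steps survive.
    for action, obs in reversed(history):
        step = f"Observation: {obs}\nAction: {action}\n"
        cost = len(tokenizer(step)["input_ids"])
        if cost > budget:
            break
        kept.append(step)
        budget -= cost
    return header + "".join(reversed(kept)) + suffix

With history=[], this reduces to the Markov-assumption setup the abstract compares against, in which the model conditions only on the current observation.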
Anthology ID: 2023.findings-emnlp.104
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1550–1562
URL: https://aclanthology.org/2023.findings-emnlp.104
DOI: 10.18653/v1/2023.findings-emnlp.104
Cite (ACL): Manuel Ciosici, Alex Hedges, Yash Kankanampati, Justin Martin, Marjorie Freedman, and Ralph Weischedel. 2023. Remember what you did so you know what to do next. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 1550–1562, Singapore. Association for Computational Linguistics.
Cite (Informal): Remember what you did so you know what to do next (Ciosici et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-emnlp.104.pdf