OpenPI-C: A Better Benchmark and Stronger Baseline for Open-Vocabulary State Tracking

Xueqing Wu; Sha Li; Heng Ji

doi:10.18653/v1/2023.findings-acl.452

Abstract

Open-vocabulary state tracking is a more practical version of state tracking that aims to track state changes of entities throughout a process without restricting the state space or the entity space. OpenPI (Tandon et al., 2020) is to date the only dataset annotated for open-vocabulary state tracking. However, we identify issues with both the dataset quality and the evaluation metric. For the dataset, we categorize three types of problems, at the procedure level, the step level, and the state-change level, and build a clean dataset, OpenPI-C, using multiple rounds of human judgment. For the evaluation metric, we propose a cluster-based metric to fix the original metric’s preference for repetition. On the modeling side, we enhance the seq2seq generation baseline by reinstating two key properties for state tracking: temporal dependency and entity awareness. First, the state of the world after an action is inherently dependent on the previous state. We model this dependency through a dynamic memory bank and allow the model to attend to the memory slots during decoding. Second, the state of the world is naturally a union of the states of the involved entities. Since the entities are unknown in the open-vocabulary setting, we propose a two-stage model that refines the state change prediction conditioned on the entities predicted in the first stage. Empirical results show the effectiveness of our proposed model, especially on the cleaned dataset and under the cluster-based metric. The code and data are released at https://github.com/shirley-wu/openpi-c
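
The abstract's point about a metric preferring repetition can be illustrated with a toy sketch. This is not the metric defined in the paper: the token-level Jaccard similarity, the greedy clustering rule, the 0.5 threshold, and all function names below are assumptions chosen only to show why scoring every prediction separately can reward duplicates, and how grouping near-identical predictions first removes that reward.

```python
def tokens(text):
    """Lowercased whitespace tokens as a set (assumed similarity basis)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-level Jaccard similarity between two strings."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def precision(preds, refs):
    """Average, over predictions, of the best similarity to any reference."""
    if not preds or not refs:
        return 0.0
    return sum(max(jaccard(p, r) for r in refs) for p in preds) / len(preds)


def cluster(preds, threshold=0.5):
    """Greedy clustering: keep a prediction only if it is not close to a
    representative that was already kept (hypothetical rule)."""
    reps = []
    for p in preds:
        if all(jaccard(p, r) < threshold for r in reps):
            reps.append(p)
    return reps


def clustered_precision(preds, refs, threshold=0.5):
    """Precision computed over one representative per cluster."""
    return precision(cluster(preds, threshold), refs)


refs = ["pan was cold before and hot afterwards"]
preds = [
    "pan was cold before and hot afterwards",
    "pan was cold before and hot afterwards",  # exact repeat
    "egg was raw before and cooked afterwards",
]

print(precision(preds, refs))            # repeat counted twice
print(clustered_precision(preds, refs))  # repeat collapsed to one
```

In this toy case the per-prediction score is higher when the correct prediction is repeated, while the clustered score is the same with or without the repeat.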

Anthology ID: 2023.findings-acl.452
Original: 2023.findings-acl.452v1
Version 2: 2023.findings-acl.452v2
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 7213–7222
URL: https://aclanthology.org/2023.findings-acl.452/
DOI: 10.18653/v1/2023.findings-acl.452
Cite (ACL): Xueqing Wu, Sha Li, and Heng Ji. 2023. OpenPI-C: A Better Benchmark and Stronger Baseline for Open-Vocabulary State Tracking. In Findings of the Association for Computational Linguistics: ACL 2023, pages 7213–7222, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): OpenPI-C: A Better Benchmark and Stronger Baseline for Open-Vocabulary State Tracking (Wu et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.452.pdf
Video: https://aclanthology.org/2023.findings-acl.452.mp4
