Do Trajectories Encode Verb Meaning?

Dylan Ebert, Chen Sun, Ellie Pavlick


Abstract
Distributional models learn representations of words from text, but are criticized for their lack of grounding, that is, the linking of text to the non-linguistic world. Grounded language models have had success in learning to connect concrete categories like nouns and adjectives to the world via images and videos, but can struggle to isolate the meaning of the verbs themselves from the context in which they typically occur. In this paper, we investigate the extent to which trajectories (i.e., the position and rotation of objects over time) naturally encode verb semantics. We build a procedurally generated agent-object-interaction dataset, obtain human annotations for the verbs that occur in this data, and compare several methods for representation learning given the trajectories. We find that trajectories correlate as-is with some verbs (e.g., fall), and that additional abstraction via self-supervised pretraining can further capture nuanced differences in verb meaning (e.g., roll and slide).
Anthology ID:
2022.naacl-main.206
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2860–2871
URL:
https://aclanthology.org/2022.naacl-main.206
DOI:
10.18653/v1/2022.naacl-main.206
Cite (ACL):
Dylan Ebert, Chen Sun, and Ellie Pavlick. 2022. Do Trajectories Encode Verb Meaning? In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2860–2871, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Do Trajectories Encode Verb Meaning? (Ebert et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.206.pdf
Code:
dylanebert/simulated