AND does not mean OR: Using Formal Languages to Study Language Models’ Representations

Aaron Traylor, Roman Feiman, Ellie Pavlick


Abstract
A current open question in natural language processing is to what extent language models, which are trained with access only to the form of language, are able to capture the meaning of language. This question is challenging to answer in general, as there is no clear line between meaning and form, but rather meaning constrains form in consistent ways. The goal of this study is to offer insights into a narrower but critical subquestion: Under what conditions should we expect that meaning and form covary sufficiently, such that a language model with access only to form might nonetheless succeed in emulating meaning? Focusing on several formal languages (propositional logic and a set of programming languages), we generate training corpora using a variety of motivated constraints, and measure a distributional language model’s ability to differentiate logical symbols (AND, OR, and NOT). Our findings are largely negative: none of our simulated training corpora result in models which definitively differentiate meaningfully different symbols (e.g., AND vs. OR), suggesting a limitation to the types of semantic signals that current models are able to exploit.
Anthology ID:
2021.acl-short.21
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
158–167
URL:
https://aclanthology.org/2021.acl-short.21
DOI:
10.18653/v1/2021.acl-short.21
Cite (ACL):
Aaron Traylor, Roman Feiman, and Ellie Pavlick. 2021. AND does not mean OR: Using Formal Languages to Study Language Models’ Representations. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 158–167, Online. Association for Computational Linguistics.
Cite (Informal):
AND does not mean OR: Using Formal Languages to Study Language Models’ Representations (Traylor et al., ACL 2021)
PDF:
https://aclanthology.org/2021.acl-short.21.pdf
Video:
https://aclanthology.org/2021.acl-short.21.mp4