@inproceedings{toshniwal-etal-2022-baked,
    title = "Baked-in State Probing",
    author = "Toshniwal, Shubham and
      Wiseman, Sam and
      Livescu, Karen and
      Gimpel, Kevin",
    editor = "Goldberg, Yoav and
      Kozareva, Zornitsa and
      Zhang, Yue",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-emnlp.397",
    doi = "10.18653/v1/2022.findings-emnlp.397",
    pages = "5430--5435",
    abstract = "Neural language models have been analyzed for their linguistic and extra-linguistic knowledge via probing. Of particular interest has been the following question: how much can a language model trained only on form learn about meaning? Recent work has demonstrated via probing classifiers that in the setting of simple procedural text, where by {``}meaning{''} we mean the underlying world state, language models have a non-trivial performance on world state tracking. However, our proposed evaluation based on model predictions shows differing results, suggesting that these models are either not capturing the world state or not using it. How do these results change if the model has access to the world state? We explore this alternate setting with access to the underlying world state only during training and investigate ways of {``}baking in{''} the state knowledge along with the primary task of language modeling. Our proposed approaches allow for state probing during inference simply via text prompts, avoiding any probing classifier machinery. In terms of performance, we show that baking in the state knowledge during training leads to significant improvements in state tracking performance and text generation quality.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="toshniwal-etal-2022-baked">
    <titleInfo>
      <title>Baked-in State Probing</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Shubham</namePart>
      <namePart type="family">Toshniwal</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Sam</namePart>
      <namePart type="family">Wiseman</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Karen</namePart>
      <namePart type="family">Livescu</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Kevin</namePart>
      <namePart type="family">Gimpel</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2022-12</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Findings of the Association for Computational Linguistics: EMNLP 2022</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Yoav</namePart>
        <namePart type="family">Goldberg</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Zornitsa</namePart>
        <namePart type="family">Kozareva</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Yue</namePart>
        <namePart type="family">Zhang</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Abu Dhabi, United Arab Emirates</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Neural language models have been analyzed for their linguistic and extra-linguistic knowledge via probing. Of particular interest has been the following question: how much can a language model trained only on form learn about meaning? Recent work has demonstrated via probing classifiers that in the setting of simple procedural text, where by “meaning” we mean the underlying world state, language models have a non-trivial performance on world state tracking. However, our proposed evaluation based on model predictions shows differing results, suggesting that these models are either not capturing the world state or not using it. How do these results change if the model has access to the world state? We explore this alternate setting with access to the underlying world state only during training and investigate ways of “baking in” the state knowledge along with the primary task of language modeling. Our proposed approaches allow for state probing during inference simply via text prompts, avoiding any probing classifier machinery. In terms of performance, we show that baking in the state knowledge during training leads to significant improvements in state tracking performance and text generation quality.</abstract>
    <identifier type="citekey">toshniwal-etal-2022-baked</identifier>
    <identifier type="doi">10.18653/v1/2022.findings-emnlp.397</identifier>
    <location>
      <url>https://aclanthology.org/2022.findings-emnlp.397</url>
    </location>
    <part>
      <date>2022-12</date>
      <extent unit="page">
        <start>5430</start>
        <end>5435</end>
      </extent>
    </part>
  </mods>
</modsCollection>
%0 Conference Proceedings
%T Baked-in State Probing
%A Toshniwal, Shubham
%A Wiseman, Sam
%A Livescu, Karen
%A Gimpel, Kevin
%Y Goldberg, Yoav
%Y Kozareva, Zornitsa
%Y Zhang, Yue
%S Findings of the Association for Computational Linguistics: EMNLP 2022
%D 2022
%8 December
%I Association for Computational Linguistics
%C Abu Dhabi, United Arab Emirates
%F toshniwal-etal-2022-baked
%X Neural language models have been analyzed for their linguistic and extra-linguistic knowledge via probing. Of particular interest has been the following question: how much can a language model trained only on form learn about meaning? Recent work has demonstrated via probing classifiers that in the setting of simple procedural text, where by “meaning” we mean the underlying world state, language models have a non-trivial performance on world state tracking. However, our proposed evaluation based on model predictions shows differing results, suggesting that these models are either not capturing the world state or not using it. How do these results change if the model has access to the world state? We explore this alternate setting with access to the underlying world state only during training and investigate ways of “baking in” the state knowledge along with the primary task of language modeling. Our proposed approaches allow for state probing during inference simply via text prompts, avoiding any probing classifier machinery. In terms of performance, we show that baking in the state knowledge during training leads to significant improvements in state tracking performance and text generation quality.
%R 10.18653/v1/2022.findings-emnlp.397
%U https://aclanthology.org/2022.findings-emnlp.397
%U https://doi.org/10.18653/v1/2022.findings-emnlp.397
%P 5430-5435
Markdown (Informal)
[Baked-in State Probing](https://aclanthology.org/2022.findings-emnlp.397) (Toshniwal et al., Findings 2022)
ACL
Shubham Toshniwal, Sam Wiseman, Karen Livescu, and Kevin Gimpel. 2022. Baked-in State Probing. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5430–5435, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.