Optimal encoding! - Information Theory constrains article omission in newspaper headlines

Robin Lemke, Eva Horch, Ingo Reich


Abstract
In this paper we pursue the hypothesis that the distribution of article omission specifically is constrained by principles of Information Theory (Shannon 1948). In particular, Information Theory predicts a stronger preference for article omission before nouns that are relatively predictable in the context of the preceding words. We investigated article omission in German newspaper headlines with a corpus study and an acceptability rating study. Both support our hypothesis: articles are inserted more often before unpredictable nouns, and subjects perceive article omission before predictable nouns as more well-formed than before unpredictable ones. This suggests that information-theoretic principles constrain the distribution of article omission in headlines.
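The predictability of a noun given the preceding words can be made concrete as surprisal, the negative log probability of the noun in its context. The abstract does not say how predictability was estimated, so the following is only a minimal sketch under stated assumptions: a bigram model with add-one smoothing, trained on a tiny toy corpus invented for illustration. The names `TOY_HEADLINES`, `train_bigrams`, and `surprisal` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration only (not data from the paper).
TOY_HEADLINES = [
    ["police", "arrest", "suspect"],
    ["police", "arrest", "suspect"],
    ["police", "arrest", "driver"],
    ["court", "frees", "suspect"],
]


def train_bigrams(sentences):
    """Count context words and bigrams over tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tokens in sentences:
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocab


def surprisal(word, prev, context_counts, bigram_counts, vocab):
    """Surprisal in bits, -log2 P(word | prev), with add-one smoothing."""
    p = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))
    return -math.log2(p)


context_counts, bigram_counts, vocab = train_bigrams(TOY_HEADLINES)
s_suspect = surprisal("suspect", "arrest", context_counts, bigram_counts, vocab)
s_driver = surprisal("driver", "arrest", context_counts, bigram_counts, vocab)
print(f"suspect after arrest: {s_suspect:.2f} bits")
print(f"driver after arrest:  {s_driver:.2f} bits")
```

In this toy setting the more frequent continuation receives the lower surprisal, that is, it counts as the more predictable noun. On the hypothesis stated in the abstract, such a noun would be the better candidate for article omission.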
Anthology ID:
E17-2021
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
131–135
URL:
https://aclanthology.org/E17-2021
Cite (ACL):
Robin Lemke, Eva Horch, and Ingo Reich. 2017. Optimal encoding! - Information Theory constrains article omission in newspaper headlines. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 131–135, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Optimal encoding! - Information Theory constrains article omission in newspaper headlines (Lemke et al., EACL 2017)
PDF:
https://aclanthology.org/E17-2021.pdf