Uniform Information Density Effects on Syntactic Choice in Hindi

Ayush Jain, Vishal Singh, Sidharth Ranjan, Rajakrishnan Rajkumar, Sumeet Agarwal


Abstract
According to the UNIFORM INFORMATION DENSITY (UID) hypothesis (Levy and Jaeger, 2007; Jaeger, 2010), speakers tend to distribute information uniformly across the signal while producing language. The prior works cited above studied syntactic reduction in language production at particular choice points in a sentence. In contrast, we use a variant of the UID hypothesis to investigate the extent to which word order choices in Hindi are influenced by the drive to minimize the variance of information across entire sentences. To this end, we propose multiple lexical and syntactic measures (at both word and constituent levels) to capture how uniformly information is spread across a sentence. We then incorporate these measures into machine learning models that aim to distinguish a naturally occurring corpus sentence from its grammatical variants (which express the same idea). Our results indicate that our UID measures are not a significant factor in predicting the corpus sentence in the presence of lexical surprisal, a competing control predictor. Finally, in light of other recent work, we conclude with a discussion of the reasons why UID is not suitable as a theory of word order.
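To make the idea of "variance of information across a sentence" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the measures or models used in the paper: the add-alpha bigram model, the helper names (`train_bigram`, `word_surprisals`, `uid_variance`), and the toy tokens are all hypothetical. It estimates per-word lexical surprisal and then takes the variance of those values as one possible sentence-level uniformity score, where a lower value means a more even spread of information.

```python
import math
from collections import Counter


def train_bigram(sentences, alpha=1.0):
    """Fit an add-alpha smoothed bigram model (an assumed stand-in for a real
    language model) and return a function giving surprisal in bits."""
    context_counts, bigram_counts = Counter(), Counter()
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    vocab_size = len(vocab)

    def surprisal(prev, cur):
        # Smoothed conditional probability P(cur | prev).
        p = (bigram_counts[(prev, cur)] + alpha) / (
            context_counts[prev] + alpha * vocab_size
        )
        return -math.log2(p)

    return surprisal


def word_surprisals(sentence, surprisal):
    """Per-word surprisal for a tokenized sentence."""
    tokens = ["<s>"] + sentence
    return [surprisal(prev, cur) for prev, cur in zip(tokens, tokens[1:])]


def uid_variance(values):
    """Population variance of surprisal values; lower means more uniform."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


# Toy usage: compare a "corpus" order with a reordered variant (made-up tokens).
corpus = [["ram", "ne", "kitab", "padhi"], ["sita", "ne", "kitab", "padhi"]]
model = train_bigram(corpus)
for variant in (["ram", "ne", "kitab", "padhi"], ["kitab", "ram", "ne", "padhi"]):
    scores = word_surprisals(variant, model)
    print(variant, round(uid_variance(scores), 3))
```

In a setup like the one the abstract describes, a score of this kind would be one feature among several, evaluated alongside a control predictor such as total lexical surprisal, rather than a ranking criterion on its own.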
Anthology ID:
W18-4605
Volume:
Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico
Editors:
Leonor Becerra-Bonache, M. Dolores Jiménez-López, Carlos Martín-Vide, Adrià Torrens-Urrutia
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
38–48
URL:
https://aclanthology.org/W18-4605
Cite (ACL):
Ayush Jain, Vishal Singh, Sidharth Ranjan, Rajakrishnan Rajkumar, and Sumeet Agarwal. 2018. Uniform Information Density Effects on Syntactic Choice in Hindi. In Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing, pages 38–48, Santa Fe, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Uniform Information Density Effects on Syntactic Choice in Hindi (Jain et al., 2018)
PDF:
https://aclanthology.org/W18-4605.pdf