Taylor’s law for Human Linguistic Sequences

Tatsuru Kobayashi, Kumiko Tanaka-Ishii


Abstract
Taylor’s law describes the fluctuation characteristics underlying a system in which the variance of an event within a time span grows as a power law of the mean. Although Taylor’s law has been applied to many natural and social systems, its application to language has been scarce. This article describes a new way to quantify Taylor’s law in natural language and conducts a Taylor analysis of over 1100 texts across 14 languages. We found that the Taylor exponents of written natural language texts exhibit almost the same value. The exponent was also compared for other language-related data, such as child-directed speech, music, and programming languages. The results show how the Taylor exponent serves to quantify the fundamental structural complexity underlying linguistic time series. The article also shows the applicability of these findings in evaluating language models.
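To make the power-law relation in the abstract concrete, the following is a minimal sketch of how a Taylor exponent could be estimated from a token sequence. It is not taken from the paper or from the authors' word_taylor repository: the function name `taylor_exponent`, the default `window` size, the use of non-overlapping windows, and the ordinary least-squares fit of log variance against log mean are all assumptions made for illustration.

```python
import math
from collections import Counter


def taylor_exponent(tokens, window=100):
    """Estimate b in variance ~ a * mean**b from per-word window counts.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n_windows = len(tokens) // window
    if n_windows < 2:
        raise ValueError("need at least two full windows")

    # Count every word inside each non-overlapping window of `window` tokens.
    counts = [
        Counter(tokens[i * window:(i + 1) * window]) for i in range(n_windows)
    ]
    vocab = set().union(*counts)

    # One (log mean, log variance) point per word type.
    xs, ys = [], []
    for word in vocab:
        per_window = [c[word] for c in counts]
        mean = sum(per_window) / n_windows
        var = sum((x - mean) ** 2 for x in per_window) / n_windows
        if var > 0:  # the logarithm is undefined for zero variance
            xs.append(math.log(mean))
            ys.append(math.log(var))

    if len(xs) < 2:
        raise ValueError("not enough words with non-zero variance")

    # Ordinary least-squares slope of log variance on log mean.
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all words have the same mean count")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx
```

Calling `taylor_exponent(tokens, window=1000)` on a list of word strings returns the fitted slope. Under this variance-based definition, an independently sampled (for example, shuffled) sequence is expected to give a slope close to 1, so larger values would indicate that words cluster more than chance predicts.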
Anthology ID:
P18-1105
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1138–1148
URL:
https://aclanthology.org/P18-1105
DOI:
10.18653/v1/P18-1105
Cite (ACL):
Tatsuru Kobayashi and Kumiko Tanaka-Ishii. 2018. Taylor’s law for Human Linguistic Sequences. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1138–1148, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Taylor’s law for Human Linguistic Sequences (Kobayashi & Tanaka-Ishii, ACL 2018)
PDF:
https://aclanthology.org/P18-1105.pdf
Note:
 P18-1105.Notes.pdf
Presentation:
 P18-1105.Presentation.pdf
Video:
 https://aclanthology.org/P18-1105.mp4
Code:
 Group-TanakaIshii/word_taylor