A large-scale study of the effects of word frequency and predictability in naturalistic reading

Cory Shain


Abstract
A number of psycholinguistic studies have factorially manipulated words’ contextual predictabilities and corpus frequencies and shown separable effects of each on measures of human sentence processing, a pattern which has been used to support distinct mechanisms underlying prediction on the one hand and lexical retrieval on the other. This paper examines the generalizability of this finding to more realistic conditions of sentence processing by studying effects of frequency and predictability in three large-scale naturalistic reading corpora. Results show significant effects of word frequency and predictability in isolation but no effect of frequency over and above predictability, and thus do not provide evidence of distinct mechanisms. The non-replication of separable effects in a naturalistic setting raises doubts about the existence of such a distinction in everyday sentence comprehension. Instead, these results are consistent with previous claims that apparent effects of frequency are underlyingly effects of predictability.
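As a rough illustration of the two quantities contrasted in the abstract, the sketch below computes a context-free frequency measure (unigram surprisal) and a context-sensitive predictability measure (bigram surprisal) for each word of a toy corpus. The corpus, the function names, and the unsmoothed bigram model are assumptions made for this example only; they are not taken from the paper, which may operationalize both predictors differently.

```python
import math
from collections import Counter

# Toy corpus (hypothetical, for illustration only).
corpus = "the dog chased the cat and the cat chased the mouse".split()

unigrams = Counter(corpus)                  # word counts
bigrams = Counter(zip(corpus, corpus[1:]))  # (previous word, word) counts
contexts = Counter(corpus[:-1])             # counts of words that have a successor
total = len(corpus)


def unigram_surprisal(word):
    """Frequency predictor: -log2 p(word), ignoring context."""
    return -math.log2(unigrams[word] / total)


def bigram_surprisal(prev, word):
    """Predictability predictor: -log2 p(word | prev), unsmoothed estimate."""
    return -math.log2(bigrams[(prev, word)] / contexts[prev])


# Print both predictors for every word that has a preceding context word.
for prev, word in zip(corpus, corpus[1:]):
    freq = unigram_surprisal(word)
    pred = bigram_surprisal(prev, word)
    print(f"{prev:>7} {word:<7} frequency={freq:.2f} predictability={pred:.2f}")
```

In this toy corpus, "the" is the most frequent word and is also fully predictable after "chased", a small-scale example of how the two measures can overlap. Testing for an effect of frequency over and above predictability amounts to asking whether the first measure explains reading behavior beyond what the second already explains.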
Anthology ID:
N19-1413
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4086–4094
URL:
https://aclanthology.org/N19-1413
DOI:
10.18653/v1/N19-1413
Cite (ACL):
Cory Shain. 2019. A large-scale study of the effects of word frequency and predictability in naturalistic reading. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4086–4094, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
A large-scale study of the effects of word frequency and predictability in naturalistic reading (Shain, NAACL 2019)
PDF:
https://aclanthology.org/N19-1413.pdf
Video:
https://vimeo.com/359727110
Code:
coryshain/dtsr
Data:
Natural Stories