Exploring the Role of Stress in Bayesian Word Segmentation using Adaptor Grammars

Benjamin Börschinger, Mark Johnson


Abstract
Stress has long been established as a major cue in word segmentation for English infants. We show that enabling a current state-of-the-art Bayesian word segmentation model to take advantage of stress cues noticeably improves its performance. The improvements range from 10% to 4%, depending on both the use of phonotactic cues and, to a lesser extent, the amount of evidence available to the learner. We also find that, particularly early on, stress cues are much more useful for our model than phonotactic cues by themselves, consistent with the finding that children seem to use stress cues before they use phonotactic cues. Finally, we study how the model’s knowledge about stress patterns evolves over time. We find not only that our model correctly acquires the most frequent patterns relatively quickly, but also that the Unique Stress Constraint at the heart of a previously proposed model does not need to be built in and can instead be acquired jointly with word segmentation.
Anthology ID:
Q14-1008
Volume:
Transactions of the Association for Computational Linguistics, Volume 2
Year:
2014
Address:
Cambridge, MA
Editors:
Dekang Lin, Michael Collins, Lillian Lee
Venue:
TACL
Publisher:
MIT Press
Pages:
93–104
URL:
https://aclanthology.org/Q14-1008/
DOI:
10.1162/tacl_a_00168
Cite (ACL):
Benjamin Börschinger and Mark Johnson. 2014. Exploring the Role of Stress in Bayesian Word Segmentation using Adaptor Grammars. Transactions of the Association for Computational Linguistics, 2:93–104.
Cite (Informal):
Exploring the Role of Stress in Bayesian Word Segmentation using Adaptor Grammars (Börschinger & Johnson, TACL 2014)
PDF:
https://aclanthology.org/Q14-1008.pdf
Video:
https://aclanthology.org/Q14-1008.mp4