Using a machine learning model to assess the complexity of stress systems

Liviu Dinu, Alina Maria Ciobanu, Ioana Chitoran, Vlad Niculae


Abstract
We address stress prediction as a sequence tagging problem. We present sequential models trained with the averaged perceptron for learning primary stress in Romanian words. We use character n-grams and syllable n-grams as features, and we account for the consonant-vowel structure of the words. Using data-driven machine learning techniques, we show that Romanian stress is predictable, though not deterministic.
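As an illustration of the kind of features the abstract mentions, the following minimal sketch extracts character n-grams and consonant-vowel (CV) n-grams around one character position of a word, as input to a sequence tagger. It is not the authors' implementation: the vowel inventory, the boundary symbols, the feature-name format, and the function names `cv_pattern` and `position_features` are all assumptions made for this example. Syllable n-grams and the averaged perceptron training itself are not shown.

```python
# Assumed Romanian vowel inventory; a simplification for illustration only.
VOWELS = set("aeiouăâî")


def cv_pattern(word):
    """Map each character to 'V' (vowel) or 'C' (anything else)."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())


def position_features(word, i, max_n=3):
    """Return character and CV n-gram features that cover position i.

    The word is padded with '^' and '$' (hypothetical boundary symbols)
    so that n-grams near the edges record the word boundary.
    """
    padded = "^" + word.lower() + "$"
    cv = "^" + cv_pattern(word) + "$"
    j = i + 1  # index of position i inside the padded string
    feats = []
    for n in range(1, max_n + 1):
        # Every n-gram of length n that contains position j.
        for start in range(j - n + 1, j + 1):
            if start < 0 or start + n > len(padded):
                continue
            offset = start - j  # where the n-gram starts relative to i
            feats.append(f"char[{offset}:{n}]={padded[start:start + n]}")
            feats.append(f"cv[{offset}:{n}]={cv[start:start + n]}")
    return feats


if __name__ == "__main__":
    # Features for the first vowel of "carte".
    for feature in position_features("carte", 1, max_n=2):
        print(feature)
```

In a tagging setup of this kind, each character position would receive such a feature list and a binary label marking whether it carries primary stress; a linear model such as an averaged perceptron can then be trained on these sparse features.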
Anthology ID:
L14-1140
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
331–336
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/1200_Paper.pdf
Cite (ACL):
Liviu Dinu, Alina Maria Ciobanu, Ioana Chitoran, and Vlad Niculae. 2014. Using a machine learning model to assess the complexity of stress systems. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 331–336, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Using a machine learning model to assess the complexity of stress systems (Dinu et al., LREC 2014)