A phoneme clustering algorithm based on the obligatory contour principle

Mans Hulden


Abstract
This paper explores a divisive hierarchical clustering algorithm based on the well-known Obligatory Contour Principle in phonology. The purpose is twofold: to see whether such an algorithm can be used for unsupervised classification of phonemes or graphemes in corpora, and to investigate whether this purported universal constraint really holds for several classes of phonological distinctive features. The algorithm achieves very high accuracy at inferring a consonant-vowel distinction without supervision, and also shows a strong tendency to detect coronal phonemes. The remaining classes, however, do not correspond as neatly to phonological distinctive feature splits. While the results offer only mixed support for a universal Obligatory Contour Principle, the algorithm can be very useful for many NLP tasks because of its high accuracy in revealing consonant/vowel/coronal distinctions.
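The core intuition can be illustrated with a toy sketch. This is not the paper's implementation: it assumes, for illustration only, that one divisive step splits the symbol inventory into two classes so that as few adjacent symbol pairs as possible fall in the same class, and it searches all two-way splits by brute force, which is feasible only for small alphabets. The function names and the word list are hypothetical.

```python
from collections import Counter
from itertools import combinations


def adjacent_pair_counts(words):
    """Count every pair of neighbouring symbols in the word list."""
    counts = Counter()
    for word in words:
        for left, right in zip(word, word[1:]):
            counts[(left, right)] += 1
    return counts


def same_class_cost(pair_counts, group):
    """Number of adjacent pairs whose two symbols land in the same class."""
    return sum(
        n for (a, b), n in pair_counts.items() if (a in group) == (b in group)
    )


def best_bipartition(words):
    """Exhaustively find the two-way split with the fewest same-class neighbours.

    Assumption: an OCP-style objective (avoid adjacent members of one class).
    The first symbol is pinned to one class so mirror-image splits are skipped.
    """
    symbols = sorted(set("".join(words)))
    if len(symbols) < 2:
        raise ValueError("need at least two distinct symbols to split")
    pair_counts = adjacent_pair_counts(words)
    anchor, rest = symbols[0], symbols[1:]
    best = None
    # size < len(rest) keeps at least one symbol in the other class
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            group = frozenset((anchor,) + extra)
            cost = same_class_cost(pair_counts, group)
            if best is None or cost < best[0]:
                best = (cost, group)
    cost, group = best
    return cost, group, frozenset(symbols) - group


# Hypothetical toy corpus in which consonants and vowels strictly alternate.
toy_words = ["banana", "tomato", "kimono", "patina"]
cost, class_a, class_b = best_bipartition(toy_words)
print(cost, sorted(class_a), sorted(class_b))
```

On this toy corpus the split separates vowels from consonants because no two adjacent symbols share a class. Applying the same step again within each resulting class would yield a divisive hierarchy, though how the paper defines and optimizes its objective is not stated in the abstract.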
Anthology ID:
K17-1030
Volume:
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Roger Levy, Lucia Specia
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
290–300
URL:
https://aclanthology.org/K17-1030/
DOI:
10.18653/v1/K17-1030
Cite (ACL):
Mans Hulden. 2017. A phoneme clustering algorithm based on the obligatory contour principle. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 290–300, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
A phoneme clustering algorithm based on the obligatory contour principle (Hulden, CoNLL 2017)
PDF:
https://aclanthology.org/K17-1030.pdf
Code:
cvocp/cvocp