What do phone embeddings learn about Phonology?

Sudheer Kolachina, Lilla Magyar


Abstract
Recent work has evaluated phone embeddings using sound analogies and correlations between distinctive feature space and embedding space. It remains unclear which aspects of natural language phonology are learnt by neural-network-inspired distributed representational models such as word2vec. To study the kinds of phonological relationships that phone embeddings learn, we present artificial phonology experiments showing that they learn paradigmatic relationships, such as phonemic and allophonic distribution, quite well. They also capture co-occurrence restrictions among vowels, such as those observed in languages with vowel harmony. However, they are unable to learn co-occurrence restrictions among consonants.
Anthology ID:
W19-4219
Volume:
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Garrett Nicolai, Ryan Cotterell
Venue:
ACL
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
160–169
URL:
https://aclanthology.org/W19-4219
DOI:
10.18653/v1/W19-4219
Cite (ACL):
Sudheer Kolachina and Lilla Magyar. 2019. What do phone embeddings learn about Phonology? In Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 160–169, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
What do phone embeddings learn about Phonology? (Kolachina & Magyar, ACL 2019)
PDF:
https://aclanthology.org/W19-4219.pdf