Enriching Multiword Terms in Wiktionary with Pronunciation Information

Lenka Bajcetic, Thierry Declerck, Gilles Sérasset


Abstract
We report on work in progress on the automated generation of pronunciation information for English multiword terms (MWTs) in Wiktionary, combining the information available for their individual components. We describe the issues we encountered, the building of an evaluation dataset, and our collaboration with the maintainer of the DBnary resource. Our approach shows potential for automatically adding morphosyntactic and semantic information to the components of such MWTs.
Anthology ID:
2023.mwe-1.10
Volume:
Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023)
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Archna Bhatia, Kilian Evang, Marcos Garcia, Voula Giouli, Lifeng Han, Shiva Taslimipoor
Venue:
MWE
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
65–72
URL:
https://aclanthology.org/2023.mwe-1.10
DOI:
10.18653/v1/2023.mwe-1.10
Cite (ACL):
Lenka Bajcetic, Thierry Declerck, and Gilles Sérasset. 2023. Enriching Multiword Terms in Wiktionary with Pronunciation Information. In Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023), pages 65–72, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Enriching Multiword Terms in Wiktionary with Pronunciation Information (Bajcetic et al., MWE 2023)
PDF:
https://aclanthology.org/2023.mwe-1.10.pdf
Video:
https://aclanthology.org/2023.mwe-1.10.mp4