Automatic Generation of Vocabulary Lists with Multiword Expressions

John Lee, Adilet Uvaliyev


Abstract
The importance of multiword expressions (MWEs) for language learning is well established. While MWE research has been evaluated on various downstream tasks such as syntactic parsing and machine translation, its applications in computer-assisted language learning have been less explored. This paper investigates the selection of MWEs for graded vocabulary lists. Widely used by language teachers and students, these lists recommend a language acquisition sequence to optimize learning efficiency. We automatically generate these lists using difficulty-graded corpora and MWEs extracted based on semantic compositionality. We evaluate these lists on their ability to facilitate text comprehension for learners. Experimental results show that our proposed method generates higher-quality lists than baselines using collocation measures.
Anthology ID:
2023.mwe-1.12
Volume:
Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023)
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Archna Bhatia, Kilian Evang, Marcos Garcia, Voula Giouli, Lifeng Han, Shiva Taslimipoor
Venue:
MWE
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
81–86
URL:
https://aclanthology.org/2023.mwe-1.12
DOI:
10.18653/v1/2023.mwe-1.12
Cite (ACL):
John Lee and Adilet Uvaliyev. 2023. Automatic Generation of Vocabulary Lists with Multiword Expressions. In Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023), pages 81–86, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Automatic Generation of Vocabulary Lists with Multiword Expressions (Lee & Uvaliyev, MWE 2023)
PDF:
https://aclanthology.org/2023.mwe-1.12.pdf
Video:
https://aclanthology.org/2023.mwe-1.12.mp4