Word Discriminations for Vocabulary Inventory Prediction

Frankie Robertson


Abstract
Vocabulary inventory prediction aims to predict a learner’s whole vocabulary from a limited sample of query words. This paper approaches the problem starting from the 2-parameter Item Response Theory (IRT) model, giving each word in the vocabulary a difficulty parameter and a discrimination parameter. The discrimination parameter is first evaluated on the sub-problem of question item selection, familiar from the fields of Computerised Adaptive Testing (CAT) and active learning. Next, the effect of the discrimination parameter on prediction performance is examined in both a binary classification setting and an information retrieval setting. Performance is compared with baselines based on word frequency. Several generalisation scenarios are examined, including generalising word difficulty and discrimination using word embeddings with a predictor network, and testing on out-of-dataset data.
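As a rough illustration of the 2-parameter IRT model mentioned in the abstract, the sketch below uses the standard textbook logistic form, in which the probability that a learner knows a word depends on the learner's ability and on the word's difficulty and discrimination. It also shows the usual CAT heuristic of asking about the word with the highest Fisher information at the current ability estimate. The function names, the example words, and all parameter values are hypothetical, and the paper's exact parametrisation and selection procedure may differ.

```python
import math


def p_known(ability, difficulty, discrimination):
    """Standard 2PL response probability (assumed form, not taken from the paper)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


def item_information(ability, difficulty, discrimination):
    """Fisher information of a 2PL item at a given ability: a^2 * p * (1 - p)."""
    p = p_known(ability, difficulty, discrimination)
    return discrimination ** 2 * p * (1.0 - p)


def select_next_word(ability, words, asked=()):
    """Pick the not-yet-asked word with maximum information at the ability estimate."""
    candidates = [w for w in words if w not in asked]
    return max(candidates, key=lambda w: item_information(ability, *words[w]))


# Hypothetical (difficulty, discrimination) pairs, for illustration only.
words = {
    "cat": (-2.0, 1.0),
    "ledger": (0.5, 1.8),
    "obsequious": (2.5, 1.2),
}

next_word = select_next_word(0.5, words)
```

Under this form, a word with high discrimination separates learners just below and just above its difficulty more sharply, which is why it tends to be favoured by information-based item selection.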
Anthology ID:
2021.ranlp-1.134
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1188–1195
URL:
https://aclanthology.org/2021.ranlp-1.134
Cite (ACL):
Frankie Robertson. 2021. Word Discriminations for Vocabulary Inventory Prediction. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1188–1195, Held Online. INCOMA Ltd.
Cite (Informal):
Word Discriminations for Vocabulary Inventory Prediction (Robertson, RANLP 2021)
PDF:
https://aclanthology.org/2021.ranlp-1.134.pdf
Code:
frankier/vocabirt