Mining the UK Web Archive for Semantic Change Detection

Adam Tsakalidis, Marya Bazzi, Mihai Cucuringu, Pierpaolo Basile, Barbara McGillivray


Abstract
Semantic change detection (i.e., identifying words whose meaning has changed over time) has emerged over the past decade as a growing area of research, with important downstream applications in natural language processing, historical linguistics and computational social science. However, several obstacles make progress in the domain slow and difficult. These pertain primarily to the lack of well-established gold standard datasets, resources to study the problem at a fine-grained temporal resolution, and quantitative evaluation approaches. In this work, we aim to mitigate these issues by (a) releasing a new labelled dataset of more than 47K word vectors trained on the UK Web Archive over a short time-frame (2000-2013); (b) proposing a variant of Procrustes alignment to detect words that have undergone semantic shift; and (c) introducing a rank-based approach for evaluation purposes. Through extensive numerical experiments and validation, we illustrate the effectiveness of our approach against competitive baselines. Finally, we also make our resources publicly available to further enable research in the domain.
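To make the idea of Procrustes alignment concrete, the following is a minimal sketch of the standard orthogonal Procrustes procedure, not the variant proposed in the paper, whose details are not given in the abstract. It assumes two embedding matrices `X` and `Y` whose rows correspond to the same vocabulary at two time points. The function names, the use of cosine distance as the shift score, and the descending ranking are illustrative assumptions.

```python
import numpy as np

def procrustes_rotation(X, Y):
    """Orthogonal matrix R minimising ||X @ R - Y||_F (standard orthogonal Procrustes)."""
    # The closed-form solution is R = U @ Vt, where U, S, Vt is the SVD of X^T Y.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def rank_by_shift(X, Y, words):
    """Align X onto Y, then rank words by cosine distance (largest first).

    Cosine distance is an illustrative choice of shift score,
    not necessarily the measure used by the authors.
    """
    X_aligned = X @ procrustes_rotation(X, Y)
    dots = np.sum(X_aligned * Y, axis=1)
    norms = np.linalg.norm(X_aligned, axis=1) * np.linalg.norm(Y, axis=1)
    distances = 1.0 - dots / norms
    order = np.argsort(-distances)  # most-shifted words come first
    return [(words[i], float(distances[i])) for i in order]
```

Words near the top of the returned list are those whose vectors disagree most after alignment, and would be the candidates for semantic shift under these assumptions.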
Anthology ID:
R19-1139
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Month:
September
Year:
2019
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1212–1221
URL:
https://aclanthology.org/R19-1139
DOI:
10.26615/978-954-452-056-4_139
Cite (ACL):
Adam Tsakalidis, Marya Bazzi, Mihai Cucuringu, Pierpaolo Basile, and Barbara McGillivray. 2019. Mining the UK Web Archive for Semantic Change Detection. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 1212–1221, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Mining the UK Web Archive for Semantic Change Detection (Tsakalidis et al., RANLP 2019)
PDF:
https://aclanthology.org/R19-1139.pdf