Detecting Contact-Induced Semantic Shifts: What Can Embedding-Based Methods Do in Practice?

Filip Miletic, Anne Przewozny-Desriaux, Ludovic Tanguy


Abstract
This study investigates the applicability of semantic change detection methods in descriptively oriented linguistic research. It specifically focuses on contact-induced semantic shifts in Quebec English. We contrast synchronic data from different regions in order to identify the meanings that are specific to Quebec and potentially related to language contact. Type-level embeddings are used to detect new semantic shifts, and token-level embeddings to isolate regionally specific occurrences. We introduce a new 80-item test set and conduct both quantitative and qualitative evaluations. We demonstrate that diachronic word embedding methods can be applied to contact-induced semantic shifts observed in synchrony, obtaining results comparable to the state of the art on similar tasks in diachrony. However, we show that encouraging evaluation results do not translate to practical value in detecting new semantic shifts. Finally, our application of token-level embeddings accelerates manual data exploration and provides an efficient way of scaling up sociolinguistic analyses.
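The type-level comparison the abstract describes can be illustrated with a minimal sketch. It assumes two embedding matrices trained separately on regional subcorpora, with rows in the same vocabulary order. The orthogonal Procrustes alignment, the cosine-distance score, and the names `procrustes_align`, `cosine_distance`, and `rank_shift_candidates` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` onto `target` with an orthogonal map.

    Both matrices have shape (vocabulary size, dimensions), and row i in
    each matrix is assumed to represent the same word.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two vectors."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def rank_shift_candidates(vocab, region_a, region_b):
    """Rank words by the distance between their aligned regional vectors.

    Words at the top of the list differ most across the two regions and
    are candidates for manual inspection, not confirmed semantic shifts.
    """
    aligned = procrustes_align(region_a, region_b)
    scores = {
        word: cosine_distance(aligned[i], region_b[i])
        for i, word in enumerate(vocab)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A ranking like this only proposes candidates. As the abstract cautions, a good evaluation score does not guarantee that the top-ranked words are genuine contact-induced shifts, so manual inspection remains necessary.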
Anthology ID: 2021.emnlp-main.847
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 10852–10865
URL: https://aclanthology.org/2021.emnlp-main.847
DOI: 10.18653/v1/2021.emnlp-main.847
Cite (ACL): Filip Miletic, Anne Przewozny-Desriaux, and Ludovic Tanguy. 2021. Detecting Contact-Induced Semantic Shifts: What Can Embedding-Based Methods Do in Practice?. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10852–10865, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Detecting Contact-Induced Semantic Shifts: What Can Embedding-Based Methods Do in Practice? (Miletic et al., EMNLP 2021)
PDF: https://aclanthology.org/2021.emnlp-main.847.pdf
Video: https://aclanthology.org/2021.emnlp-main.847.mp4