Which Domains, Tasks and Languages are in the Focus of NLP Research on the Languages of Europe?

Diego Alves, Marko Tadić, Georg Rehm


Abstract
This article provides a thorough mapping of NLP and Language Technology research on 39 European languages onto 46 domains. Our analysis is based on almost 50,000 papers published in the ACL Anthology between 2010 and October 2022. We use a dictionary-based approach to identify 1) languages, 2) domains, and 3) NLP tasks in these papers; with exact-term matching, this method achieves a precision of 0.81. Moreover, we identify common errors, which can help refine the methodology in future work. While we are only able to highlight selected results in this submitted version, the final paper will contain detailed analyses and charts on a per-language basis. We hope that this study can contribute to digital language equality in Europe by informing the academic and industrial research community about opportunities for novel LT/NLP research.
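As an illustration of what dictionary-based identification with exact terms might look like, here is a minimal sketch. It is not the authors' implementation: the term lists, the function names (`find_terms`, `annotate`, `precision`), and the case-insensitive whole-word matching are all assumptions made for the example, since the abstract does not specify them.

```python
import re

# Hypothetical term dictionaries; the paper's actual lists
# (39 languages, 46 domains, NLP tasks) are not reproduced here.
LANGUAGE_TERMS = {"Croatian", "Irish", "German"}
DOMAIN_TERMS = {"health", "finance", "tourism"}
TASK_TERMS = {"machine translation", "named entity recognition"}


def find_terms(text, terms):
    """Return the dictionary terms that occur in text as whole words or phrases."""
    found = set()
    for term in terms:
        # Word boundaries prevent a term from matching inside a longer token.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(term)
    return found


def annotate(text):
    """Label one paper (e.g. its title and abstract) along the three dimensions."""
    return {
        "languages": find_terms(text, LANGUAGE_TERMS),
        "domains": find_terms(text, DOMAIN_TERMS),
        "tasks": find_terms(text, TASK_TERMS),
    }


def precision(predicted, gold):
    """Share of predicted labels that a manual check confirms as correct."""
    if not predicted:
        return 0.0
    return len(predicted & gold) / len(predicted)


sample = "We present a machine translation system for Croatian in the health domain."
labels = annotate(sample)
```

Precision here is the fraction of automatically assigned labels that a manual check confirms, which is the kind of figure the abstract reports (0.81). Exact matching of this sort cannot tell whether a matched term is actually the topic of the paper, which is one plausible source of the errors the abstract mentions.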
Anthology ID:
2024.tdle-1.2
Volume:
Proceedings of the Second International Workshop Towards Digital Language Equality (TDLE): Focusing on Sustainability @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Federico Gaspari, Joss Moorkens, Itziar Aldabe, Aritz Farwell, Begona Altuna, Stelios Piperidis, Georg Rehm, German Rigau
Venues:
TDLE | WS
Publisher:
ELRA and ICCL
Pages:
18–32
URL:
https://aclanthology.org/2024.tdle-1.2
Cite (ACL):
Diego Alves, Marko Tadić, and Georg Rehm. 2024. Which Domains, Tasks and Languages are in the Focus of NLP Research on the Languages of Europe? In Proceedings of the Second International Workshop Towards Digital Language Equality (TDLE): Focusing on Sustainability @ LREC-COLING 2024, pages 18–32, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Which Domains, Tasks and Languages are in the Focus of NLP Research on the Languages of Europe? (Alves et al., TDLE-WS 2024)
PDF:
https://aclanthology.org/2024.tdle-1.2.pdf