Termout: a tool for the semi-automatic creation of term databases

Rogelio Nazar, Nicolas Acosta


Abstract
We propose a tool for the semi-automatic production of terminological databases, organized into the steps of corpus processing, terminology extraction, and database population and management. With this tool it is possible to obtain a draft macrostructure (a lemma list) and data for the microstructural level, such as grammatical information (morphosyntactic patterns, gender, formation process) and semantic information (hypernyms, equivalents in another language, definitions and synonyms). In this paper we offer an overall description of the software and an evaluation of its performance, for which we used a linguistics corpus in English and Spanish.
Anthology ID:
2023.contents-1.2
Volume:
Proceedings of the Workshop on Computational Terminology in NLP and Translation Studies (ConTeNTS) Incorporating the 16th Workshop on Building and Using Comparable Corpora (BUCC)
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Amal Haddad Haddad, Ayla Rigouts Terryn, Ruslan Mitkov, Reinhard Rapp, Pierre Zweigenbaum, Serge Sharoff
Venues:
ConTeNTS | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
9–18
URL:
https://aclanthology.org/2023.contents-1.2
Cite (ACL):
Rogelio Nazar and Nicolas Acosta. 2023. Termout: a tool for the semi-automatic creation of term databases. In Proceedings of the Workshop on Computational Terminology in NLP and Translation Studies (ConTeNTS) Incorporating the 16th Workshop on Building and Using Comparable Corpora (BUCC), pages 9–18, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Termout: a tool for the semi-automatic creation of term databases (Nazar & Acosta, ConTeNTS-WS 2023)
PDF:
https://aclanthology.org/2023.contents-1.2.pdf