Adapting and evaluating a generic term extraction tool

Anita Gojun, Ulrich Heid, Bernd Weißbach, Carola Loth, Insa Mingers


Abstract
We present techniques for monolingual term candidate extraction which are being developed in the EU project TTC. We designed an application for German and English data that serves as a first evaluation of the methods for terminology extraction used in the project. The application situation highlighted the need for tools to handle lemmatization errors and to remove incomplete word sequences from multi-word term candidate lists, as well as the fact that the provision of German citation forms requires more morphological knowledge than TTC's slim approach can provide. We show a detailed evaluation of our extraction results and discuss the method for the evaluation of terminology extraction systems.
Anthology ID:
L12-1436
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
651–656
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/746_Paper.pdf
Cite (ACL):
Anita Gojun, Ulrich Heid, Bernd Weißbach, Carola Loth, and Insa Mingers. 2012. Adapting and evaluating a generic term extraction tool. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 651–656, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Adapting and evaluating a generic term extraction tool (Gojun et al., LREC 2012)