Predicting Degrees of Technicality in Automatic Terminology Extraction

Anna Hätty, Dominik Schlechtweg, Michael Dorna, Sabine Schulte im Walde


Abstract
While automatic term extraction is a well-researched area, computational approaches to distinguishing between degrees of technicality are still understudied. We semi-automatically create a German gold standard of technicality across four domains, and illustrate the impact of a web-crawled general-language corpus on technicality prediction. When defining a classification approach that combines general-language and domain-specific word embeddings, we go beyond previous work and align vector spaces to obtain comparative embeddings. We suggest two novel models to exploit general- vs. domain-specific comparisons: a simple neural network model with pre-computed comparative-embedding information as input, and a multi-channel model computing the comparison internally. Both models outperform previous approaches, with the multi-channel model performing best.
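The idea of aligning a domain-specific embedding space to a general-language one and then comparing each word's two vectors can be illustrated with a minimal sketch. The abstract does not say which alignment method or comparison feature the authors use, so everything here is an assumption made for illustration: orthogonal Procrustes as the alignment, cosine similarity as the comparison, hypothetical function names, and synthetic toy vectors in place of real embeddings.

```python
import numpy as np


def procrustes_align(source, target):
    """Return the orthogonal matrix W that minimizes ||source @ W - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic toy data (assumption): 50 shared words, 8 dimensions.
rng = np.random.default_rng(0)
general = rng.normal(size=(50, 8))

# Pretend the domain space is a rotated copy of the general space ...
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
domain = general @ rotation
# ... except for word 0, whose domain usage differs from its general usage.
domain[0] = rng.normal(size=8)

# Fit the alignment on anchor words (here: every word except word 0).
W = procrustes_align(domain[1:], general[1:])
aligned = domain @ W

# One possible comparative feature per word: similarity of its two vectors.
scores = [cosine(aligned[i], general[i]) for i in range(len(general))]
print("word 0:", round(scores[0], 3), "| other words (min):", round(min(scores[1:]), 3))
```

In this toy setup, words used the same way in both spaces end up with a similarity close to 1 after alignment, while the word whose domain vector was changed scores lower. A feature of this kind could serve as pre-computed input to a classifier, which is the general shape of the first model the abstract describes; the actual features and architectures are given in the paper itself.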
Anthology ID:
2020.acl-main.258
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2883–2889
URL:
https://aclanthology.org/2020.acl-main.258
DOI:
10.18653/v1/2020.acl-main.258
Cite (ACL):
Anna Hätty, Dominik Schlechtweg, Michael Dorna, and Sabine Schulte im Walde. 2020. Predicting Degrees of Technicality in Automatic Terminology Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2883–2889, Online. Association for Computational Linguistics.
Cite (Informal):
Predicting Degrees of Technicality in Automatic Terminology Extraction (Hätty et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.258.pdf
Video:
http://slideslive.com/38928698