A Bag-of-concepts Model Improves Relation Extraction in a Narrow Knowledge Domain with Limited Data

Jiyu Chen, Karin Verspoor, Zenan Zhai


Abstract
This paper focuses on a traditional relation extraction task in the context of limited annotated data and a narrow knowledge domain. We explore this task with a clinical corpus consisting of 200 breast cancer follow-up treatment letters in which 16 distinct types of relations are annotated. We experiment with an approach to extracting typed relations called window-bounded co-occurrence (WBC), which uses an adjustable context window around entity mentions of a relevant type, and compare its performance with a more typical intra-sentential co-occurrence baseline. We further introduce a new bag-of-concepts (BoC) approach to feature engineering based on state-of-the-art word embeddings and word synonyms. We demonstrate the competitiveness of BoC by comparing it with methods of higher complexity, and explore its effectiveness on this small dataset.
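The window-bounded co-occurrence idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Mention` type, the `window_bounded_pairs` function, the entity type names, the toy mentions, and the choice to measure the window as a symmetric distance in token positions are all assumptions made for illustration. The sketch proposes a candidate relation whenever two mentions of the requested types fall within the window of each other, regardless of sentence boundaries.

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class Mention(NamedTuple):
    text: str
    etype: str     # entity type label (hypothetical names in the toy data)
    position: int  # token index of the mention in the document (assumed unit)


def window_bounded_pairs(
    mentions: List[Mention],
    head_type: str,
    tail_type: str,
    window: int,
) -> List[Tuple[Mention, Mention]]:
    """Pair head-type and tail-type mentions at most `window` tokens apart."""
    heads = [m for m in mentions if m.etype == head_type]
    tails = [m for m in mentions if m.etype == tail_type]
    return [
        (h, t)
        for h, t in product(heads, tails)
        if abs(h.position - t.position) <= window
    ]


# Made-up toy mentions, not drawn from the paper's corpus.
mentions = [
    Mention("tamoxifen", "Drug", 3),
    Mention("20mg", "Dose", 5),
    Mention("letrozole", "Drug", 40),
    Mention("2.5mg", "Dose", 42),
]

# A narrow window keeps only nearby Drug/Dose pairs.
pairs = window_bounded_pairs(mentions, "Drug", "Dose", window=5)
for head, tail in pairs:
    print(head.text, "->", tail.text)
```

Widening `window` admits more distant pairs, which is the adjustable trade-off the abstract alludes to; how the window size is actually chosen per relation type is described in the paper itself and is not reproduced here.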
Anthology ID:
N19-3007
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Sudipta Kar, Farah Nadeem, Laura Burdick, Greg Durrett, Na-Rae Han
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
43–52
URL:
https://aclanthology.org/N19-3007
DOI:
10.18653/v1/N19-3007
Cite (ACL):
Jiyu Chen, Karin Verspoor, and Zenan Zhai. 2019. A Bag-of-concepts Model Improves Relation Extraction in a Narrow Knowledge Domain with Limited Data. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 43–52, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
A Bag-of-concepts Model Improves Relation Extraction in a Narrow Knowledge Domain with Limited Data (Chen et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-3007.pdf