Selective Annotation of Sentence Parts: Identification of Relevant Sub-sentential Units

Ge Xu, Xiaoyan Yang, Chu-Ren Huang


Abstract
Many NLP tasks involve sentence-level annotation, yet the relevant information is encoded not at the sentence level but in specific parts of the sentence. Such tasks include, but are not limited to, sentiment expression annotation, product feature annotation, and template annotation for Q&A systems. Annotating a full corpus sentence by sentence, however, is resource-intensive. In this paper, we propose an approach that iteratively extracts frequent parts of sentences for annotation and compresses the set of sentences after each round of annotation. Our approach can also be used to prepare training sentences for binary classification (domain-related vs. noise, subjective vs. objective, etc.), assuming that the sentence-type annotation can be predicted from the annotation of the most relevant sub-sentences. Two experiments are performed to test our proposal, and the results are evaluated in terms of time saved and annotation agreement.
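
The iterative extract-then-compress loop described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how sentence parts are defined or how compression works, so the sketch assumes whitespace-tokenized word n-grams as "parts" and treats "compression" as dropping every sentence that contains a part that has already been annotated. The names `ngrams`, `most_frequent_part`, `annotate_iteratively`, and the `annotate` callback are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the set of contiguous n-token parts of one sentence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def most_frequent_part(sentences, n):
    """Return (part, sentence_count) for the part that occurs in the most
    sentences, or None if no sentence is long enough to yield a part."""
    counts = Counter()
    for sent in sentences:
        # Each part is counted once per sentence (sentence frequency).
        counts.update(ngrams(sent.split(), n))
    if not counts:
        return None
    # Deterministic tie-break: higher count first, then alphabetical order.
    return min(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def annotate_iteratively(sentences, annotate, n=2, min_count=2):
    """Repeatedly annotate the most frequent part, then compress the corpus.

    ASSUMPTION: "compression" here means removing every sentence that
    contains the part just annotated; the paper may define it differently.
    `annotate` is a hypothetical callback standing in for a human annotator.
    """
    remaining = list(sentences)
    labels = {}
    while remaining:
        best = most_frequent_part(remaining, n)
        if best is None or best[1] < min_count:
            break  # nothing frequent enough is left to annotate
        part, _count = best
        text = " ".join(part)
        labels[text] = annotate(text)
        # Compression step: keep only sentences not covered by this part.
        remaining = [s for s in remaining if part not in ngrams(s.split(), n)]
    return labels, remaining
```

In this sketch, each round removes at least one sentence, so the loop always terminates; sentences left in `remaining` contain no sufficiently frequent part and would still need conventional sentence-by-sentence annotation.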
Anthology ID:
W16-5411
Volume:
Proceedings of the 12th Workshop on Asian Language Resources (ALR12)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Koiti Hasida, Kam-Fai Wong, Nicoletta Calzolari, Key-Sun Choi
Venue:
ALR
Publisher:
The COLING 2016 Organizing Committee
Pages:
86–94
URL:
https://aclanthology.org/W16-5411
Cite (ACL):
Ge Xu, Xiaoyan Yang, and Chu-Ren Huang. 2016. Selective Annotation of Sentence Parts: Identification of Relevant Sub-sentential Units. In Proceedings of the 12th Workshop on Asian Language Resources (ALR12), pages 86–94, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Selective Annotation of Sentence Parts: Identification of Relevant Sub-sentential Units (Xu et al., ALR 2016)
PDF:
https://aclanthology.org/W16-5411.pdf