A Linguistic Perspective on Reference: Choosing a Feature Set for Generating Referring Expressions in Context

Fahime Same, Kees van Deemter


Abstract
This paper reports on a structured evaluation of feature-based Machine Learning algorithms for selecting the form of a referring expression in discourse context. Based on this evaluation, we selected seven feature sets from the literature, amounting to 65 distinct linguistic features. The features were then grouped into 9 broad classes. After building Random Forest models, we used Feature Importance Ranking and Sequential Forward Search methods to assess the “importance” of the features. Combining the results of the two methods, we propose a consensus feature set. The 6 features in our consensus set come from 4 different classes, namely grammatical role, inherent features of the referent, antecedent form and recency.
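As a rough illustration of the selection step described in the abstract, the sketch below implements a generic greedy Sequential Forward Search and combines its output with a precomputed importance ranking. It is not the authors' code. The feature names, the toy weights, the scoring function `toy_score` (a stand-in for evaluating a Random Forest trained on a feature subset), and the rule of intersecting the top-ranked features with the searched set are all assumptions made for this example.

```python
from typing import Callable, Iterable, List, Sequence

# Hypothetical usefulness values; in practice these would come from a trained model.
TOY_WEIGHTS = {
    "grammatical_role": 0.30,
    "antecedent_form": 0.25,
    "animacy": 0.20,
    "recency": 0.15,
    "noise": 0.0,
}


def toy_score(subset: List[str]) -> float:
    """Stand-in for model accuracy: summed weights minus a small size penalty."""
    return sum(TOY_WEIGHTS[f] for f in subset) - 0.01 * len(subset)


def sequential_forward_search(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    max_features: int,
) -> List[str]:
    """Greedily add the feature that most improves `score`.

    Stops when no remaining feature improves the score or when
    `max_features` features have been selected.
    """
    selected: List[str] = []
    best = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        scored = [(score(selected + [f]), f) for f in candidates]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no candidate improves on the current subset
        selected.append(top_feature)
        best = top_score
    return selected


def consensus(ranked: Sequence[str], searched: Iterable[str], top_k: int) -> List[str]:
    """Keep features that are in the top_k of the ranking AND chosen by the search.

    This intersection rule is an assumption for illustration only.
    """
    chosen = set(searched)
    return [f for f in ranked[:top_k] if f in chosen]


features = list(TOY_WEIGHTS)
# Stand-in for a feature importance ranking: sort by the toy weights.
ranking = sorted(features, key=lambda f: TOY_WEIGHTS[f], reverse=True)
selected = sequential_forward_search(features, toy_score, max_features=5)
agreed = consensus(ranking, selected, top_k=3)
print(selected)
print(agreed)
```

With a real model, `toy_score` would be replaced by a cross-validated evaluation of a classifier restricted to the candidate subset, and `ranking` by the importances that the fitted model reports.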
Anthology ID:
2020.coling-main.403
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
4575–4586
URL:
https://aclanthology.org/2020.coling-main.403
DOI:
10.18653/v1/2020.coling-main.403
Cite (ACL):
Fahime Same and Kees van Deemter. 2020. A Linguistic Perspective on Reference: Choosing a Feature Set for Generating Referring Expressions in Context. In Proceedings of the 28th International Conference on Computational Linguistics, pages 4575–4586, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
A Linguistic Perspective on Reference: Choosing a Feature Set for Generating Referring Expressions in Context (Same & van Deemter, COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.403.pdf
Data
OntoNotes 5.0