Unsupervised Declarative Knowledge Induction for Constraint-Based Learning of Information Structure in Scientific Documents

Yufan Guo, Roi Reichart, Anna Korhonen


Abstract
Inferring the information structure of scientific documents is useful for many NLP applications. Existing approaches to this task require substantial human effort. We propose a framework for constraint learning that reduces human involvement considerably. Our model uses topic models to identify latent topics and their key linguistic features in input documents, induces constraints from this information and maps sentences to their dominant information structure categories through a constrained unsupervised model. When the induced constraints are combined with a fully unsupervised model, the resulting model challenges existing lightly supervised feature-based models as well as unsupervised models that use manually constructed declarative knowledge. Our results demonstrate that useful declarative knowledge can be learned from data with very limited human involvement.
Anthology ID:
Q15-1010
Volume:
Transactions of the Association for Computational Linguistics, Volume 3
Year:
2015
Address:
Cambridge, MA
Editors:
Michael Collins, Lillian Lee
Venue:
TACL
Publisher:
MIT Press
Pages:
131–143
URL:
https://aclanthology.org/Q15-1010/
DOI:
10.1162/tacl_a_00128
Cite (ACL):
Yufan Guo, Roi Reichart, and Anna Korhonen. 2015. Unsupervised Declarative Knowledge Induction for Constraint-Based Learning of Information Structure in Scientific Documents. Transactions of the Association for Computational Linguistics, 3:131–143.
Cite (Informal):
Unsupervised Declarative Knowledge Induction for Constraint-Based Learning of Information Structure in Scientific Documents (Guo et al., TACL 2015)
PDF:
https://aclanthology.org/Q15-1010.pdf