Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis

Shrey Desai, Barea Sinno, Alex Rosenfeld, Junyi Jessy Li


Abstract
Insightful findings in political science often require researchers to analyze documents of a certain subject or type, yet these documents are usually contained in large corpora that do not distinguish between pertinent and non-pertinent documents. In contrast, we can find corpora that label relevant documents but have limitations (e.g., from a single source or era), preventing their use for political science research. To bridge this gap, we present adaptive ensembling, an unsupervised domain adaptation framework, equipped with a novel text classification model and time-aware training to ensure our methods work well with diachronic corpora. Experiments on an expert-annotated dataset show that our framework outperforms strong benchmarks. Further analysis indicates that our methods are more stable, learn better representations, and extract cleaner corpora for fine-grained analysis.
Anthology ID:
D19-1478
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4718–4730
URL:
https://aclanthology.org/D19-1478
DOI:
10.18653/v1/D19-1478
Cite (ACL):
Shrey Desai, Barea Sinno, Alex Rosenfeld, and Junyi Jessy Li. 2019. Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4718–4730, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis (Desai et al., EMNLP-IJCNLP 2019)
PDF:
https://aclanthology.org/D19-1478.pdf
Attachment:
D19-1478.Attachment.pdf
Code:
shreydesai/adaptive-ensembling
Data:
New York Times Annotated Corpus