Context-aware Information-theoretic Causal De-biasing for Interactive Sequence Labeling

Junda Wu, Rui Wang, Tong Yu, Ruiyi Zhang, Handong Zhao, Shuai Li, Ricardo Henao, Ani Nenkova


Abstract
Supervised training of existing deep learning models for sequence labeling relies on large-scale labeled datasets, which are generally created with crowd-sourced labeling. However, crowd-sourced labeling for sequence labeling tasks can be expensive and time-consuming, and labeling by external annotators may be inappropriate for data that contains private user information. Given these limitations of crowd-sourced labeling, we study interactive sequence labeling, which allows training directly from user feedback, alleviating annotation cost and preserving user privacy. By formulating interactive sequence labeling via a Structural Causal Model (SCM), we identify two biases, namely context bias and feedback bias. To alleviate these biases based on the SCM, we identify frequent context tokens as confounders for backdoor adjustment and further propose an entropy-based modulation, inspired by information theory, that enables the model to learn entities more sample-efficiently. With extensive experiments, we validate that our approach effectively alleviates the biases and that our models can be learned efficiently from user feedback.
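To make the entropy-based modulation idea concrete, the following is a minimal, hypothetical sketch (not the paper's actual implementation): each token's cross-entropy loss is reweighted by the normalized entropy of the model's predictive distribution, so high-uncertainty tokens contribute more to the update. The function name and the exact weighting scheme are illustrative assumptions.

```python
import math

def entropy_modulated_loss(token_probs, gold_indices):
    """Illustrative sketch of entropy-based loss modulation.

    token_probs:  list of per-token probability distributions over tags.
    gold_indices: list of gold tag indices, one per token.
    NOTE: this is an assumed formulation for illustration, not the
    paper's exact method.
    """
    total = 0.0
    for probs, gold in zip(token_probs, gold_indices):
        # Predictive entropy H(p) = -sum_i p_i log p_i of the tag distribution.
        h = -sum(p * math.log(p) for p in probs if p > 0.0)
        # Normalize by the maximum entropy log(K) so the weight lies in [0, 1].
        weight = h / math.log(len(probs))
        # Standard per-token cross-entropy for the gold tag.
        ce = -math.log(probs[gold])
        # Uncertain (high-entropy) tokens receive larger weight.
        total += weight * ce
    return total / len(token_probs)
```

Under this scheme, a token the model already predicts confidently contributes little, while ambiguous tokens dominate the loss, which is one plausible way to spend limited user feedback more sample-efficiently.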
Anthology ID:
2022.findings-emnlp.251
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3436–3448
URL:
https://aclanthology.org/2022.findings-emnlp.251
DOI:
10.18653/v1/2022.findings-emnlp.251
Cite (ACL):
Junda Wu, Rui Wang, Tong Yu, Ruiyi Zhang, Handong Zhao, Shuai Li, Ricardo Henao, and Ani Nenkova. 2022. Context-aware Information-theoretic Causal De-biasing for Interactive Sequence Labeling. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 3436–3448, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Context-aware Information-theoretic Causal De-biasing for Interactive Sequence Labeling (Wu et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.251.pdf
Video:
https://aclanthology.org/2022.findings-emnlp.251.mp4