Counterfactual Inference for Text Classification Debiasing

Chen Qian, Fuli Feng, Lijie Wen, Chunping Ma, Pengjun Xie


Abstract
Today’s text classifiers inevitably suffer from unintended dataset biases, especially the document-level label bias and word-level keyword bias, which may hurt models’ generalization. Many previous studies employed data-level manipulations or model-level balancing mechanisms to recover unbiased distributions and thus prevent models from capturing the two types of biases. Unfortunately, they either suffer from the extra cost of data collection/selection/annotation or need an elaborate design of balancing strategies. Different from traditional factual inference, in which debiasing occurs before or during training, counterfactual inference mitigates the influence brought by unintended confounders after training, and can thus make unbiased decisions from biased observations. Inspired by this, we propose a model-agnostic text classification debiasing framework, Corsair, which avoids both data manipulations and hand-designed balancing mechanisms. Concretely, Corsair first trains a base model on a training set directly, allowing the dataset biases to ‘poison’ the trained model. At inference time, given a factual input document, Corsair imagines its two counterfactual counterparts to distill and mitigate the two biases captured by the poisonous model. Extensive experiments demonstrate Corsair’s effectiveness, generalizability and fairness.
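The inference-time idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the two counterfactual documents are built or how their outputs are combined, so everything below is an assumption made for illustration: the `[MASK]` placeholder, the `is_keyword` predicate, the subtraction with weights `alpha` and `beta`, and the toy model are all hypothetical and are not taken from the paper or from the qianc62/corsair repository.

```python
from typing import Callable, List, Sequence

MASK = "[MASK]"  # hypothetical placeholder token (assumption)

Model = Callable[[Sequence[str]], List[float]]  # tokens -> per-class scores


def fully_masked(tokens: Sequence[str]) -> List[str]:
    """Counterfactual with every token hidden; the output can only reflect label bias."""
    return [MASK] * len(tokens)


def context_masked(tokens: Sequence[str], is_keyword: Callable[[str], bool]) -> List[str]:
    """Counterfactual keeping only keywords; the output reflects keyword bias."""
    return [t if is_keyword(t) else MASK for t in tokens]


def debiased_scores(model: Model, tokens: Sequence[str],
                    is_keyword: Callable[[str], bool],
                    alpha: float = 1.0, beta: float = 1.0) -> List[float]:
    """Subtract the two counterfactual outputs from the factual one (assumed combination rule)."""
    factual = model(tokens)
    label_bias = model(fully_masked(tokens))
    keyword_bias = model(context_masked(tokens, is_keyword))
    return [f - alpha * l - beta * k
            for f, l, k in zip(factual, label_bias, keyword_bias)]


def argmax(scores: Sequence[float]) -> int:
    return max(range(len(scores)), key=scores.__getitem__)


def toy_model(tokens: Sequence[str]) -> List[float]:
    """A deliberately 'poisoned' scorer over classes [negative, positive]."""
    neg, pos = 0.0, 1.0          # label bias: prior favours "positive"
    for t in tokens:
        if t == "dull":
            neg += 1.0           # genuine evidence for "negative"
        if t == "movie":
            pos += 0.5           # spurious keyword bias towards "positive"
    return [neg, pos]


doc = ["dull", "movie"]
factual_pred = argmax(toy_model(doc))
debiased_pred = argmax(debiased_scores(toy_model, doc, lambda t: t == "movie"))
```

In this toy setting the biased model prefers the positive class for the factual document, while the debiased scores prefer the negative class. The base model is never retrained; only its outputs on the factual and counterfactual inputs are combined, which is what makes an approach of this kind model-agnostic.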
Anthology ID:
2021.acl-long.422
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
5434–5445
URL:
https://aclanthology.org/2021.acl-long.422
DOI:
10.18653/v1/2021.acl-long.422
Cite (ACL):
Chen Qian, Fuli Feng, Lijie Wen, Chunping Ma, and Pengjun Xie. 2021. Counterfactual Inference for Text Classification Debiasing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5434–5445, Online. Association for Computational Linguistics.
Cite (Informal):
Counterfactual Inference for Text Classification Debiasing (Qian et al., ACL 2021)
PDF:
https://aclanthology.org/2021.acl-long.422.pdf
Video:
https://aclanthology.org/2021.acl-long.422.mp4
Code:
qianc62/corsair