Counterfactual Augmentation for Multimodal Learning Under Presentation Bias

Victoria Lin, Louis-Philippe Morency, Dimitrios Dimitriadis, Srinagesh Sharma


Abstract
In real-world machine learning systems, labels are often derived from user behaviors that the system wishes to encourage. Over time, new models must be trained as new training examples and features become available. However, feedback loops between users and models can bias future user behavior, inducing a *presentation bias* in the labels that compromises the ability to train new models. In this paper, we propose *counterfactual augmentation*, a novel causal method for correcting presentation bias using generated counterfactual labels. Our empirical evaluations demonstrate that counterfactual augmentation yields better downstream performance compared to both uncorrected models and existing bias-correction methods. Model analyses further indicate that the generated counterfactuals align closely with true counterfactuals in an oracle setting.
Anthology ID: 2023.findings-emnlp.43
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 592–606
URL: https://aclanthology.org/2023.findings-emnlp.43
DOI: 10.18653/v1/2023.findings-emnlp.43
Cite (ACL):
Victoria Lin, Louis-Philippe Morency, Dimitrios Dimitriadis, and Srinagesh Sharma. 2023. Counterfactual Augmentation for Multimodal Learning Under Presentation Bias. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 592–606, Singapore. Association for Computational Linguistics.
Cite (Informal):
Counterfactual Augmentation for Multimodal Learning Under Presentation Bias (Lin et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-emnlp.43.pdf