Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference

Yuxia Wang, Minghan Wang, Yimeng Chen, Shimin Tao, Jiaxin Guo, Chang Su, Min Zhang, Hao Yang


Abstract
Natural Language Inference (NLI) datasets contain examples with highly ambiguous labels due to the subjectivity of the task. Several recent efforts have been made to acknowledge and embrace this ambiguity, and to explore how to capture the human disagreement distribution. In contrast with directly learning from gold ambiguity labels, which relies on special resources, we argue that a model naturally captures the human ambiguity distribution as long as it is calibrated, i.e., its predictive probability reflects the true correctness likelihood. Our experiments show that when a model is well-calibrated, either by label smoothing or temperature scaling, it obtains performance competitive with prior work, both on divergence scores between the predictive probability and the true human opinion distribution, and on accuracy. This reveals that the overhead of collecting gold ambiguity labels can be cut by broadly solving how to calibrate the NLI network.
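The two calibration methods the abstract names are standard. The sketch below illustrates them in NumPy under assumptions of our own: the logits and the annotator label distribution are hypothetical three-way NLI values (entailment, neutral, contradiction), not from the paper's models or the ChaosNLI data.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def temperature_scale(logits, T):
    # T > 1 softens the distribution (less overconfident), T < 1 sharpens it.
    return softmax(logits / T)

def label_smooth(one_hot, eps):
    # Mix the one-hot target with a uniform distribution; used as a
    # training target to discourage overconfident predictions.
    k = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / k

def kl_divergence(p, q):
    # KL(p || q), one of the divergence scores used to compare a
    # predicted distribution against the human opinion distribution.
    return float(np.sum(p * np.log(p / q)))

# Hypothetical logits for one premise-hypothesis pair.
logits = np.array([2.0, 0.5, -1.0])
# Hypothetical human label distribution from multiple annotators.
human = np.array([0.6, 0.3, 0.1])

for T in (1.0, 2.0):
    probs = temperature_scale(logits, T)
    print(f"T={T}: probs={np.round(probs, 3)}, "
          f"KL(human||probs)={kl_divergence(human, probs):.4f}")
```

In practice the temperature T is fit on a held-out set after training, while label smoothing changes the training targets themselves; both leave the argmax prediction (and hence accuracy) largely intact while reshaping the probability mass.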
Anthology ID:
2022.findings-acl.120
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1524–1535
URL:
https://aclanthology.org/2022.findings-acl.120
DOI:
10.18653/v1/2022.findings-acl.120
Cite (ACL):
Yuxia Wang, Minghan Wang, Yimeng Chen, Shimin Tao, Jiaxin Guo, Chang Su, Min Zhang, and Hao Yang. 2022. Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1524–1535, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference (Wang et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.120.pdf
Data
ChaosNLI, MultiNLI