Title: Comparing Test Sets with Item Response Theory
Authors: Clara Vania, Phu Mon Htut, William Huang, Dhara Mungra, Richard Yuanzhe Pang, Jason Phang, Haokun Liu, Kyunghyun Cho, Samuel R Bowman
Date: 2021-08
Type: text (conference publication)
Published in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Publisher: Association for Computational Linguistics
Place: Online
Pages: 1141-1158
Identifier: vania-etal-2021-comparing
DOI: 10.18653/v1/2021.acl-long.92
URL: https://aclanthology.org/2021.acl-long.92/