Title: Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study
Authors: John P Lalor, Hao Wu, Tsendsuren Munkhdalai, Hong Yu
Published in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Publisher: Association for Computational Linguistics
Location: Brussels, Belgium
Date: October-November 2018
Resource type: text
Genre: conference publication
Pages: 4711–4716
Identifier: lalor-etal-2018-understanding
DOI: 10.18653/v1/D18-1500
URL: https://aclanthology.org/D18-1500/