Zhiwei He, Xing Wang, Wenxiang Jiao, Zhuosheng Zhang, Rui Wang, Shuming Shi, and Zhaopeng Tu. Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), edited by Kevin Duh, Helena Gomez, and Steven Bethard, pages 8164-8180, Mexico City, Mexico, June 2024. Association for Computational Linguistics. Conference publication. Citation key: he-etal-2024-improving. DOI: 10.18653/v1/2024.naacl-long.451. URL: https://aclanthology.org/2024.naacl-long.451/