RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs

Authors: Afra Feyza Akyurek, Ekin Akyurek, Ashwin Kalyan, Peter Clark, Derry Tanti Wijaya, Niket Tandon
Published in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Publisher: Association for Computational Linguistics
Location: Toronto, Canada
Date: 2023-07
Type: conference publication
Pages: 7716-7733
Identifier: akyurek-etal-2023-rl4f
DOI: 10.18653/v1/2023.acl-long.427
URL: https://aclanthology.org/2023.acl-long.427/