Title: Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control
Authors: Xiang Fan, Yiwei Lyu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency
Date: 2023-07
Published in: Findings of the Association for Computational Linguistics: ACL 2023
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Publisher: Association for Computational Linguistics
Location: Toronto, Canada
Genre: conference publication
Citation key: fan-etal-2023-nano
DOI: 10.18653/v1/2023.findings-acl.758
URL: https://aclanthology.org/2023.findings-acl.758/
Pages: 11970-11992