Title: Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding
Authors: Weilin Zhao, Yuxiang Huang, Xu Han, Wang Xu, Chaojun Xiao, Xinrong Zhang, Yewei Fang, Kaihuo Zhang, Zhiyuan Liu, Maosong Sun
Date: 2024-11
Type: conference publication
Published in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Publisher: Association for Computational Linguistics
Location: Miami, Florida, USA
Pages: 13378-13393
Identifier: zhao-etal-2024-ouroboros
DOI: 10.18653/v1/2024.emnlp-main.742
URL: https://aclanthology.org/2024.emnlp-main.742/