Title: Deferred NAM: Low-latency Top-K Context Injection via Deferred Context Encoding for Non-Streaming ASR
Authors: Zelin Wu, Gan Song, Christopher Li, Pat Rondon, Zhong Meng, Xavier Velez, Weiran Wang, Diamantino Caseiro, Golan Pundak, Tsendsuren Munkhdalai, Angad Chandorkar, Rohit Prabhavalkar
Date: 2024-06
Type: conference publication
Published in: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)
Editors: Yi Yang, Aida Davani, Avi Sil, Anoop Kumar
Publisher: Association for Computational Linguistics
Location: Mexico City, Mexico
Pages: 315-323
Citation key: wu-etal-2024-deferred
DOI: 10.18653/v1/2024.naacl-industry.26
URL: https://aclanthology.org/2024.naacl-industry.26/