Yankai Jiang


2024

Sprout: Green Generative AI with Carbon-Efficient LLM Inference
Baolin Li | Yankai Jiang | Vijay Gadepally | Devesh Tiwari
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

The rapid advancement of generative AI has heightened environmental concerns, particularly regarding carbon emissions. Our framework, Sprout, addresses these challenges by reducing the carbon footprint of inference in large language models (LLMs). Sprout introduces “generation directives” to guide the autoregressive generation process, balancing ecological sustainability against high-quality outputs. By employing a strategic optimizer for directive assignment and a novel offline quality evaluator, Sprout reduces the carbon footprint of generative LLM inference by over 40% in real-world evaluations using the Llama model and global electricity grid data. This work is timely: rising interest in inference-time compute scaling amplifies environmental concerns and underscores the need for eco-friendly AI solutions.
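To make the core idea concrete, here is a minimal, purely illustrative sketch of carbon-aware directive selection. This is not Sprout's actual optimizer or quality evaluator; the directive names, quality/energy numbers, and the scoring formula are all hypothetical, intended only to show how a directive might be chosen as a quality-versus-carbon tradeoff given the current grid carbon intensity.

```python
# Toy model of assigning a "generation directive" based on grid carbon
# intensity. Everything here (fields, weights, numbers) is hypothetical,
# not taken from the Sprout paper.

def select_directive(carbon_intensity_g_per_kwh, directives, quality_weight=0.5):
    """Pick the directive maximizing a weighted quality/carbon tradeoff.

    directives: list of dicts with hypothetical fields:
      name, rel_quality (0..1), rel_energy (relative energy per response, 0..1)
    """
    best = None
    best_score = float("-inf")
    for d in directives:
        # A dirtier grid penalizes energy-hungry directives more heavily.
        carbon_cost = d["rel_energy"] * carbon_intensity_g_per_kwh
        score = (quality_weight * d["rel_quality"]
                 - (1 - quality_weight) * carbon_cost / 500.0)
        if score > best_score:
            best_score = score
            best = d["name"]
    return best

# Hypothetical directives: a full-length response vs. a more concise one
# that costs less energy per generated answer.
directives = [
    {"name": "full", "rel_quality": 1.00, "rel_energy": 1.0},
    {"name": "concise", "rel_quality": 0.85, "rel_energy": 0.5},
]

# On a clean grid, quality dominates; on a carbon-intensive grid, the
# cheaper concise directive wins.
print(select_directive(100, directives))  # "full"
print(select_directive(800, directives))  # "concise"
```

The point of the sketch is the decision structure, not the numbers: a real system like Sprout would derive directive quality from an offline evaluator and react to live electricity-grid carbon data rather than fixed constants.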