Sprout: Green Generative AI with Carbon-Efficient LLM Inference

Baolin Li, Yankai Jiang, Vijay Gadepally, Devesh Tiwari


Abstract
The rapid advancement of generative AI has heightened environmental concerns, particularly regarding carbon emissions. Our framework, Sprout, addresses these challenges by reducing the carbon footprint of inference in large language models (LLMs). Sprout introduces “generation directives” to guide the autoregressive generation process, achieving a balance between ecological sustainability and high-quality outputs. By employing a strategic optimizer for directive assignment and a novel offline quality evaluator, Sprout reduces the carbon footprint of generative LLM inference by over 40% in real-world evaluations, using the Llama model and global electricity grid data. This work is crucial as the rising interest in inference time compute scaling laws amplifies environmental concerns, emphasizing the need for eco-friendly AI solutions.
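The abstract describes an optimizer that assigns "generation directives" to balance carbon cost against output quality. As a rough illustration of that idea, the hypothetical sketch below picks the lowest-emission directive whose offline-evaluated quality clears a threshold; the directive names, token counts, quality scores, and energy figures are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of carbon-aware directive selection, loosely inspired
# by the abstract's description of Sprout. All numbers are made up.

DIRECTIVES = {
    # directive: (offline-evaluated quality score, expected generated tokens)
    "full": (1.00, 512),      # unconstrained autoregressive generation
    "balanced": (0.95, 256),  # moderately shortened responses
    "concise": (0.88, 128),   # aggressively shortened responses
}

def select_directive(carbon_intensity_g_per_kwh: float,
                     energy_per_token_kwh: float,
                     min_quality: float = 0.9) -> str:
    """Return the directive with the lowest expected CO2 emissions
    whose quality score meets the minimum threshold."""
    feasible = {d: tokens for d, (q, tokens) in DIRECTIVES.items()
                if q >= min_quality}
    # Expected grams of CO2 per response: tokens * energy/token * grid intensity
    emissions = {d: t * energy_per_token_kwh * carbon_intensity_g_per_kwh
                 for d, t in feasible.items()}
    return min(emissions, key=emissions.get)
```

Under these assumed numbers, a quality floor of 0.9 rules out "concise", and "balanced" wins over "full" because it generates half the tokens; raising the floor to 0.99 forces the "full" directive regardless of grid carbon intensity.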
Anthology ID:
2024.emnlp-main.1215
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
21799–21813
URL:
https://aclanthology.org/2024.emnlp-main.1215
Cite (ACL):
Baolin Li, Yankai Jiang, Vijay Gadepally, and Devesh Tiwari. 2024. Sprout: Green Generative AI with Carbon-Efficient LLM Inference. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 21799–21813, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Sprout: Green Generative AI with Carbon-Efficient LLM Inference (Li et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.1215.pdf