Standardize: Aligning Language Models with Expert-Defined Standards for Content Generation

Joseph Marvin Imperial; Gail Forey; Harish Tayyar Madabushi

doi:10.18653/v1/2024.emnlp-main.94

Abstract

Domain experts across engineering, healthcare, and education follow strict standards for producing quality content such as technical manuals, medication instructions, and children’s reading materials. However, current works in controllable text generation have yet to explore using these standards as references for control. Towards this end, we introduce Standardize, a retrieval-style, in-context-learning-based framework to guide large language models to align with expert-defined standards. Focusing on English language standards in the education domain as a use case, we consider the Common European Framework of Reference for Languages (CEFR) and Common Core Standards (CCS) for the task of open-ended content generation. Our findings show that models can gain a 45% to 100% increase in precise accuracy across the open and commercial LLMs evaluated, demonstrating that extracting knowledge artifacts from standards and integrating them into the generation process can effectively guide models to produce better standard-aligned content.

Anthology ID: 2024.emnlp-main.94
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 1573–1594
URL: https://aclanthology.org/2024.emnlp-main.94/
DOI: 10.18653/v1/2024.emnlp-main.94
Cite (ACL): Joseph Marvin Imperial, Gail Forey, and Harish Tayyar Madabushi. 2024. Standardize: Aligning Language Models with Expert-Defined Standards for Content Generation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 1573–1594, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Standardize: Aligning Language Models with Expert-Defined Standards for Content Generation (Imperial et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.94.pdf
Software: 2024.emnlp-main.94.software.zip
Data: 2024.emnlp-main.94.data.zip
