CrystalICL: Enabling In-Context Learning for Crystal Generation

Ruobing Wang, Qiaoyu Tan, Yili Wang, Ying Wang, Xin Wang


Abstract
Designing crystal materials with desired physicochemical properties remains a fundamental challenge in materials science. While large language models (LLMs) have demonstrated strong in-context learning (ICL) capabilities, existing LLM-based crystal generation approaches are limited to zero-shot scenarios and cannot benefit from few-shot examples. In contrast, human experts typically design new materials by modifying relevant known structures, a process that aligns closely with the few-shot ICL paradigm. Motivated by this, we propose CrystalICL, a novel model designed for few-shot crystal generation. Specifically, we introduce a space-group based crystal tokenization method, which effectively reduces the complexity of modeling crystal symmetry in LLMs. We further introduce a condition-structure aware hybrid instruction tuning framework and a multi-task instruction tuning strategy, enabling the model to better exploit ICL by capturing structure-property relationships from limited data. Extensive experiments on four crystal generation benchmarks demonstrate the superiority of CrystalICL over leading baseline methods on both conditional and unconditional generation tasks.
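To give a concrete sense of the space-group idea in the abstract, the toy sketch below serializes a crystal into a flat token sequence that states the space group once up front, so symmetry-equivalent positions need not be spelled out atom by atom. The dictionary fields, token format, and `tokenize_crystal` function are our own illustration and assumptions, not the paper's actual tokenizer.

```python
def tokenize_crystal(crystal):
    """Serialize a crystal into a flat token list, leading with the
    space group so symmetry is encoded once rather than implied by
    every atomic coordinate (illustrative scheme only)."""
    tokens = [f"<sg_{crystal['space_group']}>"]
    # Lattice parameters: a, b, c (angstroms), alpha, beta, gamma (degrees).
    tokens += [f"{p:.2f}" for p in crystal["lattice"]]
    # One entry per symmetry-unique site: element, Wyckoff label,
    # fractional coordinates of the representative position.
    for element, wyckoff, frac in crystal["sites"]:
        tokens += [element, wyckoff] + [f"{x:.3f}" for x in frac]
    return tokens

# Rock salt (NaCl), space group 225 (Fm-3m): two symmetry-unique
# sites suffice to describe the full 8-atom conventional cell.
nacl = {
    "space_group": 225,
    "lattice": [5.64, 5.64, 5.64, 90.0, 90.0, 90.0],
    "sites": [("Na", "4a", (0.0, 0.0, 0.0)),
              ("Cl", "4b", (0.5, 0.5, 0.5))],
}
print(tokenize_crystal(nacl))
```

Compared with listing all atoms in the conventional cell, this kind of symmetry-aware serialization yields much shorter sequences for high-symmetry structures, which is the complexity reduction the abstract alludes to.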
Anthology ID:
2025.emnlp-main.929
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
18440–18455
URL:
https://aclanthology.org/2025.emnlp-main.929/
Cite (ACL):
Ruobing Wang, Qiaoyu Tan, Yili Wang, Ying Wang, and Xin Wang. 2025. CrystalICL: Enabling In-Context Learning for Crystal Generation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 18440–18455, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
CrystalICL: Enabling In-Context Learning for Crystal Generation (Wang et al., EMNLP 2025)
PDF:
https://aclanthology.org/2025.emnlp-main.929.pdf
Checklist:
https://aclanthology.org/2025.emnlp-main.929.checklist.pdf