CrystalICL: Enabling In-Context Learning for Crystal Generation

Ruobing Wang, Qiaoyu Tan, Yili Wang, Ying Wang, Xin Wang


Abstract
Designing crystal materials with desired physicochemical properties remains a fundamental challenge in materials science. While large language models (LLMs) have demonstrated strong in-context learning (ICL) capabilities, existing LLM-based crystal generation approaches are limited to zero-shot scenarios and cannot benefit from few-shot examples. In contrast, human experts typically design new materials by modifying relevant known structures, a process that aligns closely with the few-shot ICL paradigm. Motivated by this, we propose CrystalICL, a novel model designed for few-shot crystal generation. Specifically, we introduce a space-group based crystal tokenization method, which effectively reduces the complexity of modeling crystal symmetry in LLMs. We further introduce a condition-structure aware hybrid instruction tuning framework and a multi-task instruction tuning strategy, enabling the model to better exploit ICL by capturing structure-property relationships from limited data. Extensive experiments on four crystal generation benchmarks demonstrate the superiority of CrystalICL over leading baseline methods on both conditional and unconditional generation tasks.
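To give a concrete sense of the space-group idea in the abstract, the toy sketch below serializes a crystal into a flat token sequence that states the space group once up front, so symmetry-equivalent positions need not be spelled out atom by atom. The dictionary fields, token format, and `tokenize_crystal` function are our own illustration and assumptions, not the paper's actual tokenizer.

```python
def tokenize_crystal(crystal):
    """Serialize a crystal into a flat token list, leading with the
    space group so symmetry is encoded once rather than implied by
    every atomic coordinate (illustrative scheme only)."""
    tokens = [f"<sg_{crystal['space_group']}>"]
    # Lattice parameters: a, b, c (angstroms), alpha, beta, gamma (degrees).
    tokens += [f"{p:.2f}" for p in crystal["lattice"]]
    # One entry per symmetry-unique site: element, Wyckoff label,
    # fractional coordinates of the representative position.
    for element, wyckoff, frac in crystal["sites"]:
        tokens += [element, wyckoff] + [f"{x:.3f}" for x in frac]
    return tokens

# Rock salt (NaCl), space group 225 (Fm-3m): two symmetry-unique
# sites suffice to describe the full 8-atom conventional cell.
nacl = {
    "space_group": 225,
    "lattice": [5.64, 5.64, 5.64, 90.0, 90.0, 90.0],
    "sites": [("Na", "4a", (0.0, 0.0, 0.0)),
              ("Cl", "4b", (0.5, 0.5, 0.5))],
}
print(tokenize_crystal(nacl))
```

Compared with listing all atoms in the conventional cell, this kind of symmetry-aware serialization yields much shorter sequences for high-symmetry structures, which is the complexity reduction the abstract alludes to.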
Anthology ID:
2025.emnlp-main.929
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
18440–18455
URL:
https://aclanthology.org/2025.emnlp-main.929/
Cite (ACL):
Ruobing Wang, Qiaoyu Tan, Yili Wang, Ying Wang, and Xin Wang. 2025. CrystalICL: Enabling In-Context Learning for Crystal Generation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 18440–18455, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
CrystalICL: Enabling In-Context Learning for Crystal Generation (Wang et al., EMNLP 2025)
PDF:
https://aclanthology.org/2025.emnlp-main.929.pdf
Checklist:
https://aclanthology.org/2025.emnlp-main.929.checklist.pdf