Could Chemical Language Models benefit from Message Passing

Jiaqing Xie, Ziheng Chi


Abstract
Pretrained language models (LMs) show significant capabilities in processing molecular text, while message passing neural networks (MPNNs) demonstrate robustness and versatility in molecular science. Despite these advances, few studies have investigated the bidirectional interactions between molecular structures and their corresponding textual representations. In this paper, we therefore propose two strategies to evaluate whether integrating the two sources of information enhances performance: contrastive learning, in which an MPNN supervises the training of the LM, and fusion, which combines information from both models. Our empirical analysis reveals that the integration approaches outperform baselines on smaller molecular graphs, while yielding no performance gains on large-scale graphs.
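The sketch below is an illustrative reading of the two strategies named in the abstract, not the authors' implementation: a toy MPNN produces a graph-level embedding, an InfoNCE-style contrastive loss aligns it with an LM's text embedding, and a small fusion head concatenates both embeddings for prediction. Module names, dimensions, and the single-round mean-pooled MPNN are assumptions for clarity.

```python
# Hedged sketch of (1) contrastive graph-text alignment and (2) late fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMPNN(nn.Module):
    """One round of message passing followed by mean pooling (illustrative only)."""
    def __init__(self, node_dim, hidden_dim):
        super().__init__()
        self.msg = nn.Linear(node_dim, hidden_dim)
        self.upd = nn.Linear(node_dim + hidden_dim, hidden_dim)

    def forward(self, x, adj):
        # x: [num_nodes, node_dim]; adj: [num_nodes, num_nodes] adjacency matrix
        messages = adj @ self.msg(x)                      # aggregate neighbor messages
        h = F.relu(self.upd(torch.cat([x, messages], dim=-1)))
        return h.mean(dim=0)                              # graph-level embedding

def contrastive_loss(graph_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched (graph, text) pairs are positives."""
    g = F.normalize(graph_emb, dim=-1)                    # [batch, dim]
    t = F.normalize(text_emb, dim=-1)                     # [batch, dim]
    logits = g @ t.T / temperature                        # pairwise similarities
    labels = torch.arange(g.size(0), device=g.device)
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2

class FusionHead(nn.Module):
    """Late fusion: concatenate graph and text embeddings for property prediction."""
    def __init__(self, dim, num_classes):
        super().__init__()
        self.proj = nn.Linear(2 * dim, num_classes)

    def forward(self, graph_emb, text_emb):
        return self.proj(torch.cat([graph_emb, text_emb], dim=-1))
```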
Anthology ID:
2024.langmol-1.2
Volume:
Proceedings of the 1st Workshop on Language + Molecules (L+M 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Carl Edwards, Qingyun Wang, Manling Li, Lawrence Zhao, Tom Hope, Heng Ji
Venues:
LangMol | WS
Publisher:
Association for Computational Linguistics
Pages:
10–20
URL:
https://aclanthology.org/2024.langmol-1.2
Cite (ACL):
Jiaqing Xie and Ziheng Chi. 2024. Could Chemical Language Models benefit from Message Passing. In Proceedings of the 1st Workshop on Language + Molecules (L+M 2024), pages 10–20, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Could Chemical Language Models benefit from Message Passing (Xie & Chi, LangMol-WS 2024)
PDF:
https://aclanthology.org/2024.langmol-1.2.pdf