NLPeople at L+M-24 Shared Task: An Ensembled Approach for Molecule Captioning from SMILES

Shinnosuke Tanaka, Carol Mak, Flaviu Cipcigan, James Barry, Mohab Elkaref, Movina Moses, Vishnudev Kuruvanthodi, Geeth Mel


Abstract
This paper presents our submission to the Molecular Captioning track of the Language + Molecules 2024 (L+M-24) Shared Task. The task involves generating captions that describe the properties of molecules provided in SMILES format. We propose a method that decomposes the challenge of generating captions from SMILES into a classification problem, where we first predict the molecule’s properties. Molecules whose properties can be predicted with high accuracy show high translation metric scores in caption generation by LLMs, while the others show low scores. We then use the predicted properties to select among the captions generated by different types of LLMs, and use the selected caption as the final output. Measured on translation metrics and property metrics, our submission achieved an overall score increase over the baseline of 15.21 on the dev set and 12.30 on the evaluation set.
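The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the selection rule, so the rule used here (pick the candidate caption that mentions the most predicted properties) is an assumption made purely for illustration, and the names `select_caption`, `candidates`, and `predicted_properties` are hypothetical rather than taken from the paper.

```python
from typing import Dict, List


def select_caption(candidates: Dict[str, str], predicted_properties: List[str]) -> str:
    """Return the name of the model whose caption covers the most predicted properties.

    Assumption: coverage is a case-insensitive substring match. This is an
    illustrative rule, not the selection rule used in the paper.
    """
    def coverage(caption: str) -> int:
        text = caption.lower()
        return sum(1 for prop in predicted_properties if prop.lower() in text)

    # On ties, max() keeps the first candidate in dict insertion order.
    return max(candidates, key=lambda name: coverage(candidates[name]))


# Hypothetical captions from two different LLMs for the same molecule.
candidates = {
    "model_a": "The molecule is an anti-inflammatory agent.",
    "model_b": "The molecule is a polymer used in batteries.",
}
# Hypothetical output of the property classifier.
predicted_properties = ["polymer", "batteries"]

chosen = select_caption(candidates, predicted_properties)
final_caption = candidates[chosen]
```

In this toy case the caption from `model_b` is selected because it mentions both predicted properties. A real system would likely need a more robust match than substring search.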
Anthology ID:
2024.langmol-1.10
Volume:
Proceedings of the 1st Workshop on Language + Molecules (L+M 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Carl Edwards, Qingyun Wang, Manling Li, Lawrence Zhao, Tom Hope, Heng Ji
Venues:
LangMol | WS
Publisher:
Association for Computational Linguistics
Pages:
85–90
URL:
https://aclanthology.org/2024.langmol-1.10
Cite (ACL):
Shinnosuke Tanaka, Carol Mak, Flaviu Cipcigan, James Barry, Mohab Elkaref, Movina Moses, Vishnudev Kuruvanthodi, and Geeth Mel. 2024. NLPeople at L+M-24 Shared Task: An Ensembled Approach for Molecule Captioning from SMILES. In Proceedings of the 1st Workshop on Language + Molecules (L+M 2024), pages 85–90, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
NLPeople at L+M-24 Shared Task: An Ensembled Approach for Molecule Captioning from SMILES (Tanaka et al., LangMol-WS 2024)
PDF:
https://aclanthology.org/2024.langmol-1.10.pdf