Mariefel Olarte
2023
Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design
Henry Sprueill | Carl Edwards | Mariefel Olarte | Udishnu Sanyal | Heng Ji | Sutanay Choudhury
Findings of the Association for Computational Linguistics: EMNLP 2023
Discovering novel catalysts requires complex reasoning involving multiple chemical properties and the resulting trade-offs, which leads to combinatorial growth in the search space. While large language models (LLMs) have demonstrated novel capabilities for chemistry through complex instruction following and high-quality reasoning, goal-driven combinatorial search using LLMs has not been explored in detail. In this work, we present a Monte Carlo Tree Search-based approach that improves on state-of-the-art chain-of-thought prompting variants to augment scientific reasoning. We introduce two new reasoning datasets: 1) a curation of computational chemistry simulations, and 2) diverse questions written by catalysis researchers for reasoning about novel chemical conversion processes. We improve over the best baseline by 25.8% and find that our approach can augment scientists’ reasoning and discovery process with novel insights.