Leveraging Language-based Representations for Better Solving Symbol-related Problems with Large Language Models

Yile Wang, Sijie Cheng, Zixin Sun, Peng Li, Yang Liu


Abstract
Symbols such as numerical sequences, chemical formulas, and table delimiters exist widely, playing important roles in symbol-related tasks such as abstract reasoning, chemical property prediction, and tabular question answering. Compared with tasks based on natural language expressions, large language models (LLMs) have limitations in understanding and reasoning over symbol-based representations, making it difficult for them to handle symbol-related problems. In this paper, we propose symbol-to-language (S2L), a method that converts symbol-based representations into language-based representations, providing valuable information for language models during reasoning. We find that, for both closed-source and open-source LLMs, the ability to solve symbol-related problems can be largely enhanced by incorporating such language-based representations. For example, applying S2L to GPT-4 yields substantial accuracy improvements of +21.9% and +9.5% on the 1D-ARC and Dyck language tasks, respectively. There are also consistent improvements on six other general symbol-related tasks such as table understanding and Tweet analysis. We release the GPT logs at https://github.com/THUNLP-MT/symbol2language.
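The core S2L idea described in the abstract can be illustrated with a minimal sketch: a symbol-based input (here, a 1D-ARC-style digit sequence) is converted into a plain-English description that can accompany the original symbols in an LLM prompt. All function names and the description format below are hypothetical illustrations, not taken from the paper.

```python
# Hypothetical sketch of symbol-to-language (S2L) conversion:
# describe a digit sequence as runs of repeated values in English,
# then pair the description with the raw symbols in a prompt.
from itertools import groupby

def symbol_to_language(sequence):
    """Describe a digit sequence as runs of repeated values."""
    parts = []
    for value, run in groupby(sequence):
        count = len(list(run))
        parts.append(f"{count} cell(s) of value {value}")
    return "The sequence contains, in order: " + ", ".join(parts) + "."

def build_prompt(sequence):
    """Combine the symbolic form and its language-based description."""
    symbols = " ".join(str(x) for x in sequence)
    description = symbol_to_language(sequence)
    return f"Input (symbols): {symbols}\nInput (language): {description}"

print(build_prompt([0, 0, 3, 3, 3, 0]))
```

The combined prompt gives the LLM both the raw symbols and a verbal account of their structure, which is the kind of language-based representation the paper argues helps symbol-related reasoning.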
Anthology ID:
2025.coling-main.372
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
5544–5557
URL:
https://aclanthology.org/2025.coling-main.372/
Cite (ACL):
Yile Wang, Sijie Cheng, Zixin Sun, Peng Li, and Yang Liu. 2025. Leveraging Language-based Representations for Better Solving Symbol-related Problems with Large Language Models. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5544–5557, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Leveraging Language-based Representations for Better Solving Symbol-related Problems with Large Language Models (Wang et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.372.pdf