A Semantic Search Engine for Mathlib4

Guoxiong Gao, Haocheng Ju, Jiedong Jiang, Zihan Qin, Bin Dong


Abstract
The interactive theorem prover Lean enables the verification of formal mathematical proofs and is backed by a growing community. Central to this ecosystem is its mathematical library, mathlib4, which lays the groundwork for formalizing an expanding range of mathematical theories. However, searching for theorems in mathlib4 can be challenging: to search successfully, users often need to be familiar with its naming conventions or documentation strings. A semantic search engine that is easy to use for individuals with varying familiarity with mathlib4 is therefore needed. In this paper, we present a semantic search engine for mathlib4 that accepts informal queries and finds the relevant theorems. We also establish a benchmark for assessing the performance of various search engines for mathlib4.
Anthology ID:
2024.findings-emnlp.470
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
8001–8013
URL:
https://aclanthology.org/2024.findings-emnlp.470
Cite (ACL):
Guoxiong Gao, Haocheng Ju, Jiedong Jiang, Zihan Qin, and Bin Dong. 2024. A Semantic Search Engine for Mathlib4. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 8001–8013, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
A Semantic Search Engine for Mathlib4 (Gao et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.470.pdf
Data:
 2024.findings-emnlp.470.data.zip