Automotive Document Labeling Using Large Language Models

Dang Van Thin; Cuong Xuan Chu; Christian Graf; Tobias Kaminski; Trung-Kien Tran

doi:10.18653/v1/2025.emnlp-industry.112

Abstract

Repairing and maintaining car parts are crucial tasks in the automotive industry, requiring a mechanic to have all relevant technical documents available. However, retrieving the right documents from a huge database depends heavily on domain expertise and is time-consuming and error-prone. By labeling available documents according to the components they relate to, concise and accurate information can be retrieved efficiently. This is a challenging task, as the relevance of a document to a particular component strongly depends on the context and the expertise of the domain specialist. Moreover, component terminology varies widely between manufacturers. We address these challenges by utilizing Large Language Models (LLMs) to enrich and unify a component database via web mining, extracting relevant keywords, and leveraging hybrid search and LLM-based re-ranking to select the most relevant component for a document. We systematically evaluate our method using various LLMs on an expert-annotated dataset and demonstrate that it outperforms the baselines, which rely solely on LLM prompting.
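
The retrieval step named in the abstract (hybrid search followed by LLM-based re-ranking) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy component database, the use of character trigram counts as a stand-in for dense embeddings, reciprocal rank fusion as the way to combine the two rankings, and the hypothetical `llm_pick` callable that would wrap an LLM call.

```python
from collections import Counter
import math

# Hypothetical component database: name -> enriched description (illustrative data only).
COMPONENTS = {
    "brake caliper": "brake caliper disc brake piston pads hydraulic",
    "alternator": "alternator generator charging battery voltage belt",
    "fuel injector": "fuel injector nozzle injection rail spray",
}

def tokens(text):
    return text.lower().split()

def keyword_score(query, text):
    """Lexical signal: number of distinct query tokens that occur in the text."""
    return len(set(tokens(query)) & set(tokens(text)))

def trigram_vector(text):
    """Stand-in for a dense embedding (assumption): character trigram counts."""
    s = f"  {text.lower()}  "
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank(scores):
    """Map each candidate to its 1-based rank under descending score (ties broken by name)."""
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}

def hybrid_search(query, components, k=60, top_n=2):
    """Fuse the lexical and vector rankings with reciprocal rank fusion."""
    lex = rank({n: keyword_score(query, d) for n, d in components.items()})
    qv = trigram_vector(query)
    vec = rank({n: cosine(qv, trigram_vector(d)) for n, d in components.items()})
    fused = {n: 1 / (k + lex[n]) + 1 / (k + vec[n]) for n in components}
    return sorted(fused, key=lambda n: (-fused[n], n))[:top_n]

def rerank(query, candidates, llm_pick=None):
    """Re-ranking hook: llm_pick is a hypothetical callable wrapping an LLM call."""
    if llm_pick is None:
        return candidates[0]  # fall back to the top fused candidate
    return llm_pick(query, candidates)

# Keywords as they might be extracted from a document (made-up example).
keywords = "brake pads piston hydraulic"
candidates = hybrid_search(keywords, COMPONENTS)
label = rerank(keywords, candidates)
```

In this sketch the fused ranking only narrows the candidate set; the final label is delegated to the re-ranker, so the two stages can be swapped or tuned independently.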

Anthology ID: 2025.emnlp-industry.112
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: November
Year: 2025
Address: Suzhou (China)
Editors: Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 1588–1595
URL: https://aclanthology.org/2025.emnlp-industry.112/
DOI: 10.18653/v1/2025.emnlp-industry.112
Cite (ACL): Dang Van Thin, Cuong Xuan Chu, Christian Graf, Tobias Kaminski, and Trung-Kien Tran. 2025. Automotive Document Labeling Using Large Language Models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 1588–1595, Suzhou (China). Association for Computational Linguistics.
Cite (Informal): Automotive Document Labeling Using Large Language Models (Van Thin et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-industry.112.pdf