Roman Teucher


2023

pdf bib
CarExpert: Leveraging Large Language Models for In-Car Conversational Question Answering
Md Rashad Al Hasan Rony | Christian Suess | Sinchana Ramakanth Bhat | Viju Sudhi | Julia Schneider | Maximilian Vogel | Roman Teucher | Ken Friedl | Soumya Sahoo
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track

Large language models (LLMs) have demonstrated remarkable performance by following natural language instructions without fine-tuning them on domain-specific tasks and data. However, leveraging LLMs for domain-specific question answering suffers from severe limitations. The generated answer tends to hallucinate due to the training data collection time (when using off-the-shelf), complex user utterance and wrong retrieval (in retrieval-augmented generation). Furthermore, due to the lack of awareness about the domain and expected output, such LLMs may generate unexpected and unsafe answers that are not tailored to the target domain. In this paper, we propose CarExpert, an in-car retrieval-augmented conversational question-answering system leveraging LLMs for different tasks. Specifically, CarExpert employs LLMs to control the input, provide domain-specific documents to the extractive and generative answering components, and controls the output to ensure safe and domain-specific answers. A comprehensive empirical evaluation exhibits that CarExpert outperforms state-of-the-art LLMs in generating natural, safe and car-specific answers.

2022

pdf bib
TextGraphs-16 Natural Language Premise Selection Task: Zero-Shot Premise Selection with Prompting Generative Language Models
Liubov Kovriguina | Roman Teucher | Robert Wardenga
Proceedings of TextGraphs-16: Graph-based Methods for Natural Language Processing

Automated theorem proving can benefit a lot from methods employed in natural language processing, knowledge graphs and information retrieval: this non-trivial task combines formal languages understanding, reasoning, similarity search. We tackle this task by enhancing semantic similarity ranking with prompt engineering, which has become a new paradigm in natural language understanding. None of our approaches requires additional training. Despite encouraging results reported by prompt engineering approaches for a range of NLP tasks, for the premise selection task vanilla re-ranking by prompting GPT-3 doesn’t outperform semantic similarity ranking with SBERT, but merging of the both rankings shows better results.