Andreas Stolcke

Also published as: A. Stolcke

2024

Provenance: A Light-weight Fact-checker for Retrieval Augmented LLM Generation Output
Hithesh Sankararaman | Mohammed Nasheed Yasin | Tanner Sorensen | Alessandro Di Bari | Andreas Stolcke
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

We present a light-weight approach for detecting nonfactual outputs from retrieval-augmented generation (RAG). Given a context and a putative output, we compute a factuality score that can be thresholded to yield a binary decision to check the results of LLM-based question-answering, summarization, or other systems. Unlike factuality checkers that themselves rely on LLMs, we use compact, open-source natural language inference (NLI) models that yield a freely accessible solution with low latency and low cost at run-time, and no need for LLM fine-tuning. The approach also enables downstream mitigation and correction of hallucinations, by tracing them back to specific context chunks. Our experiments show high ROC-AUC across a wide range of relevant open-source datasets, indicating the effectiveness of our method for fact-checking RAG output.
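As a rough illustration of the NLI-based scoring described above, here is a minimal Python sketch, assuming a compact open-source cross-encoder NLI model from the Hugging Face hub; the model name, the max-over-chunks aggregation, and the 0.5 threshold are illustrative assumptions, not the paper's exact recipe.

# Hedged sketch: NLI-based factuality scoring of RAG output.
# ASSUMPTIONS: model choice, max-over-chunks aggregation, and the
# 0.5 threshold are illustrative, not the paper's exact recipe.
from transformers import pipeline

nli = pipeline("text-classification",
               model="cross-encoder/nli-deberta-v3-small")

def factuality_score(context_chunks, answer):
    """Return the best entailment probability over all context chunks,
    plus the supporting chunk (so hallucinations can be traced back)."""
    best_score, best_chunk = 0.0, None
    for chunk in context_chunks:
        # NLI convention: premise = retrieved chunk, hypothesis = answer
        scores = nli({"text": chunk, "text_pair": answer}, top_k=None)
        entail = next(s["score"] for s in scores
                      if s["label"] == "entailment")
        if entail > best_score:
            best_score, best_chunk = entail, chunk
    return best_score, best_chunk

chunks = ["The Eiffel Tower is 330 metres tall.",
          "It was completed in 1889."]
score, support = factuality_score(chunks, "The Eiffel Tower is 330 m tall.")
is_factual = score >= 0.5  # in practice, tune the threshold on held-out data

Thresholding the maximum entailment score gives the binary accept/reject decision, and returning the best-supporting chunk mirrors the traceability the abstract highlights.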

2022

CUE Vectors: Modular Training of Language Models Conditioned on Diverse Contextual Signals
Scott Novotney | Sreeparna Mukherjee | Zeeshan Ahmed | Andreas Stolcke
Findings of the Association for Computational Linguistics: ACL 2022

We propose a framework to modularize the training of neural language models that use diverse forms of context by eliminating the need to jointly train context and within-sentence encoders. Our approach, contextual universal embeddings (CUE), trains LMs on one type of contextual data and adapts to novel context types. The model consists of a pretrained neural sentence LM, a BERT-based contextual encoder, and a masked transformer decoder that estimates LM probabilities using sentence-internal and contextual evidence. When contextually annotated data is unavailable, our model learns to combine contextual and sentence-internal information using noisy oracle unigram embeddings as a proxy. Real context data can be introduced later and used to adapt a small number of parameters that map contextual data into the decoder’s embedding space. We validate the CUE framework on a NYTimes text corpus with multiple metadata types, for which the LM perplexity can be lowered from 36.6 to 27.4 by conditioning on context. Bootstrapping a contextual LM with only a subset of the metadata during training retains 85% of the achievable gain. Training the model initially with proxy context retains 67% of the perplexity gain after adapting to real context. Furthermore, we can swap one type of pretrained sentence LM for another without retraining the context encoders, by only adapting the decoder model. Overall, we obtain a modular framework that allows incremental, scalable training of context-enhanced LMs.
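A minimal PyTorch sketch of the modular idea follows; for brevity it fuses contextual and sentence-internal evidence by concatenation rather than the paper's masked transformer decoder, and all class names and dimensions are illustrative assumptions.

# Hedged sketch of CUE-style modularity: a frozen pretrained sentence LM
# and a frozen context encoder feed a small decoder; adapting to a new
# context type only requires retraining the ctx_proj mapping.
# ASSUMPTION: concatenation fusion stands in for the paper's masked
# transformer decoder; names and dimensions are illustrative.
import torch
import torch.nn as nn

class CueDecoder(nn.Module):
    def __init__(self, lm_dim=512, ctx_dim=768, vocab_size=32000):
        super().__init__()
        self.ctx_proj = nn.Linear(ctx_dim, lm_dim)  # small adaptable mapping
        self.out = nn.Linear(2 * lm_dim, vocab_size)

    def forward(self, lm_hidden, ctx_embedding):
        # lm_hidden: (batch, seq, lm_dim) from a pretrained sentence LM
        # ctx_embedding: (batch, ctx_dim) from e.g. a BERT context encoder
        ctx = self.ctx_proj(ctx_embedding).unsqueeze(1)
        ctx = ctx.expand(-1, lm_hidden.size(1), -1)
        return self.out(torch.cat([lm_hidden, ctx], dim=-1))  # next-word logits

decoder = CueDecoder()
logits = decoder(torch.randn(2, 10, 512), torch.randn(2, 768))

Because only ctx_proj maps context into the decoder's embedding space, introducing a new context type, or swapping the underlying sentence LM, touches a small number of parameters, which is the modularity the abstract claims.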

2021

Attention-based Contextual Language Model Adaptation for Speech Recognition
Richard Diehl Martinez | Scott Novotney | Ivan Bulyko | Ariya Rastrow | Andreas Stolcke | Ankur Gandhe
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2018

Session-level Language Modeling for Conversational Speech
Wayne Xiong | Lingfeng Wu | Jun Zhang | Andreas Stolcke
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We propose to generalize language models for conversational speech recognition to allow them to operate across utterance boundaries and speaker changes, thereby capturing conversation-level phenomena such as adjacency pairs, lexical entrainment, and topical coherence. The model consists of a long short-term memory (LSTM) recurrent network that reads the entire word-level history of a conversation, as well as information about turn taking and speaker overlap, in order to predict each next word. The model is applied in a rescoring framework, where the word history prior to the current utterance is approximated with preliminary recognition results. In experiments in the conversational telephone speech domain (Switchboard), we find that such a model gives substantial perplexity reductions over a standard LSTM-LM with utterance scope, as well as improvements in word error rate.
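A hedged PyTorch sketch of the session-scope idea: the LSTM state is carried across utterance boundaries so predictions can draw on the entire conversation, with special tokens standing in for the paper's turn-taking and speaker-overlap information; the token inventory and dimensions are assumptions.

# Hedged sketch: an LSTM LM whose recurrent state persists across
# utterances, giving it conversation-level scope. ASSUMPTIONS: special
# tokens for speaker change/overlap and all dimensions are illustrative.
import torch
import torch.nn as nn

class SessionLM(nn.Module):
    def __init__(self, vocab_size=10000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, tokens, state=None):
        h, state = self.lstm(self.embed(tokens), state)
        return self.out(h), state  # carry state into the next utterance

lm = SessionLM()
state = None
# In rescoring, the history comes from preliminary recognition results;
# the carried state then conditions the score of each hypothesis for
# the current utterance.
for utt in [torch.tensor([[1, 7, 42]]), torch.tensor([[2, 99, 3]])]:
    logits, state = lm(utt, state)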

2013

Using Out-of-Domain Data for Lexical Addressee Detection in Human-Human-Computer Dialog
Heeyoung Lee | Andreas Stolcke | Elizabeth Shriberg
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

A Cross-language Study on Automatic Speech Disfluency Detection
Wen Wang | Andreas Stolcke | Jiahong Yuan | Mark Liberman
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2007

Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages
Mathias Creutz | Teemu Hirsimäki | Mikko Kurimo | Antti Puurula | Janne Pylkkönen | Vesa Siivola | Matti Varjokallio | Ebru Arisoy | Murat Saraçlar | Andreas Stolcke
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

2005

Using Conditional Random Fields for Sentence Boundary Detection in Speech
Yang Liu | Andreas Stolcke | Elizabeth Shriberg | Mary Harper
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

Comparing and Combining Generative and Posterior Probability Models: Some Advances in Sentence Boundary Detection in Speech
Yang Liu | Andreas Stolcke | Elizabeth Shriberg | Mary Harper
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

Improving Automatic Sentence Boundary Detection with Confusion Networks
D. Hillard | M. Ostendorf | A. Stolcke | Y. Liu | E. Shriberg
Proceedings of HLT-NAACL 2004: Short Papers

2003

Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures
Ivan Bulyko | Mari Ostendorf | Andreas Stolcke
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers

2001

Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation
G. Tur | D. Hakkani-Tur | A. Stolcke | E. Shriberg
Computational Linguistics, Volume 27, Number 1, March 2001

The Meeting Project at ICSI
Nelson Morgan | Don Baron | Jane Edwards | Dan Ellis | David Gelbart | Adam Janin | Thilo Pfau | Elizabeth Shriberg | Andreas Stolcke
Proceedings of the First International Conference on Human Language Technology Research

2000

Dialogue act modeling for automatic tagging and recognition of conversational speech
Andreas Stolcke | Klaus Ries | Noah Coccaro | Elizabeth Shriberg | Rebecca Bates | Daniel Jurafsky | Paul Taylor | Rachel Martin | Carol Van Ess-Dykema | Marie Meteer
Computational Linguistics, Volume 26, Number 3, September 2000

1995

Partitioning Grammars and Composing Parsers
Fuliang Weng | Andreas Stolcke
Proceedings of the Fourth International Workshop on Parsing Technologies

An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
Andreas Stolcke
Computational Linguistics, Volume 21, Number 2, June 1995

1994

Precise N-Gram Probabilities From Stochastic Context-Free Grammars
Andreas Stolcke | Jonathan Segal
32nd Annual Meeting of the Association for Computational Linguistics

1990

Gapping and Frame Semantics: A fresh look from a cognitive perspective
Andreas Stolcke
COLING 1990 Volume 2: Papers presented to the 13th International Conference on Computational Linguistics