W. Bruce Croft

Also published as: Bruce Croft


2021

pdf bib
AREDSUM: Adaptive Redundancy-Aware Iterative Sentence Ranking for Extractive Document Summarization
Keping Bi | Rahul Jha | Bruce Croft | Asli Celikyilmaz
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Redundancy-aware extractive summarization systems score the redundancy of the sentences to be included in a summary either jointly with their salience information or separately as an additional sentence scoring step. Previous work shows the efficacy of jointly scoring and selecting sentences with neural sequence generation models. It is, however, not well-understood if the gain is due to better encoding techniques or better redundancy reduction approaches. Similarly, the contribution of salience versus diversity components on the created summary is not studied well. Building on the state-of-the-art encoding methods for summarization, we present two adaptive learning models: AREDSUM-SEQ that jointly considers salience and novelty during sentence selection; and a two-step AREDSUM-CTX that scores salience first, then learns to balance salience and redundancy, enabling the measurement of the impact of each aspect. Empirical results on CNN/DailyMail and NYT50 datasets show that by modeling diversity explicitly in a separate step, AREDSUM-CTX achieves significantly better performance than AREDSUM-SEQ as well as state-of-the-art extractive summarization baselines.

2019

pdf bib
The Challenges of Optimizing Machine Translation for Low Resource Cross-Language Information Retrieval
Constantine Lignos | Daniel Cohen | Yen-Chieh Lien | Pratik Mehta | W. Bruce Croft | Scott Miller
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

When performing cross-language information retrieval (CLIR) for lower-resourced languages, a common approach is to retrieve over the output of machine translation (MT). However, there is no established guidance on how to optimize the resulting MT-IR system. In this paper, we examine the relationship between the performance of MT systems and both neural and term frequency-based IR models to identify how CLIR performance can be best predicted from MT quality. We explore performance at varying amounts of MT training data, byte pair encoding (BPE) merge operations, and across two IR collections and retrieval models. We find that the choice of IR collection can substantially affect the predictive power of MT tuning decisions and evaluation, potentially introducing dissociations between MT-only and overall CLIR performance.

2018

pdf bib
Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce
Minghui Qiu | Liu Yang | Feng Ji | Wei Zhou | Jun Huang | Haiqing Chen | Bruce Croft | Wei Lin
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Building multi-turn information-seeking conversation systems is an important and challenging research topic. Although several advanced neural text matching models have been proposed for this task, they are generally not efficient for industrial applications. Furthermore, they rely on a large amount of labeled data, which may not be available in real-world applications. To alleviate these problems, we study transfer learning for multi-turn information seeking conversations in this paper. We first propose an efficient and effective multi-turn conversation model based on convolutional neural networks. After that, we extend our model to adapt the knowledge learned from a resource-rich domain to enhance the performance. Finally, we deployed our model in an industrial chatbot called AliMe Assist and observed a significant improvement over the existing online model.

2013

pdf bib
Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
K. Tamsin Maxwell | Jon Oberlander | W. Bruce Croft
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2011

pdf bib
Joint Annotation of Search Queries
Michael Bendersky | W. Bruce Croft | David A. Smith
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2005

pdf bib
A Translation Model for Sentence Retrieval
Vanessa Murdock | W. Bruce Croft
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2001

pdf bib
Evaluating Question-Answering Techniques in Chinese
Xiaoyan Li | W. Bruce Croft
Proceedings of the First International Conference on Human Language Technology Research

1996

pdf bib
Chinese Information Extraction and Retrieval
Sean Boisen | Michael Crystal | Erik Peterson | Ralph Weischedel | John Broglio | Jamie Callan | Bruce Croft | Theresa Hand | Thomas Keenan | Mary Ellen Okurowski
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996

1993

pdf bib
INQUERY System Overview
John Broglio | James P. Callan | W. Bruce Croft
TIPSTER TEXT PROGRAM: PHASE I: Proceedings of a Workshop held at Fredricksburg, Virginia, September 19-23, 1993

pdf bib
Query Processing for Retrieval From Large Text Bases
John Broglio | W. Bruce Croft
Human Language Technology: Proceedings of a Workshop Held at Plainsboro, New Jersey, March 21-24, 1993

pdf bib
Text Retrieval and Routing Techniques Based on an Inference Net
W. Bruce Croft
Human Language Technology: Proceedings of a Workshop Held at Plainsboro, New Jersey, March 21-24, 1993