Jim Glass


2018

Learning Word Representations with Cross-Sentence Dependency for End-to-End Co-reference Resolution
Hongyin Luo | Jim Glass
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In this work, we present a word embedding model that learns cross-sentence dependency for improving end-to-end co-reference resolution (E2E-CR). While the traditional E2E-CR model generates word representations by running long short-term memory (LSTM) recurrent neural networks on each sentence of an input article or conversation separately, we propose linear sentence linking and attentional sentence linking models to learn cross-sentence dependency. Both sentence linking strategies enable the LSTMs to make use of valuable information from context sentences while calculating the representation of the current input word. With this approach, the LSTMs learn word embeddings considering knowledge not only from the current sentence but also from the entire input document. Experiments show that learning cross-sentence dependency enriches the information contained in the word representations and improves the performance of the co-reference resolution model compared with our baseline.

2016

SemEval-2016 Task 3: Community Question Answering
Preslav Nakov | Lluís Màrquez | Alessandro Moschitti | Walid Magdy | Hamdy Mubarak | Abed Alhakim Freihat | Jim Glass | Bilal Randeree
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering
Mitra Mohtarami | Yonatan Belinkov | Wei-Ning Hsu | Yu Zhang | Tao Lei | Kfir Bar | Scott Cyphers | Jim Glass
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

SemEval-2015 Task 3: Answer Selection in Community Question Answering
Preslav Nakov | Lluís Màrquez | Walid Magdy | Alessandro Moschitti | Jim Glass | Bilal Randeree
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

A Vector Space Approach for Aspect Based Sentiment Analysis
Abdulaziz Alghunaim | Mitra Mohtarami | Scott Cyphers | Jim Glass
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing

2014

A Study of using Syntactic and Semantic Structures for Concept Segmentation and Labeling
Iman Saleh | Scott Cyphers | Jim Glass | Shafiq Joty | Lluís Màrquez | Alessandro Moschitti | Preslav Nakov
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Probabilistic Dialogue Modeling for Speech-Enabled Assistive Technology
William Li | Jim Glass | Nicholas Roy | Seth Teller
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies

2010

Collecting Voices from the Cloud
Ian McGraw | Chia-ying Lee | Lee Hetherington | Stephanie Seneff | Jim Glass
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The collection and transcription of speech data are typically expensive and time-consuming tasks. Voice over IP and cloud computing are poised to greatly reduce this impediment to research on spoken language interfaces in many domains. This paper documents our efforts to deploy speech-enabled web interfaces to large audiences over the Internet via Amazon Mechanical Turk, an online marketplace for work. Using the open source WAMI Toolkit, we collected corpora in two different domains that together constitute over 113 hours of speech. The first corpus contains 100,000 utterances of read speech, and was collected by asking workers to record street addresses in the United States. For the second task, we collected conversations with FlightBrowser, a multimodal spoken dialogue system. The FlightBrowser corpus obtained contains 10,651 utterances comprising 1,113 individual dialogue sessions from 101 distinct users. The aggregate time spent collecting the data for both corpora was just under two weeks. At times, our servers were logging audio from workers at rates faster than real-time. We describe the process of collection and transcription of these corpora while providing an analysis of the advantages and limitations of this data collection method.

2006

Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts
Chris Manning | Doug Oard | Jim Glass
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts