James Gung


pdf bib
NatCS: Eliciting Natural Customer Support Dialogues
James Gung | Emily Moeng | Wesley Rose | Arshit Gupta | Yi Zhang | Saab Mansour
Findings of the Association for Computational Linguistics: ACL 2023

Despite growing interest in applications based on natural customer support conversations,there exist remarkably few publicly available datasets that reflect the expected characteristics of conversations in these settings. Existing task-oriented dialogue datasets, which were collected to benchmark dialogue systems mainly in written human-to-bot settings, are not representative of real customer support conversations and do not provide realistic benchmarks for systems that are applied to natural data. To address this gap, we introduce NatCS, a multi-domain collection of spoken customer service conversations. We describe our process for collecting synthetic conversations between customers and agents based on natural language phenomena observed in real conversations. Compared to previous dialogue datasets, the conversations collected with our approach are more representative of real human-to-human conversations along multiple metrics. Finally, we demonstrate potential uses of NatCS, including dialogue act classification and intent induction from conversations as potential applications, showing that dialogue act annotations in NatCS provide more effective training data for modeling real conversations compared to existing synthetic written datasets. We publicly release NatCS to facilitate research in natural dialog systems


pdf bib
PropBank Comes of Age—Larger, Smarter, and more Diverse
Sameer Pradhan | Julia Bonn | Skatje Myers | Kathryn Conger | Tim O’gorman | James Gung | Kristin Wright-bettner | Martha Palmer
Proceedings of the 11th Joint Conference on Lexical and Computational Semantics

This paper describes the evolution of the PropBank approach to semantic role labeling over the last two decades. During this time the PropBank frame files have been expanded to include non-verbal predicates such as adjectives, prepositions and multi-word expressions. The number of domains, genres and languages that have been PropBanked has also expanded greatly, creating an opportunity for much more challenging and robust testing of the generalization capabilities of PropBank semantic role labeling systems. We also describe the substantial effort that has gone into ensuring the consistency and reliability of the various annotated datasets and resources, to better support the training and evaluation of such systems


pdf bib
Predicate Representations and Polysemy in VerbNet Semantic Parsing
James Gung | Martha Palmer
Proceedings of the 14th International Conference on Computational Semantics (IWCS)

Despite recent advances in semantic role labeling propelled by pre-trained text encoders like BERT, performance lags behind when applied to predicates observed infrequently during training or to sentences in new domains. In this work, we investigate how role labeling performance on low-frequency predicates and out-of-domain data can be further improved by using VerbNet, a verb lexicon that groups verbs into hierarchical classes based on shared syntactic and semantic behavior and defines semantic representations describing relations between arguments. We find that VerbNet classes provide an effective level of abstraction, improving generalization on low-frequency predicates by allowing them to learn from the training examples of other predicates belonging to the same class. We also find that joint training of VerbNet role labeling and predicate disambiguation of VerbNet classes for polysemous verbs leads to improvements in both tasks, naturally supporting the extraction of VerbNet’s semantic representations.

pdf bib
SemLink 2.0: Chasing Lexical Resources
Kevin Stowe | Jenette Preciado | Kathryn Conger | Susan Windisch Brown | Ghazaleh Kazeminejad | James Gung | Martha Palmer
Proceedings of the 14th International Conference on Computational Semantics (IWCS)

The SemLink resource provides mappings between a variety of lexical semantic ontologies, each with their strengths and weaknesses. To take advantage of these differences, the ability to move between resources is essential. This work describes advances made to improve the usability of the SemLink resource: the automatic addition of new instances and mappings, manual corrections, sense-based vectors and collocation information, and architecture built to automatically update the resource when versions of the underlying resources change. These updates improve coverage, provide new tools to leverage the capabilities of these resources, and facilitate seamless updates, ensuring the consistency and applicability of these mappings in the future.


pdf bib
VerbNet Representations: Subevent Semantics for Transfer Verbs
Susan Windisch Brown | Julia Bonn | James Gung | Annie Zaenen | James Pustejovsky | Martha Palmer
Proceedings of the First International Workshop on Designing Meaning Representations

This paper announces the release of a new version of the English lexical resource VerbNet with substantially revised semantic representations designed to facilitate computer planning and reasoning based on human language. We use the transfer of possession and transfer of information event representations to illustrate both the general framework of the representations and the types of nuances the new representations can capture. These representations use a Generative Lexicon-inspired subevent structure to track attributes of event participants across time, highlighting oppositions and temporal and causal relations among the subevents.


pdf bib
The New Propbank: Aligning Propbank with AMR through POS Unification
Tim O’Gorman | Sameer Pradhan | Martha Palmer | Julia Bonn | Katie Conger | James Gung
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)


pdf bib
Word Substitution in Short Answer Extraction: A WordNet-based Approach
Qingqing Cai | James Gung | Maochen Guan | Gerald Kurlandski | Adam Pease
Proceedings of the 8th Global WordNet Conference (GWC)

We describe the implementation of a short answer extraction system. It consists of a simple sentence selection front-end and a two phase approach to answer extraction from a sentence. In the first phase sentence classification is performed with a classifier trained with the passive aggressive algorithm utilizing the UIUC dataset and taxonomy and a feature set including word vectors. This phase outperforms the current best published results on that dataset. In the second phase, a sieve algorithm consisting of a series of increasingly general extraction rules is applied, using WordNet to find word types aligned with the UIUC classifications determined in the first phase. Some very preliminary performance metrics are presented.


pdf bib
CUAB: Supervised Learning of Disorders and their Attributes using Relations
James Gung | John Osborne | Steven Bethard
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)


pdf bib
Summarization of Historical Articles Using Temporal Event Clustering
James Gung | Jugal Kalita
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies