Yee Seng Chan


2022

Taxonomy Builder: a Data-driven and User-centric Tool for Streamlining Taxonomy Construction
Mihai Surdeanu | John Hungerford | Yee Seng Chan | Jessica MacBride | Benjamin Gyori | Andrew Zupon | Zheng Tang | Haoling Qiu | Bonan Min | Yan Zverev | Caitlin Hilverman | Max Thomas | Walter Andrews | Keith Alcock | Zeyu Zhang | Michael Reynolds | Steven Bethard | Rebecca Sharp | Egoitz Laparra
Proceedings of the Second Workshop on Bridging Human–Computer Interaction and Natural Language Processing

An existing domain taxonomy for normalizing content is often assumed when discussing approaches to information extraction, yet often in real-world scenarios there is none. When one does exist, as the information needs shift, it must be continually extended. This is a slow and tedious task, and one which does not scale well. Here we propose an interactive tool that allows a taxonomy to be built or extended rapidly and with a human in the loop to control precision. We apply insights from text summarization and information extraction to reduce the search space dramatically, then leverage modern pretrained language models to perform contextualized clustering of the remaining concepts to yield candidate nodes for the user to review. We show this allows a user to consider as many as 200 taxonomy concept candidates an hour, to quickly build or extend a taxonomy to better fit information needs.

2020

Towards Few-Shot Event Mention Retrieval: An Evaluation Framework and A Siamese Network Approach
Bonan Min | Yee Seng Chan | Lingjun Zhao
Proceedings of the Twelfth Language Resources and Evaluation Conference

Automatically analyzing events in a large amount of text is crucial for situation awareness and decision making. Previous approaches treat event extraction as “one size fits all” with an ontology defined a priori. The resulting extraction models are built just for extracting those types in the ontology. These approaches cannot be easily adapted to new event types or new domains of interest. To accommodate personalized event-centric information needs, this paper introduces the few-shot Event Mention Retrieval (EMR) task: given a user-supplied query consisting of a handful of event mentions, return relevant event mentions found in a corpus. This formulation enables “query by example”, which drastically lowers the bar of specifying event-centric information needs. The retrieval setting also enables fuzzy search. We present an evaluation framework leveraging existing event datasets such as ACE. We also develop a Siamese Network approach, and show that it performs better than ad-hoc retrieval models in the few-shot EMR setting.

2019

Towards Machine Reading for Interventions from Humanitarian-Assistance Program Literature
Bonan Min | Yee Seng Chan | Haoling Qiu | Joshua Fasching
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Solving long-lasting problems such as food insecurity requires a comprehensive understanding of interventions applied by governments and international humanitarian assistance organizations, and their results and consequences. Towards achieving this grand goal, a crucial first step is to extract past interventions and when and where they have been applied, from hundreds of thousands of reports automatically. In this paper, we developed a corpus annotated with interventions to foster research, and developed an information extraction system for extracting interventions and their location and time from text. We demonstrate early, very encouraging results on extracting interventions.

Rapid Customization for Event Extraction
Yee Seng Chan | Joshua Fasching | Haoling Qiu | Bonan Min
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Extracting events from text, in the form of who is involved in what, when, and where, is one of the core information extraction tasks, with many applications such as web search and question answering. We present a system for rapidly customizing event extraction capability to find new event types (what happened) and their arguments (who, when, and where). To enable extracting events of new types, we develop a novel approach to allow a user to find, expand and filter event triggers by exploring an unannotated development corpus. The system will then generate mention-level event annotation automatically and train a neural network model for finding the corresponding events. To enable extracting arguments for new event types, the system makes novel use of the ACE annotation dataset to train a generic argument attachment model for extracting Actor, Place, and Time. We demonstrate that with less than 10 minutes of human effort per event type, the system achieves good performance for 67 novel event types. Experiments also show that the generic argument attachment model performs well on the novel event types. Our system (code, UI, documentation, demonstration video) is released as open source.

2011

Minimally Supervised Event Causality Identification
Quang Do | Yee Seng Chan | Dan Roth
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Exploiting Syntactico-Semantic Structures for Relation Extraction
Yee Seng Chan | Dan Roth
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Exploiting Background Knowledge for Relation Extraction
Yee Seng Chan | Dan Roth
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2008

Decomposability of Translation Metrics for Improved Evaluation and Efficient Algorithms
David Chiang | Steve DeNeefe | Yee Seng Chan | Hwee Tou Ng
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

Word Sense Disambiguation Using OntoNotes: An Empirical Study
Zhi Zhong | Hwee Tou Ng | Yee Seng Chan
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

MAXSIM: A Maximum Similarity Metric for Machine Translation Evaluation
Yee Seng Chan | Hwee Tou Ng
Proceedings of ACL-08: HLT

2007

Word Sense Disambiguation Improves Statistical Machine Translation
Yee Seng Chan | Hwee Tou Ng | David Chiang
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

Domain Adaptation with Active Learning for Word Sense Disambiguation
Yee Seng Chan | Hwee Tou Ng
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

SemEval-2007 Task 11: English Lexical Sample Task via English-Chinese Parallel Text
Hwee Tou Ng | Yee Seng Chan
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

NUS-PT: Exploiting Parallel Texts for Word Sense Disambiguation in the English All-Words Tasks
Yee Seng Chan | Hwee Tou Ng | Zhi Zhong
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

Estimating Class Priors in Domain Adaptation for Word Sense Disambiguation
Yee Seng Chan | Hwee Tou Ng
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2003

Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study
Hwee Tou Ng | Bin Wang | Yee Seng Chan
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics