Jie Ma


2022

pdf bib
Label Semantics for Few Shot Named Entity Recognition
Jie Ma | Miguel Ballesteros | Srikanth Doss | Rishita Anubhai | Sunil Mallya | Yaser Al-Onaizan | Dan Roth
Findings of the Association for Computational Linguistics: ACL 2022

We study the problem of few shot learning for named entity recognition. Specifically, we leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors. We propose a neural architecture that consists of two BERT encoders, one to encode the document and its tokens and another one to encode each of the labels in natural language format. Our model learns to match the representations of named entities computed by the first encoder with label representations computed by the second encoder. The label semantics signal is shown to support improved state-of-the-art results in multiple few shot NER benchmarks and on-par performance in standard benchmarks. Our model is especially effective in low resource settings.

2020

pdf bib
Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events
Miguel Ballesteros | Rishita Anubhai | Shuai Wang | Nima Pourdamghani | Yogarshi Vyas | Jie Ma | Parminder Bhatia | Kathleen McKeown | Yaser Al-Onaizan
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations. Our proposed models receive a pair of events within a span of text as input and they identify temporal relations (Before, After, Equal, Vague) between them. Given that a key challenge with this task is the scarcity of annotated data, our models rely on either pretrained representations (i.e. RoBERTa, BERT or ELMo), transfer and multi-task learning (by leveraging complementary datasets), and self-training techniques. Experiments on the MATRES dataset of English documents establish a new state-of-the-art on this task.

pdf bib
To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging
Kasturi Bhattacharjee | Miguel Ballesteros | Rishita Anubhai | Smaranda Muresan | Jie Ma | Faisal Ladhak | Yaser Al-Onaizan
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Leveraging large amounts of unlabeled data using Transformer-like architectures, like BERT, has gained popularity in recent times owing to their effectiveness in learning general representations that can then be further fine-tuned for downstream tasks to much success. However, training these models can be costly both from an economic and environmental standpoint. In this work, we investigate how to effectively use unlabeled data: by exploring the task-specific semi-supervised approach, Cross-View Training (CVT) and comparing it with task-agnostic BERT in multiple settings that include domain and task relevant English data. CVT uses a much lighter model architecture and we show that it achieves similar performance to BERT on a set of sequence tagging tasks, with lesser financial and environmental impact.

pdf bib
When and Who? Conversation Transition Based on Bot-Agent Symbiosis Learning Network
Yipeng Yu | Ran Guan | Jie Ma | Zhuoxuan Jiang | Jingchang Huang
Proceedings of the 28th International Conference on Computational Linguistics

In online customer service applications, multiple chatbots that are specialized in various topics are typically developed separately and are then merged with other human agents to a single platform, presenting to the users with a unified interface. Ideally the conversation can be transparently transferred between different sources of customer support so that domain-specific questions can be answered timely and this is what we coined as a Bot-Agent symbiosis. Conversation transition is a major challenge in such online customer service and our work formalises the challenge as two core problems, namely, when to transfer and which bot or agent to transfer to and introduces a deep neural networks based approach that addresses these problems. Inspired by the net promoter score (NPS), our research reveals how the problems can be effectively solved by providing user feedback and developing deep neural networks that predict the conversation category distribution and the NPS of the dialogues. Experiments on realistic data generated from an online service support platform demonstrate that the proposed approach outperforms state-of-the-art methods and shows promising perspective for transparent conversation transition.

pdf bib
Resource-Enhanced Neural Model for Event Argument Extraction
Jie Ma | Shuai Wang | Rishita Anubhai | Miguel Ballesteros | Yaser Al-Onaizan
Findings of the Association for Computational Linguistics: EMNLP 2020

Event argument extraction (EAE) aims to identify the arguments of an event and classify the roles that those arguments play. Despite great efforts made in prior work, there remain many challenges: (1) Data scarcity. (2) Capturing the long-range dependency, specifically, the connection between an event trigger and a distant event argument. (3) Integrating event trigger information into candidate argument representation. For (1), we explore using unlabeled data. For (2), we use Transformer that uses dependency parses to guide the attention mechanism. For (3), we propose a trigger-aware sequence encoder with several types of trigger-dependent sequence representations. We also support argument extraction either from text annotated with gold entities or from plain text. Experiments on the English ACE 2005 benchmark show that our approach achieves a new state-of-the-art.

2019

pdf bib
Towards End-to-End Learning for Efficient Dialogue Agent by Modeling Looking-ahead Ability
Zhuoxuan Jiang | Xian-Ling Mao | Ziming Huang | Jie Ma | Shaochun Li
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Learning an efficient manager of dialogue agent from data with little manual intervention is important, especially for goal-oriented dialogues. However, existing methods either take too many manual efforts (e.g. reinforcement learning methods) or cannot guarantee the dialogue efficiency (e.g. sequence-to-sequence methods). In this paper, we address this problem by proposing a novel end-to-end learning model to train a dialogue agent that can look ahead for several future turns and generate an optimal response to make the dialogue efficient. Our method is data-driven and does not require too much manual work for intervention during system design. We evaluate our method on two datasets of different scenarios and the experimental results demonstrate the efficiency of our model.