Trung Bui

Also published as: Trung H. Bui


2021

pdf bib
KPQA: A Metric for Generative Question Answering Using Keyphrase Weights
Hwanhee Lee | Seunghyun Yoon | Franck Dernoncourt | Doo Soon Kim | Trung Bui | Joongbo Shin | Kyomin Jung
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

In the automatic evaluation of generative question answering (GenQA) systems, it is difficult to assess the correctness of generated answers due to the free-form of the answer. Especially, widely used n-gram similarity metrics often fail to discriminate the incorrect answers since they equally consider all of the tokens. To alleviate this problem, we propose KPQA metric, a new metric for evaluating the correctness of GenQA. Specifically, our new metric assigns different weights to each token via keyphrase prediction, thereby judging whether a generated answer sentence captures the key meaning of the reference answer. To evaluate our metric, we create high-quality human judgments of correctness on two GenQA datasets. Using our human-evaluation datasets, we show that our proposed metric has a significantly higher correlation with human judgments than existing metrics in various datasets. Code for KPQA-metric will be available at https://github.com/hwanheelee1993/KPQA.

pdf bib
A Context-Dependent Gated Module for Incorporating Symbolic Semantics into Event Coreference Resolution
Tuan Lai | Heng Ji | Trung Bui | Quan Hung Tran | Franck Dernoncourt | Walter Chang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Event coreference resolution is an important research problem with many applications. Despite the recent remarkable success of pre-trained language models, we argue that it is still highly beneficial to utilize symbolic features for the task. However, as the input for coreference resolution typically comes from upstream components in the information extraction pipeline, the automatically extracted symbolic features can be noisy and contain errors. Also, depending on the specific context, some features can be more informative than others. Motivated by these observations, we propose a novel context-dependent gated module to adaptively control the information flows from the input symbolic features. Combined with a simple noisy training method, our best models achieve state-of-the-art results on two datasets: ACE 2005 and KBP 2016.

pdf bib
X-METRA-ADA: Cross-lingual Meta-Transfer learning Adaptation to Natural Language Understanding and Question Answering
Meryem M’hamdi | Doo Soon Kim | Franck Dernoncourt | Trung Bui | Xiang Ren | Jonathan May
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Multilingual models, such as M-BERT and XLM-R, have gained increasing popularity, due to their zero-shot cross-lingual transfer learning capabilities. However, their generalization ability is still inconsistent for typologically diverse languages and across different benchmarks. Recently, meta-learning has garnered attention as a promising technique for enhancing transfer learning under low-resource scenarios: particularly for cross-lingual transfer in Natural Language Understanding (NLU). In this work, we propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for NLU. Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages. We extensively evaluate our framework on two challenging cross-lingual NLU tasks: multilingual task-oriented dialog and typologically diverse question answering. We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages. Our analysis reveals that X-METRA-ADA can leverage limited data for faster adaptation.

pdf bib
Open-Domain Question Answering with Pre-Constructed Question Spaces
Jinfeng Xiao | Lidan Wang | Franck Dernoncourt | Trung Bui | Tong Sun | Jiawei Han
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

Open-domain question answering aims at locating the answers to user-generated questions in massive collections of documents. Retriever-readers and knowledge graph approaches are two big families of solutions to this task. A retriever-reader first applies information retrieval techniques to locate a few passages that are likely to be relevant, and then feeds the retrieved text to a neural network reader to extract the answer. Alternatively, knowledge graphs can be constructed and queried to answer users’ questions. We propose an algorithm with a novel reader-retriever design that differs from both families. Our reader-retriever first uses an offline reader to read the corpus and generate collections of all answerable questions associated with their answers, and then uses an online retriever to respond to user queries by searching the pre-constructed question spaces for answers that are most likely to be asked in the given way. We further combine one retriever-reader and two reader-retrievers into a hybrid model called R6 for the best performance. Experiments with two large-scale public datasets show that R6 achieves state-of-the-art accuracy.

pdf bib
A Gradually Soft Multi-Task and Data-Augmented Approach to Medical Question Understanding
Khalil Mrini | Franck Dernoncourt | Seunghyun Yoon | Trung Bui | Walter Chang | Emilia Farcas | Ndapa Nakashole
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Users of medical question answering systems often submit long and detailed questions, making it hard to achieve high recall in answer retrieval. To alleviate this problem, we propose a novel Multi-Task Learning (MTL) method with data augmentation for medical question understanding. We first establish an equivalence between the tasks of question summarization and Recognizing Question Entailment (RQE) using their definitions in the medical domain. Based on this equivalence, we propose a data augmentation algorithm to use just one dataset to optimize for both tasks, with a weighted MTL loss. We introduce gradually soft parameter-sharing: a constraint for decoder parameters to be close, that is gradually loosened as we move to the highest layer. We show through ablation studies that our proposed novelties improve performance. Our method outperforms existing MTL methods across 4 datasets of medical question pairs, in ROUGE scores, RQE accuracy and human evaluation. Finally, we show that our method fares better than single-task learning under 4 low-resource settings.

pdf bib
UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning
Hwanhee Lee | Seunghyun Yoon | Franck Dernoncourt | Trung Bui | Kyomin Jung
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Despite the success of various text generation metrics such as BERTScore, it is still difficult to evaluate the image captions without enough reference captions due to the diversity of the descriptions. In this paper, we introduce a new metric UMIC, an Unreferenced Metric for Image Captioning which does not require reference captions to evaluate image captions. Based on Vision-and-Language BERT, we train UMIC to discriminate negative captions via contrastive learning. Also, we observe critical problems of the previous benchmark dataset (i.e., human annotations) on image captioning metric, and introduce a new collection of human annotations on the generated captions. We validate UMIC on four datasets, including our new dataset, and show that UMIC has a higher correlation than all previous metrics that require multiple references.

pdf bib
UCSD-Adobe at MEDIQA 2021: Transfer Learning and Answer Sentence Selection for Medical Summarization
Khalil Mrini | Franck Dernoncourt | Seunghyun Yoon | Trung Bui | Walter Chang | Emilia Farcas | Ndapa Nakashole
Proceedings of the 20th Workshop on Biomedical Language Processing

In this paper, we describe our approach to question summarization and multi-answer summarization in the context of the 2021 MEDIQA shared task (Ben Abacha et al., 2021). We propose two kinds of transfer learning for the abstractive summarization of medical questions. First, we train on HealthCareMagic, a large question summarization dataset collected from an online healthcare service platform. Second, we leverage the ability of the BART encoder-decoder architecture to model both generation and classification tasks to train on the task of Recognizing Question Entailment (RQE) in the medical domain. We show that both transfer learning methods combined achieve the highest ROUGE scores. Finally, we cast the question-driven extractive summarization of multiple relevant answer documents as an Answer Sentence Selection (AS2) problem. We show how we can preprocess the MEDIQA-AnS dataset such that it can be trained in an AS2 setting. Our AS2 model is able to generate extractive summaries achieving high ROUGE scores.

pdf bib
Out of Order: How important is the sequential order of words in a sentence in Natural Language Understanding tasks?
Thang Pham | Trung Bui | Long Mai | Anh Nguyen
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib
A Joint Learning Approach based on Self-Distillation for Keyphrase Extraction from Scientific Documents
Tuan Lai | Trung Bui | Doo Soon Kim | Quan Hung Tran
Proceedings of the 28th International Conference on Computational Linguistics

Keyphrase extraction is the task of extracting a small set of phrases that best describe a document. Most existing benchmark datasets for the task typically have limited numbers of annotated documents, making it challenging to train increasingly complex neural networks. In contrast, digital libraries store millions of scientific articles online, covering a wide range of topics. While a significant portion of these articles contain keyphrases provided by their authors, most other articles lack such kind of annotations. Therefore, to effectively utilize these large amounts of unlabeled articles, we propose a simple and efficient joint learning approach based on the idea of self-distillation. Experimental results show that our approach consistently improves the performance of baseline models for keyphrase extraction. Furthermore, our best models outperform previous methods for the task, achieving new state-of-the-art results on two public benchmarks: Inspec and SemEval-2017.

pdf bib
Adjusting Image Attributes of Localized Regions with Low-level Dialogue
Tzu-Hsiang Lin | Alexander Rudnicky | Trung Bui | Doo Soon Kim | Jean Oh
Proceedings of the 12th Language Resources and Evaluation Conference

Natural Language Image Editing (NLIE) aims to use natural language instructions to edit images. Since novices are inexperienced with image editing techniques, their instructions are often ambiguous and contain high-level abstractions which require complex editing steps. Motivated by this inexperience aspect, we aim to smooth the learning curve by teaching the novices to edit images using low-level command terminologies. Towards this end, we develop a task-oriented dialogue system to investigate low-level instructions for NLIE. Our system grounds language on the level of edit operations, and suggests options for users to choose from. Though compelled to express in low-level terms, user evaluation shows that 25% of users found our system easy-to-use, resonating with our motivation. Analysis shows that users generally adapt to utilizing the proposed low-level language interface. We also identified object segmentation as the key factor to user satisfaction. Our work demonstrates advantages of low-level, direct language-action mapping approach that can be applied to other problem domains beyond image editing such as audio editing or industrial design.

pdf bib
Propagate-Selector: Detecting Supporting Sentences for Question Answering via Graph Neural Networks
Seunghyun Yoon | Franck Dernoncourt | Doo Soon Kim | Trung Bui | Kyomin Jung
Proceedings of the 12th Language Resources and Evaluation Conference

In this study, we propose a novel graph neural network called propagate-selector (PS), which propagates information over sentences to understand information that cannot be inferred when considering sentences in isolation. First, we design a graph structure in which each node represents an individual sentence, and some pairs of nodes are selectively connected based on the text structure. Then, we develop an iterative attentive aggregation and a skip-combine method in which a node interacts with its neighborhood nodes to accumulate the necessary information. To evaluate the performance of the proposed approaches, we conduct experiments with the standard HotpotQA dataset. The empirical results demonstrate the superiority of our proposed approach, which obtains the best performances, compared to the widely used answer-selection models that do not consider the intersentential relationship.

pdf bib
Rethinking Self-Attention: Towards Interpretability in Neural Parsing
Khalil Mrini | Franck Dernoncourt | Quan Hung Tran | Trung Bui | Walter Chang | Ndapa Nakashole
Findings of the Association for Computational Linguistics: EMNLP 2020

Attention mechanisms have improved the performance of NLP tasks while allowing models to remain explainable. Self-attention is currently widely used, however interpretability is difficult due to the numerous attention distributions. Recent work has shown that model representations can benefit from label-specific information, while facilitating interpretation of predictions. We introduce the Label Attention Layer: a new form of self-attention where attention heads represent labels. We test our novel layer by running constituency and dependency parsing experiments and show our new model obtains new state-of-the-art results for both tasks on both the Penn Treebank (PTB) and Chinese Treebank. Additionally, our model requires fewer self-attention layers compared to existing work. Finally, we find that the Label Attention heads learn relations between syntactic categories and show pathways to analyze errors.

pdf bib
Scene Graph Modification Based on Natural Language Commands
Xuanli He | Quan Hung Tran | Gholamreza Haffari | Walter Chang | Zhe Lin | Trung Bui | Franck Dernoncourt | Nhan Dam
Findings of the Association for Computational Linguistics: EMNLP 2020

Structured representations like graphs and parse trees play a crucial role in many Natural Language Processing systems. In recent years, the advancements in multi-turn user interfaces necessitate the need for controlling and updating these structured representations given new sources of information. Although there have been many efforts focusing on improving the performance of the parsers that map text to graphs or parse trees, very few have explored the problem of directly manipulating these representations. In this paper, we explore the novel problem of graph modification, where the systems need to learn how to update an existing scene graph given a new user’s command. Our novel models based on graph-based sparse transformer and cross attention information fusion outperform previous systems adapted from the machine translation and graph generation literature. We further contribute our large graph modification datasets to the research community to encourage future research for this new problem.

pdf bib
History for Visual Dialog: Do we really need it?
Shubham Agarwal | Trung Bui | Joon-Young Lee | Ioannis Konstas | Verena Rieser
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Visual Dialogue involves “understanding” the dialogue history (what has been discussed previously) and the current question (what is asked), in addition to grounding information in the image, to accurately generate the correct response. In this paper, we show that co-attention models which explicitly encode dialoh history outperform models that don’t, achieving state-of-the-art performance (72 % NDCG on val set). However, we also expose shortcomings of the crowdsourcing dataset collection procedure, by showing that dialogue history is indeed only required for a small amount of the data, and that the current evaluation metric encourages generic replies. To that end, we propose a challenging subset (VisdialConv) of the VisdialVal set and the benchmark NDCG of 63%.

pdf bib
Efficient Deployment of Conversational Natural Language Interfaces over Databases
Anthony Colas | Trung Bui | Franck Dernoncourt | Moumita Sinha | Doo Soon Kim
Proceedings of the First Workshop on Natural Language Interfaces

Many users communicate with chatbots and AI assistants in order to help them with various tasks. A key component of the assistant is the ability to understand and answer a user’s natural language questions for question-answering (QA). Because data can be usually stored in a structured manner, an essential step involves turning a natural language question into its corresponding query language. However, in order to train most natural language-to-query-language state-of-the-art models, a large amount of training data is needed first. In most domains, this data is not available and collecting such datasets for various domains can be tedious and time-consuming. In this work, we propose a novel method for accelerating the training dataset collection for developing the natural language-to-query-language machine learning models. Our system allows one to generate conversational multi-term data, where multiple turns define a dialogue session, enabling one to better utilize chatbot interfaces. We train two current state-of-the-art NL-to-QL models, on both an SQL and SPARQL-based datasets in order to showcase the adaptability and efficacy of our created data.

pdf bib
ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT
Hwanhee Lee | Seunghyun Yoon | Franck Dernoncourt | Doo Soon Kim | Trung Bui | Kyomin Jung
Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems

In this paper, we propose an evaluation metric for image captioning systems using both image and text information. Unlike the previous methods that rely on textual representations in evaluating the caption, our approach uses visiolinguistic representations. The proposed method generates image-conditioned embeddings for each token using ViLBERT from both generated and reference texts. Then, these contextual embeddings from each of the two sentence-pair are compared to compute the similarity score. Experimental results on three benchmark datasets show that our method correlates significantly better with human judgments than all existing metrics.

pdf bib
Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation
Kang Min Yoo | Hanbit Lee | Franck Dernoncourt | Trung Bui | Walter Chang | Sang-goo Lee
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Recent works have shown that generative data augmentation, where synthetic samples generated from deep generative models complement the training dataset, benefit NLP tasks. In this work, we extend this approach to the task of dialog state tracking for goaloriented dialogs. Due to the inherent hierarchical structure of goal-oriented dialogs over utterances and related annotations, the deep generative model must be capable of capturing the coherence among different hierarchies and types of dialog features. We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented dialogs, including linguistic features and underlying structured annotations, namely speaker information, dialog acts, and goals. The proposed architecture is designed to model each aspect of goal-oriented dialogs using inter-connected latent variables and learns to generate coherent goal-oriented dialogs from the latent spaces. To overcome training issues that arise from training complex variational models, we propose appropriate training strategies. Experiments on various dialog datasets show that our model improves the downstream dialog trackers’ robustness via generative data augmentation. We also discover additional benefits of our unified approach to modeling goal-oriented dialogs – dialog response generation and user simulation, where our model outperforms previous strong baselines.

pdf bib
AutoNLU: An On-demand Cloud-based Natural Language Understanding System for Enterprises
Nham Le | Tuan Lai | Trung Bui | Doo Soon Kim
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: System Demonstrations

With the renaissance of deep learning, neural networks have achieved promising results on many natural language understanding (NLU) tasks. Even though the source codes of many neural network models are publicly available, there is still a large gap from open-sourced models to solving real-world problems in enterprises. Therefore, to fill this gap, we introduce AutoNLU, an on-demand cloud-based system with an easy-to-use interface that covers all common use-cases and steps in developing an NLU model. AutoNLU has supported many product teams within Adobe with different use-cases and datasets, quickly delivering them working models. To demonstrate the effectiveness of AutoNLU, we present two case studies. i) We build a practical NLU model for handling various image-editing requests in Photoshop. ii) We build powerful keyphrase extraction models that achieve state-of-the-art results on two public benchmarks. In both cases, end users only need to write a small amount of code to convert their datasets into a common format used by AutoNLU.

pdf bib
ISA: An Intelligent Shopping Assistant
Tuan Lai | Trung Bui | Nedim Lipka
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: System Demonstrations

Despite the growth of e-commerce, brick-and-mortar stores are still the preferred destinations for many people. In this paper, we present ISA, a mobile-based intelligent shopping assistant that is designed to improve shopping experience in physical stores. ISA assists users by leveraging advanced techniques in computer vision, speech processing, and natural language processing. An in-store user only needs to take a picture or scan the barcode of the product of interest, and then the user can talk to the assistant about the product. The assistant can also guide the user through the purchase process or recommend other similar products to the user. We take a data-driven approach in building the engines of ISA’s natural language processing component, and the engines achieve good performance.

2019

pdf bib
A Gated Self-attention Memory Network for Answer Selection
Tuan Lai | Quan Hung Tran | Trung Bui | Daisuke Kihara
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Answer selection is an important research problem, with applications in many areas. Previous deep learning based approaches for the task mainly adopt the Compare-Aggregate architecture that performs word-level comparison followed by aggregation. In this work, we take a departure from the popular Compare-Aggregate architecture, and instead, propose a new gated self-attention memory network for the task. Combined with a simple transfer learning technique from a large-scale online corpus, our model outperforms previous methods by a large margin, achieving new state-of-the-art results on two standard answer selection datasets: TrecQA and WikiQA.

pdf bib
Expressing Visual Relationships via Language
Hao Tan | Franck Dernoncourt | Zhe Lin | Trung Bui | Mohit Bansal
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Describing images with text is a fundamental problem in vision-language research. Current studies in this domain mostly focus on single image captioning. However, in various real applications (e.g., image editing, difference interpretation, and retrieval), generating relational captions for two images, can also be very useful. This important problem has not been explored mostly due to lack of datasets and effective models. To push forward the research in this direction, we first introduce a new language-guided image editing dataset that contains a large number of real image pairs with corresponding editing instructions. We then propose a new relational speaker model based on an encoder-decoder architecture with static relational attention and sequential multi-head attention. We also extend the model with dynamic relational attention, which calculates visual alignment while decoding. Our models are evaluated on our newly collected and two public datasets consisting of image pairs annotated with relationship sentences. Experimental results, based on both automatic and human evaluation, demonstrate that our model outperforms all baselines and existing methods on all the datasets.

2018

pdf bib
A Review on Deep Learning Techniques Applied to Answer Selection
Tuan Manh Lai | Trung Bui | Sheng Li
Proceedings of the 27th International Conference on Computational Linguistics

Given a question and a set of candidate answers, answer selection is the task of identifying which of the candidates answers the question correctly. It is an important problem in natural language processing, with applications in many areas. Recently, many deep learning based methods have been proposed for the task. They produce impressive performance without relying on any feature engineering or expensive external resources. In this paper, we aim to provide a comprehensive review on deep learning methods applied to answer selection.

pdf bib
The Context-Dependent Additive Recurrent Neural Net
Quan Hung Tran | Tuan Lai | Gholamreza Haffari | Ingrid Zukerman | Trung Bui | Hung Bui
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Contextual sequence mapping is one of the fundamental problems in Natural Language Processing (NLP). Here, instead of relying solely on the information presented in the text, the learning agents have access to a strong external signal given to assist the learning process. In this paper, we propose a novel family of Recurrent Neural Network unit: the Context-dependent Additive Recurrent Neural Network (CARNN) that is designed specifically to address this type of problem. The experimental results on public datasets in the dialog problem (Babi dialog Task 6 and Frame), contextual language model (Switchboard and Penn Tree Bank) and question answering (Trec QA) show that our novel CARNN-based architectures outperform previous methods.

pdf bib
A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents
Arman Cohan | Franck Dernoncourt | Doo Soon Kim | Trung Bui | Seokhwan Kim | Walter Chang | Nazli Goharian
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

Neural abstractive summarization models have led to promising results in summarizing relatively short documents. We propose the first model for abstractive summarization of single, longer-form documents (e.g., research papers). Our approach consists of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary. Empirical results on two large-scale datasets of scientific papers show that our model significantly outperforms state-of-the-art models.

pdf bib
PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering
Andrei Dulceanu | Thang Le Dinh | Walter Chang | Trung Bui | Doo Soon Kim | Manh Chien Vu | Seokhwan Kim
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
Ramesh Manuvinakurike | Jacqueline Brixey | Trung Bui | Walter Chang | Doo Soon Kim | Ron Artstein | Kallirroi Georgila
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
A Simple End-to-End Question Answering Model for Product Information
Tuan Lai | Trung Bui | Sheng Li | Nedim Lipka
Proceedings of the First Workshop on Economics and Natural Language Processing

When evaluating a potential product purchase, customers may have many questions in mind. They want to get adequate information to determine whether the product of interest is worth their money. In this paper we present a simple deep learning model for answering questions regarding product facts and specifications. Given a question and a product specification, the model outputs a score indicating their relevance. To train and evaluate our proposed model, we collected a dataset of 7,119 questions that are related to 153 different products. Experimental results demonstrate that –despite its simplicity– the performance of our model is shown to be comparable to a more complex state-of-the-art baseline.

pdf bib
DialEdit: Annotations for Spoken Conversational Image Editing
Ramesh Manuvirakurike | Jacqueline Brixey | Trung Bui | Walter Chang | Ron Artstein | Kallirroi Georgila
Proceedings 14th Joint ACL - ISO Workshop on Interoperable Semantic Annotation

pdf bib
Conversational Image Editing: Incremental Intent Identification in a New Dialogue Task
Ramesh Manuvinakurike | Trung Bui | Walter Chang | Kallirroi Georgila
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

We present “conversational image editing”, a novel real-world application domain combining dialogue, visual information, and the use of computer vision. We discuss the importance of dialogue incrementality in this task, and build various models for incremental intent identification based on deep learning and traditional classification algorithms. We show how our model based on convolutional neural networks outperforms models based on random forests, long short term memory networks, and conditional random fields. By training embeddings based on image-related dialogue corpora, we outperform pre-trained out-of-the-box embeddings, for intention identification tasks. Our experiments also provide evidence that incremental intent processing may be more efficient for the user and could save time in accomplishing tasks.

2010

pdf bib
Decision Detection Using Hierarchical Graphical Models
Trung H. Bui | Stanley Peters
Proceedings of the ACL 2010 Conference Short Papers

2009

pdf bib
Real-time decision detection in multi-party dialogue
Matthew Frampton | Jia Huang | Trung Bui | Stanley Peters
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Extracting Decisions from Multi-Party Dialogue Using Directed Graphical Models and Semantic Similarity
Trung Bui | Matthew Frampton | John Dowding | Stanley Peters
Proceedings of the SIGDIAL 2009 Conference

2007

pdf bib
Practical Dialogue Manager Development using POMDPs
Trung H. Bui | Boris van Schooten | Dennis Hofs
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue