Long Duong

Also published as: Long Duong Thanh


2019

An adaptable task-oriented dialog system for stand-alone embedded devices
Long Duong | Vu Cong Duy Hoang | Tuyen Quang Pham | Yu-Heng Hong | Vladislavs Dovgalecs | Guy Bashkansky | Jason Black | Andrew Bleeker | Serge Le Huitouze | Mark Johnson
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

This paper describes a spoken-language end-to-end task-oriented dialogue system for small embedded devices such as home appliances. While the current system implements a smart alarm clock with advanced calendar scheduling functionality, it is designed to be easy to port to other application domains (e.g., the dialogue component factors out domain-specific execution from domain-general actions such as requesting and updating slot values). The system does not require internet connectivity because all components, including speech recognition, natural language understanding, dialogue management, execution and text-to-speech, run locally on the embedded device (our demo uses a Raspberry Pi). This simplifies deployment, minimizes server costs and, most importantly, eliminates user privacy risks. A demo video for the alarm domain is available at youtu.be/N3IBMGocvHU

2018

Active learning for deep semantic parsing
Long Duong | Hadi Afshar | Dominique Estival | Glen Pink | Philip Cohen | Mark Johnson
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Semantic parsing requires training data that is expensive and slow to collect. We apply active learning to both traditional and “overnight” data collection approaches. We show that it is possible to obtain good training hyperparameters from seed data that is only a small fraction of the full dataset. We show that uncertainty sampling based on the least confidence score is competitive in traditional data collection but not applicable to overnight collection. We propose several active learning strategies for overnight data collection and show that the best-performing example selection strategy differs by domain.

2017

Multilingual Semantic Parsing And Code-Switching
Long Duong | Hadi Afshar | Dominique Estival | Glen Pink | Philip Cohen | Mark Johnson
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Extending semantic parsing systems to new domains and languages is a highly expensive, time-consuming process, so making effective use of existing resources is critical. In this paper, we describe a transfer learning method using crosslingual word embeddings in a sequence-to-sequence model. On the NLmaps corpus, our approach achieves state-of-the-art accuracy of 85.7% for English. Most importantly, we observed a consistent improvement for German compared with several baseline domain adaptation techniques. As a by-product of this approach, our models that are trained on a combination of English and German utterances perform reasonably well on code-switching utterances, which contain a mixture of English and German, even though the training data does not contain any such utterances. As far as we know, this is the first study of code-switching in semantic parsing. We manually constructed the set of code-switching test utterances for the NLmaps corpus and achieve 78.3% accuracy on this dataset.

Multilingual Training of Crosslingual Word Embeddings
Long Duong | Hiroshi Kanayama | Tengfei Ma | Steven Bird | Trevor Cohn
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Crosslingual word embeddings represent lexical items from different languages using the same vector space, enabling crosslingual transfer. Most prior work constructs embeddings for a pair of languages, with English on one side. We investigate methods for building high-quality crosslingual word embeddings for many languages in a unified vector space. In this way, we can exploit and combine the strengths of many languages. We obtained high performance on bilingual lexicon induction, monolingual similarity and crosslingual document classification tasks.

2016

UniMelb at SemEval-2016 Task 3: Identifying Similar Questions by combining a CNN with String Similarity Measures
Timothy Baldwin | Huizhi Liang | Bahar Salehi | Doris Hoogeveen | Yitong Li | Long Duong
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages
Antonios Anastasopoulos | David Chiang | Long Duong
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Learning Crosslingual Word Embeddings without Bilingual Corpora
Long Duong | Hiroshi Kanayama | Tengfei Ma | Steven Bird | Trevor Cohn
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

An Attentional Model for Speech Translation Without Transcription
Long Duong | Antonios Anastasopoulos | David Chiang | Steven Bird | Trevor Cohn
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2015

Cross-lingual Transfer for Unsupervised Dependency Parsing Without Parallel Data
Long Duong | Trevor Cohn | Steven Bird | Paul Cook
Proceedings of the Nineteenth Conference on Computational Natural Language Learning

A Neural Network Model for Low-Resource Universal Dependency Parsing
Long Duong | Trevor Cohn | Steven Bird | Paul Cook
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Low Resource Dependency Parsing: Cross-lingual Parameter Sharing in a Neural Network Parser
Long Duong | Trevor Cohn | Steven Bird | Paul Cook
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

Automatic Identification of Expressions of Locations in Tweet Messages using Conditional Random Fields
Fei Liu | Afshin Rahimi | Bahar Salehi | Miji Choi | Ping Tan | Long Duong
Proceedings of the Australasian Language Technology Association Workshop 2014

Exploring Methods and Resources for Discriminating Similar Languages
Marco Lui | Ned Letcher | Oliver Adams | Long Duong | Paul Cook | Timothy Baldwin
Proceedings of the First Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects

What Can We Get From 1000 Tokens? A Case Study of Multilingual POS Tagging For Resource-Poor Languages
Long Duong | Trevor Cohn | Karin Verspoor | Steven Bird | Paul Cook
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Simpler unsupervised POS tagging with bilingual projections
Long Duong | Paul Cook | Steven Bird | Pavel Pecina
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Increasing the Quality and Quantity of Source Language Data for Unsupervised Cross-Lingual POS Tagging
Long Duong | Paul Cook | Steven Bird | Pavel Pecina
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

Automatic sentence classifier using sentence ordering features for Event Based Medicine: Shared task system description
Spandana Gella | Long Duong Thanh
Proceedings of the Australasian Language Technology Association Workshop 2012