Kenji Sagae

2021

Language Embeddings for Typology and Cross-lingual Transfer Learning
Dian Yu | Taiqi He | Kenji Sagae
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Cross-lingual language tasks typically require a substantial amount of annotated data or parallel translation data. We explore whether language representations that capture relationships among languages can be learned and subsequently leveraged in cross-lingual tasks without the use of parallel data. We generate dense embeddings for 29 languages using a denoising autoencoder, and evaluate the embeddings using the World Atlas of Language Structures (WALS) and two extrinsic tasks in a zero-shot setting: cross-lingual dependency parsing and cross-lingual natural language inference.

Automatically Exposing Problems with Neural Dialog Models
Dian Yu | Kenji Sagae
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Neural dialog models are known to suffer from problems such as generating unsafe and inconsistent responses. Even though these problems are crucial and prevalent, they are mostly manually identified by model designers through interactions. Recently, some research instructs crowdworkers to goad the bots into triggering such problems. However, humans leverage superficial clues such as hate speech, while leaving systematic problems undercover. In this paper, we propose two methods including reinforcement learning to automatically trigger a dialog model into generating problematic responses. We show the effect of our methods in exposing safety and contradiction issues with state-of-the-art dialog models.

Attribute Alignment: Controlling Text Generation from Pre-trained Language Models
Dian Yu | Zhou Yu | Kenji Sagae
Findings of the Association for Computational Linguistics: EMNLP 2021

Large language models benefit from training with a large amount of unlabeled text, which gives them increasingly fluent and diverse generation capabilities. However, using these models for text generation that takes into account target attributes, such as sentiment polarity or specific topics, remains a challenge. We propose a simple and flexible method for controlling text generation by aligning disentangled attribute representations. In contrast to recent efforts on training a discriminator to perturb the token level distribution for an attribute, we use the same data to learn an alignment function to guide the pre-trained, non-controlled language model to generate texts with the target attribute without changing the original language model parameters. We evaluate our method on sentiment- and topic-controlled generation, and show large performance gains over previous methods while retaining fluency and diversity.

Proceedings of the 17th International Conference on Parsing Technologies and the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies (IWPT 2021)
Stephan Oepen | Kenji Sagae | Reut Tsarfaty | Gosse Bouma | Djamé Seddah | Daniel Zeman
Proceedings of the 17th International Conference on Parsing Technologies and the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies (IWPT 2021)

2020

Tracking the Evolution of Written Language Competence in L2 Spanish Learners
Alessio Miaschi | Sam Davidson | Dominique Brunato | Felice Dell’Orletta | Kenji Sagae | Claudia Helena Sanchez-Gutierrez | Giulia Venturi
Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications

In this paper we present an NLP-based approach for tracking the evolution of written language competence in L2 Spanish learners using a wide range of linguistic features automatically extracted from students’ written productions. Beyond reporting classification results for different scenarios, we explore the connection between the most predictive features and the teaching curriculum, finding that our set of linguistic features often reflect the explicit instructions that students receive during each course.

Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies
Gosse Bouma | Yuji Matsumoto | Stephan Oepen | Kenji Sagae | Djamé Seddah | Weiwei Sun | Anders Søgaard | Reut Tsarfaty | Dan Zeman
Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies

Developing NLP Tools with a New Corpus of Learner Spanish
Sam Davidson | Aaron Yamada | Paloma Fernandez Mira | Agustina Carando | Claudia H. Sanchez Gutierrez | Kenji Sagae
Proceedings of the Twelfth Language Resources and Evaluation Conference

The development of effective NLP tools for the L2 classroom depends largely on the availability of large annotated corpora of language learner text. While annotated learner corpora of English are widely available, large learner corpora of Spanish are less common. Those Spanish corpora that are available do not contain the annotations needed to facilitate the development of tools beneficial to language learners, such as grammatical error correction. As a result, the field has seen little research in NLP tools designed to benefit Spanish language learners and teachers. We introduce COWS-L2H, a freely available corpus of Spanish learner data which includes error annotations and parallel corrected text to help researchers better understand L2 development, to examine teaching practices empirically, and to develop NLP tools to better serve the Spanish teaching community. We demonstrate the utility of this corpus by developing a neural-network based grammatical error correction system for Spanish learner writing.

2019

UC Davis at SemEval-2019 Task 1: DAG Semantic Parsing with Attention-based Decoder
Dian Yu | Kenji Sagae
Proceedings of the 13th International Workshop on Semantic Evaluation

We present an encoder-decoder model for semantic parsing with UCCA SemEval 2019 Task 1. The encoder is a Bi-LSTM and the decoder uses recursive self-attention. The proposed model alleviates challenges and feature engineering in traditional transition-based and graph-based parsers. The resulting parser is simple and proved to be effective on the semantic parsing task.

2017

Proceedings of the 15th International Conference on Parsing Technologies
Yusuke Miyao | Kenji Sagae
Proceedings of the 15th International Conference on Parsing Technologies

2016

Supertagging With LSTMs
Ashish Vaswani | Yonatan Bisk | Kenji Sagae | Ryan Musa
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Efficient Structured Inference for Transition-Based Parsing with Neural Networks and Error States
Ashish Vaswani | Kenji Sagae
Transactions of the Association for Computational Linguistics, Volume 4

Transition-based approaches based on local classification are attractive for dependency parsing due to their simplicity and speed, despite producing results slightly below the state-of-the-art. In this paper, we propose a new approach for approximate structured inference for transition-based parsing that produces scores suitable for global scoring using local models. This is accomplished with the introduction of error states in local training, which add information about incorrect derivation paths typically left out completely in locally-trained models. Using neural networks for our local classifiers, our approach achieves 93.61% accuracy for transition-based dependency parsing in English.

2015

Combining Distributed Vector Representations for Words
Justin Garten | Kenji Sagae | Volkan Ustun | Morteza Dehghani
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing

2014

Data-driven Measurement of Child Language Development with Simple Syntactic Templates
Shannon Lubetich | Kenji Sagae
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Improving Classification-Based Natural Language Understanding with Non-Expert Annotation
Fabrizio Morbini | Eric Forbell | Kenji Sagae
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

Verbal Behaviors and Persuasiveness in Online Multimedia Content
Moitreya Chatterjee | Sunghyun Park | Han Suk Shim | Kenji Sagae | Louis-Philippe Morency
Proceedings of the Second Workshop on Natural Language Processing for Social Media (SocialNLP)

2013

2012

Practical Evaluation of Human and Synthesized Speech for Virtual Human Dialogue Systems
Kallirroi Georgila | Alan Black | Kenji Sagae | David Traum
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The current practice in virtual human dialogue systems is to use professional human recordings or limited-domain speech synthesis. Both approaches lead to good performance but at a high cost. To determine the best trade-off between performance and cost, we perform a systematic evaluation of human and synthesized voices with regard to naturalness, conversational aspect, and likability. We vary the type (in-domain vs. out-of-domain), length, and content of utterances, and take into account the age and native language of raters as well as their familiarity with speech synthesis. We present detailed results from two studies, a pilot one and one run on Amazon's Mechanical Turk. Our results suggest that a professional human voice can supersede both an amateur human voice and synthesized voices. Also, a high-quality general-purpose voice or a good limited-domain voice can perform better than amateur human recordings. We do not find any significant differences between the performance of a high-quality general-purpose voice and a limited-domain voice, both trained with speech recorded by actors. As expected, the high-quality general-purpose voice is rated higher than the limited-domain voice for out-of-domain sentences and lower for in-domain sentences. There is also a trend for long or negative-content utterances to receive lower ratings.

A Mixed-Initiative Conversational Dialogue System for Healthcare
Fabrizio Morbini | Eric Forbell | David DeVault | Kenji Sagae | David Traum | Albert Rizzo
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2011

An Evaluation of Alternative Strategies for Implementing Dialogue Policies Using Statistical Classification and Hand-Authored Rules
David DeVault | Anton Leuski | Kenji Sagae
Proceedings of 5th International Joint Conference on Natural Language Processing

Joint Identification and Segmentation of Domain-Specific Dialogue Acts for Conversational Dialogue Systems
Fabrizio Morbini | Kenji Sagae
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Toward Learning and Evaluation of Dialogue Policies with Text Examples
David DeVault | Anton Leuski | Kenji Sagae
Proceedings of the SIGDIAL 2011 Conference

2010

Latent Mixture of Discriminative Experts for Multimodal Prediction Modeling
Derya Ozkan | Kenji Sagae | Louis-Philippe Morency
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Practical Evaluation of Speech Recognizers for Virtual Human Dialogue Systems
Xuchen Yao | Pravin Bhutada | Kallirroi Georgila | Kenji Sagae | Ron Artstein | David Traum
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We perform a large-scale evaluation of multiple off-the-shelf speech recognizers across diverse domains for virtual human dialogue systems. Our evaluation is aimed at speech recognition consumers and potential consumers with limited experience with readily available recognizers. We focus on practical factors to determine what levels of performance can be expected from different available recognizers in various projects featuring different types of conversational utterances. Our results show that there is no single recognizer that outperforms all other recognizers in all domains. The performance of each recognizer may vary significantly depending on the domain, the size and perplexity of the corpus, the out-of-vocabulary rate, and whether acoustic and language model adaptation has been used or not. We expect that our evaluation will prove useful to other speech recognition consumers, especially in the dialogue community, and will shed some light on the key problem in spoken dialogue systems of selecting the most suitable available speech recognition system for a particular application, and what impact training will have.

Interpretation of Partial Utterances in Virtual Human Dialogue Systems
Kenji Sagae | David DeVault | David Traum
Proceedings of the NAACL HLT 2010 Demonstration Session

Dynamic Programming for Linear-Time Incremental Parsing
Liang Huang | Kenji Sagae
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Open-domain Commonsense Reasoning Using Discourse Relations from a Corpus of Weblog Stories
Matthew Gerber | Andrew Gordon | Kenji Sagae
Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading

Self-Training without Reranking for Parser Domain Adaptation and Its Impact on Semantic Role Labeling
Kenji Sagae
Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing

2009

Towards Natural Language Understanding of Partial Speech Recognition Results in Dialogue Systems
Kenji Sagae | Gwen Christian | David DeVault | David Traum
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Analysis of Discourse Structure with Syntactic Dependencies and Data-Driven Shift-Reduce Parsing
Kenji Sagae
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

Clustering Words by Syntactic Similarity improves Dependency Parsing of Predicate-argument Structures
Kenji Sagae | Andrew S. Gordon
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

Can I Finish? Learning When to Respond to Incremental Interpretation Results in Interactive Dialogue
David DeVault | Kenji Sagae | David Traum
Proceedings of the SIGDIAL 2009 Conference

2008

Shift-Reduce Dependency DAG Parsing
Kenji Sagae | Jun’ichi Tsujii
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

GENIA-GR: a Grammatical Relation Corpus for Parser Evaluation in the Biomedical Domain
Yuka Tateisi | Yusuke Miyao | Kenji Sagae | Jun’ichi Tsujii
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We report the construction of a corpus for parser evaluation in the biomedical domain. A 50-abstract subset (492 sentences) of the GENIA corpus (Kim et al., 2003) is annotated with labeled head-dependent relations using the grammatical relations (GR) evaluation scheme (Carroll et al., 1998), which has been used for parser evaluation in the newswire domain.

Task-oriented Evaluation of Syntactic Parsers and Their Representations
Yusuke Miyao | Rune Sætre | Kenji Sagae | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of ACL-08: HLT

Evaluating the Effects of Treebank Size in a Practical Application for Parsing
Kenji Sagae | Yusuke Miyao | Rune Saetre | Jun’ichi Tsujii
Software Engineering, Testing, and Quality Assurance for Natural Language Processing

Recent research has shown that a balanced harmonic mean (F1 measure) of unigram precision and recall outperforms the widely used BLEU and NIST metrics for Machine Translation evaluation in terms of correlation with human judgments of translation quality. We show that significantly better correlations can be achieved by placing more weight on recall than on precision. While this may seem unexpected, since BLEU and NIST focus on n-gram precision and disregard recall, our experiments show that correlation with human judgments is highest when almost all of the weight is assigned to recall. We also show that stemming is significantly beneficial not just to simpler unigram precision and recall based metrics, but also to BLEU and NIST.

2004

Adding Syntactic Annotations to Transcripts of Parent-Child Dialogs
Kenji Sagae | Brian MacWhinney | Alon Lavie
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Combining Rule-based and Data-driven Techniques for Grammatical Relation Extraction in Spoken Language
Kenji Sagae | Alon Lavie
Proceedings of the Eighth International Conference on Parsing Technologies

We investigate an aspect of the relationship between parsing and corpus-based methods in NLP that has received relatively little attention: coverage augmentation in rule-based parsers. In the specific task of determining grammatical relations (such as subjects and objects) in transcribed spoken language, we show that a combination of rule-based and corpus-based approaches, where a rule-based system is used as the teacher (or an automatic data annotator) to a corpus-based system, outperforms either system in isolation.