Chris Brust
2020
Simultaneous Translation and Paraphrase for Language Education
Stephen Mayhew | Klinton Bicknell | Chris Brust | Bill McDowell | Will Monroe | Burr Settles
Proceedings of the Fourth Workshop on Neural Generation and Translation
We present the task of Simultaneous Translation and Paraphrasing for Language Education (STAPLE). Given a prompt in one language, the goal is to generate a diverse set of correct translations that language learners are likely to produce. This is motivated by the need to create and maintain large, high-quality sets of acceptable translations for exercises in a language-learning application, and synthesizes work spanning machine translation, MT evaluation, automatic paraphrasing, and language education technology. We developed a novel corpus with unique properties for five languages (Hungarian, Japanese, Korean, Portuguese, and Vietnamese), and report on the results of a shared task challenge which attracted 20 teams to solve the task. In our meta-analysis, we focus on three aspects of the resulting systems: external training corpus selection, model architecture and training decisions, and decoding and filtering strategies. We find that strong systems start with a large amount of generic training data, and then fine-tune with in-domain data, sampled according to our provided learner response frequencies.
2018
Second Language Acquisition Modeling
Burr Settles | Chris Brust | Erin Gustafson | Masato Hagiwara | Nitin Madnani
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
We present the task of second language acquisition (SLA) modeling. Given a history of errors made by learners of a second language, the task is to predict errors that they are likely to make at arbitrary points in the future. We describe a large corpus of more than 7M words produced by more than 6k learners of English, Spanish, and French using Duolingo, a popular online language-learning app. Then we report on the results of a shared task challenge aimed at studying the SLA task via this corpus, which attracted 15 teams and synthesized work from various fields including cognitive science, linguistics, and machine learning.