Ciprian Chelba


2019

Dynamically Composing Domain-Data Selection with Clean-Data Selection by “Co-Curricular Learning” for Neural Machine Translation
Wei Wang | Isaac Caswell | Ciprian Chelba
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Noise and domain are important aspects of data quality for neural machine translation. Existing research focuses separately on domain-data selection, clean-data selection, or their static combination, leaving the dynamic interaction across them not explicitly examined. This paper introduces a “co-curricular learning” method to compose dynamic domain-data selection with dynamic clean-data selection, for transfer learning across both capabilities. We apply an EM-style optimization procedure to further refine the “co-curriculum”. Experiment results and analysis with two domains demonstrate the effectiveness of the method and the properties of data scheduled by the co-curriculum.

Tagged Back-Translation
Isaac Caswell | Ciprian Chelba | David Grangier
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)

Recent work in Neural Machine Translation (NMT) has shown significant quality gains from noised-beam decoding during back-translation, a method to generate synthetic parallel data. We show that the main role of such synthetic noise is not to diversify the source side, as previously suggested, but simply to indicate to the model that the given source is synthetic. We propose a simpler alternative to noising techniques, consisting of tagging back-translated source sentences with an extra token. Our results on WMT outperform noised back-translation in English-Romanian and match performance on English-German, redefining the state-of-the-art on the former.

2018

Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection
Wei Wang | Taro Watanabe | Macduff Hughes | Tetsuji Nakagawa | Ciprian Chelba
Proceedings of the Third Conference on Machine Translation: Research Papers

Measuring domain relevance of data and identifying or selecting well-fit domain data for machine translation (MT) is a well-studied topic, but denoising is not yet well studied. Denoising is concerned with a different type of data quality and tries to reduce the negative impact of data noise on MT training, in particular neural MT (NMT) training. This paper generalizes methods for measuring and selecting data for domain MT and applies them to denoising NMT training. The proposed approach uses trusted data and a denoising curriculum realized by online data selection. Intrinsic and extrinsic evaluations show that the approach is significantly effective when training NMT on data with severe noise.

2016

Sparse Non-negative Matrix Language Modeling
Joris Pelemans | Noam Shazeer | Ciprian Chelba
Transactions of the Association for Computational Linguistics, Volume 4

We present Sparse Non-negative Matrix (SNM) estimation, a novel probability estimation technique for language modeling that can efficiently incorporate arbitrary features. We evaluate SNM language models on two corpora: the One Billion Word Benchmark and a subset of the LDC English Gigaword corpus. Results show that SNM language models trained with n-gram features are a close match for the well-established Kneser-Ney models. The addition of skip-gram features yields a model that is in the same league as the state-of-the-art recurrent neural network language models, as well as complementary: combining the two modeling techniques yields the best known result on the One Billion Word Benchmark. On the Gigaword corpus further improvements are observed using features that cross sentence boundaries. The computational advantages of SNM estimation over both maximum entropy and neural network estimation are probably its main strength, promising an approach that has large flexibility in combining arbitrary features and yet scales gracefully to large amounts of data.

2012

Large-scale discriminative language model reranking for voice-search
Preethi Jyothi | Leif Johnson | Ciprian Chelba | Brian Strope
Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT

2010

Model Combination for Machine Translation
John DeNero | Shankar Kumar | Ciprian Chelba | Franz Och
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2009

Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts
Ciprian Chelba | Paul Kantor | Brian Roark
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts

2006

Towards Spoken-Document Retrieval for the Internet: Lattice Indexing For Large-Scale Web-Search Architectures
Zheng-Yu Zhou | Peng Yu | Ciprian Chelba | Frank Seide
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

Automatic Spoken Document Processing for Retrieval and Browsing
Ciprian Chelba | T. J. Hazen
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts

2005

Position Specific Posterior Lattices for Indexing Speech
Ciprian Chelba | Alex Acero
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

SPEECH OGLE: Indexing Uncertainty for Spoken Document Search
Ciprian Chelba | Alex Acero
Proceedings of the ACL Interactive Poster and Demonstration Sessions

2004

Parsing Conversational Speech Using Enhanced Segmentation
Jeremy G. Kahn | Mari Ostendorf | Ciprian Chelba
Proceedings of HLT-NAACL 2004: Short Papers

Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lot
Ciprian Chelba | Alex Acero
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

2002

A Study on Richer Syntactic Dependencies for Structured Language Modeling
Peng Xu | Ciprian Chelba | Frederick Jelinek
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

2001

Information Extraction Using the Structured Language Model
Ciprian Chelba | Milind Mahajan
Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing

1998

Exploiting Syntactic Structure for Language Modeling
Ciprian Chelba | Frederick Jelinek
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

Exploiting Syntactic Structure for Language Modeling
Ciprian Chelba | Frederick Jelinek
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

1997

A Structured Language Model
Ciprian Chelba
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics