Kaushik Ram Sadagopan


2023

pdf bib
Small Data, Big Impact: Leveraging Minimal Data for Effective Machine Translation
Jean Maillard | Cynthia Gao | Elahe Kalbassi | Kaushik Ram Sadagopan | Vedanuj Goswami | Philipp Koehn | Angela Fan | Francisco Guzman
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

For many languages, machine translation progress is hindered by the lack of reliable training data. Models are trained on whatever pre-existing datasets may be available and then augmented with synthetic data, because it is often not economical to pay for the creation of large-scale datasets. But for the case of low-resource languages, would the creation of a few thousand professionally translated sentence pairs give any benefit? In this paper, we show that it does. We describe a broad data collection effort involving around 6k professionally translated sentence pairs for each of 39 low-resource languages, which we make publicly available. We analyse the gains of models trained on this small but high-quality data, showing that it has significant impact even when larger but lower quality pre-existing corpora are used, or when data is augmented with millions of sentences through backtranslation.

2022

pdf bib
Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent
Ethan A. Chi | Ashwin Paranjape | Abigail See | Caleb Chiam | Trenton Chang | Kathleen Kenealy | Swee Kiat Lim | Amelia Hardy | Chetanya Rastogi | Haojun Li | Alexander Iyabor | Yutong He | Hari Sowrirajan | Peng Qi | Kaushik Ram Sadagopan | Nguyet Minh Phu | Dilara Soylu | Jillian Tang | Avanika Narayan | Giovanni Campagna | Christopher Manning
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

We present Chirpy Cardinal, an open-domain social chatbot. Aiming to be both informative and conversational, our bot chats with users in an authentic, emotionally intelligent way. By integrating controlled neural generation with scaffolded, hand-written dialogue, we let both the user and bot take turns driving the conversation, producing an engaging and socially fluent experience. Deployed in the fourth iteration of the Alexa Prize Socialbot Grand Challenge, Chirpy Cardinal handled thousands of conversations per day, placing second out of nine bots with an average user rating of 3.58/5.