Ankit Chadha
2023
Controlled Text Generation with Hidden Representation Transformations
Vaibhav Kumar
|
Hana Koorehdavoudi
|
Masud Moshtaghi
|
Amita Misra
|
Ankit Chadha
|
Emilio Ferrara
Findings of the Association for Computational Linguistics: ACL 2023
We propose CHRT (Control HiddenRepresentation Transformation) – a con-trolled language generation framework thatsteers large language models to generatetext pertaining to certain attributes (such astoxicity). CHRT gains attribute control bymodifying the hidden representation of thebase model through learned transformations. We employ a contrastive-learning frameworkto learn these transformations that can becombined to gain multi-attribute control. Theeffectiveness of CHRT is experimentallyshown by comparing it with seven baselinesover three attributes. CHRT outperforms all thebaselines in the task of detoxification, positivesentiment steering, and text simplificationwhile minimizing the loss in linguistic qualities. Further, our approach has the lowest inferencelatency of only 0.01 seconds more than thebase model, making it the most suitable forhigh-performance production environments. We open-source our code and release two noveldatasets to further propel controlled languagegeneration research
Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages
Shivanshu Gupta
|
Yoshitomo Matsubara
|
Ankit Chadha
|
Alessandro Moschitti
Findings of the Association for Computational Linguistics: ACL 2023
While impressive performance has been achieved on the task of Answer Sentence Selection (AS2) for English, the same does not hold for languages that lack large labeled datasets. In this work, we propose Cross-Lingual Knowledge Distillation (CLKD) from a strong English AS2 teacher as a method to train AS2 models for low-resource languages in the tasks without the need of labeled data for the target language. To evaluate our method, we introduce 1) Xtr-WikiQA, a translation-based WikiQA dataset for 9 additional languages, and 2) TyDi-AS2, a multilingual AS2 dataset with over 70K questions spanning 8 typologically diverse languages. We conduct extensive experiments on Xtr-WikiQA and TyDi-AS2 with multiple teachers, diverse monolingual and multilingual pretrained language models (PLMs) as students, and both monolingual and multilingual training. The results demonstrate that CLKD either outperforms or rivals even supervised fine-tuning with the same amount of labeled data and a combination of machine translation and the teacher model. Our method can potentially enable stronger AS2 models for low-resource languages, while TyDi-AS2 can serve as the largest multilingual AS2 dataset for further studies in the research community.
2022
Training Mixed-Domain Translation Models via Federated Learning
Peyman Passban
|
Tanya Roosta
|
Rahul Gupta
|
Ankit Chadha
|
Clement Chung
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Training mixed-domain translation models is a complex task that demands tailored architec- tures and costly data preparation techniques. In this work, we leverage federated learning (FL) in order to tackle the problem. Our investiga- tion demonstrates that with slight modifications in the training process, neural machine trans- lation (NMT) engines can be easily adapted when an FL-based aggregation is applied to fuse different domains. Experimental results also show that engines built via FL are able to perform on par with state-of-the-art baselines that rely on centralized training techniques. We evaluate our hypothesis in the presence of five datasets with different sizes, from different domains, to translate from German into English and discuss how FL and NMT can mutually benefit from each other. In addition to provid- ing benchmarking results on the union of FL and NMT, we also propose a novel technique to dynamically control the communication band- width by selecting impactful parameters during FL updates. This is a significant achievement considering the large size of NMT engines that need to be exchanged between FL parties.
Search
Fix data
Co-authors
- Clement Chung 1
- Emilio Ferrara 1
- Rahul Gupta 1
- Shivanshu Gupta 1
- Hana Koorehdavoudi 1
- show all...