Masud Moshtaghi
2023
Controlled Text Generation with Hidden Representation Transformations
Vaibhav Kumar | Hana Koorehdavoudi | Masud Moshtaghi | Amita Misra | Ankit Chadha | Emilio Ferrara
Findings of the Association for Computational Linguistics: ACL 2023
We propose CHRT (Control Hidden Representation Transformation) – a controlled language generation framework that steers large language models to generate text pertaining to certain attributes (such as toxicity). CHRT gains attribute control by modifying the hidden representation of the base model through learned transformations. We employ a contrastive-learning framework to learn these transformations that can be combined to gain multi-attribute control. The effectiveness of CHRT is experimentally shown by comparing it with seven baselines over three attributes. CHRT outperforms all the baselines in the task of detoxification, positive sentiment steering, and text simplification while minimizing the loss in linguistic qualities. Further, our approach has the lowest inference latency of only 0.01 seconds more than the base model, making it the most suitable for high-performance production environments. We open-source our code and release two novel datasets to further propel controlled language generation research.
2019
Supervised and Nonlinear Alignment of Two Embedding Spaces for Dictionary Induction in Low Resourced Languages
Masud Moshtaghi
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Enabling cross-lingual NLP tasks by leveraging multilingual word embeddings has recently attracted much attention. An important motivation is to support lower-resourced languages; however, most efforts focus on demonstrating the effectiveness of the techniques using embeddings derived from languages similar to English with large parallel content. In this study, we first describe the general requirements for the success of these techniques and then present a noise-tolerant piecewise linear technique to learn a non-linear mapping between two monolingual word embedding vector spaces. We evaluate our approach on inferring bilingual dictionaries. We show that our technique outperforms the state of the art in lower-resourced settings, with an average improvement of 3.7% in precision@10 across 14 mostly low-resourced languages.
2014
A Comparative Study of Weighting Schemes for the Interpretation of Spoken Referring Expressions
Su Nam Kim | Ingrid Zukerman | Thomas Kleinbauer | Masud Moshtaghi
Proceedings of the Australasian Language Technology Association Workshop 2014