Sarath Chandar


2022

pdf bib
Local Structure Matters Most: Perturbation Study in NLU
Louis Clouatre | Prasanna Parthasarathi | Amal Zouaq | Sarath Chandar
Findings of the Association for Computational Linguistics: ACL 2022

Recent research analyzing the sensitivity of natural language understanding models to word-order perturbations has shown that neural models are surprisingly insensitive to the order of words.In this paper, we investigate this phenomenon by developing order-altering perturbations on the order of words, subwords, and characters to analyze their effect on neural models’ performance on language understanding tasks.We experiment with measuring the impact of perturbations to the local neighborhood of characters and global position of characters in the perturbed texts and observe that perturbation functions found in prior literature only affect the global ordering while the local ordering remains relatively unperturbed.We empirically show that neural models, invariant of their inductive biases, pretraining scheme, or the choice of tokenization, mostly rely on the local structure of text to build understanding and make limited use of the global structure.

pdf bib
Local Structure Matters Most in Most Languages
Louis Clouatre | Prasanna Parthasarathi | Amal Zouaq | Sarath Chandar
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Many recent perturbation studies have found unintuitive results on what does and does not matter when performing Natural Language Understanding (NLU) tasks in English. Coding properties, such as the order of words, can often be removed through shuffling without impacting downstream performances. Such insight may be used to direct future research into English NLP models. As many improvements in multilingual settings consist of wholesale adaptation of English approaches, it is important to verify whether those studies replicate or not in multilingual settings. In this work, we replicate a study on the importance of local structure, and the relative unimportance of global structure, in a multilingual setting. We find that the phenomenon observed on the English language broadly translates to over 120 languages, with a few caveats.

2021

pdf bib
A Survey of Data Augmentation Approaches for NLP
Steven Y. Feng | Varun Gangal | Jason Wei | Sarath Chandar | Soroush Vosoughi | Teruko Mitamura | Eduard Hovy
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
MLMLM: Link Prediction with Mean Likelihood Masked Language Model
Louis Clouatre | Philippe Trempe | Amal Zouaq | Sarath Chandar
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss
Prasanna Parthasarathi | Mohamed Abdelsalam | Sarath Chandar | Joelle Pineau
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue

Neural models trained for next utterance generation in dialogue task learn to mimic the n-gram sequences in the training set with training objectives like negative log-likelihood (NLL) or cross-entropy. Such commonly used training objectives do not foster generating alternate responses to a context. But, the effects of minimizing an alternate training objective that fosters a model to generate alternate response and score it on semantic similarity has not been well studied. We hypothesize that a language generation model can improve on its diversity by learning to generate alternate text during training and minimizing a semantic loss as an auxiliary objective. We explore this idea on two different sized data sets on the task of next utterance generation in goal oriented dialogues. We make two observations (1) minimizing a semantic objective improved diversity in responses in the smaller data set (Frames) but only as-good-as minimizing the NLL in the larger data set (MultiWoZ) (2) large language model embeddings can be more useful as a semantic loss objective than as initialization for token embeddings.

pdf bib
Do Encoder Representations of Generative Dialogue Models have sufficient summary of the Information about the task ?
Prasanna Parthasarathi | Joelle Pineau | Sarath Chandar
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue

Predicting the next utterance in dialogue is contingent on encoding of users’ input text to generate appropriate and relevant response in data-driven approaches. Although the semantic and syntactic quality of the language generated is evaluated, more often than not, the encoded representation of input is not evaluated. As the representation of the encoder is essential for predicting the appropriate response, evaluation of encoder representation is a challenging yet important problem. In this work, we showcase evaluating the text generated through human or automatic metrics is not sufficient to appropriately evaluate soundness of the language understanding of dialogue models and, to that end, propose a set of probe tasks to evaluate encoder representation of different language encoders commonly used in dialogue models. From experiments, we observe that some of the probe tasks are easier and some are harder for even sophisticated model architectures to learn. And, through experiments we observe that RNN based architectures have lower performance on automatic metrics on text generation than transformer model but perform better than the transformer model on the probe tasks indicating that RNNs might preserve task information better than the Transformers.

2019

pdf bib
Structure Learning for Neural Module Networks
Vardaan Pahuja | Jie Fu | Sarath Chandar | Christopher Pal
Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

Neural Module Networks, originally proposed for the task of visual question answering, are a class of neural network architectures that involve human-specified neural modules, each designed for a specific form of reasoning. In current formulations of such networks only the parameters of the neural modules and/or the order of their execution is learned. In this work, we further expand this approach and also learn the underlying internal structure of modules in terms of the ordering and combination of simple and elementary arithmetic operators. We utilize a minimum amount of prior knowledge from the human-specified neural modules in the form of different input types and arithmetic operators used in these modules. Our results show that one is indeed able to simultaneously learn both internal module structure and module sequencing without extra supervisory signals for module execution sequencing. With this approach, we report performance comparable to models using hand-designed modules. In addition, we do a analysis of sensitivity of the learned modules w.r.t. the arithmetic operations and infer the analytical expressions of the learned modules.

pdf bib
Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study
Chinnadhurai Sankar | Sandeep Subramanian | Chris Pal | Sarath Chandar | Yoshua Bengio
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Neural generative models have been become increasingly popular when building conversational agents. They offer flexibility, can be easily adapted to new domains, and require minimal domain engineering. A common criticism of these systems is that they seldom understand or use the available dialog history effectively. In this paper, we take an empirical approach to understanding how these models use the available dialog history by studying the sensitivity of the models to artificially introduced unnatural changes or perturbations to their context at test time. We experiment with 10 different types of perturbations on 4 multi-turn dialog datasets and find that commonly used neural dialog architectures like recurrent and transformer-based seq2seq models are rarely sensitive to most perturbations such as missing or reordering utterances, shuffling words, etc. Also, by open-sourcing our code, we believe that it will serve as a useful diagnostic tool for evaluating dialog systems in the future.

pdf bib
Towards Lossless Encoding of Sentences
Gabriele Prato | Mathieu Duchesneau | Sarath Chandar | Alain Tapp
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

A lot of work has been done in the field of image compression via machine learning, but not much attention has been given to the compression of natural language. Compressing text into lossless representations while making features easily retrievable is not a trivial task, yet has huge benefits. Most methods designed to produce feature rich sentence embeddings focus solely on performing well on downstream tasks and are unable to properly reconstruct the original sequence from the learned embedding. In this work, we propose a near lossless method for encoding long sequences of texts as well as all of their sub-sequences into feature rich representations. We test our method on sentiment analysis and show good performance across all sub-sentence and sentence embeddings.

2017

pdf bib
Memory Augmented Neural Networks for Natural Language Processing
Caglar Gulcehre | Sarath Chandar
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Designing of general-purpose learning algorithms is a long-standing goal of artificial intelligence. A general purpose AI agent should be able to have a memory that it can store and retrieve information from. Despite the success of deep learning in particular with the introduction of LSTMs and GRUs to this area, there are still a set of complex tasks that can be challenging for conventional neural networks. Those tasks often require a neural network to be equipped with an explicit, external memory in which a larger, potentially unbounded, set of facts need to be stored. They include but are not limited to, reasoning, planning, episodic question-answering and learning compact algorithms. Recently two promising approaches based on neural networks to this type of tasks have been proposed: Memory Networks and Neural Turing Machines.In this tutorial, we will give an overview of this new paradigm of “neural networks with memory”. We will present a unified architecture for Memory Augmented Neural Networks (MANN) and discuss the ways in which one can address the external memory and hence read/write from it. Then we will introduce Neural Turing Machines and Memory Networks as specific instantiations of this general architecture. In the second half of the tutorial, we will focus on recent advances in MANN which focus on the following questions: How can we read/write from an extremely large memory in a scalable way? How can we design efficient non-linear addressing schemes? How can we do efficient reasoning using large scale memory and an episodic memory? The answer to any one of these questions introduces a variant of MANN. We will conclude the tutorial with several open challenges in MANN and its applications to NLP.We will introduce several applications of MANN in NLP throughout the tutorial. Few examples include language modeling, question answering, visual question answering, and dialogue systems.For updated information and material, please refer to our tutorial website: https://sites.google.com/view/mann-emnlp2017/.

2016

pdf bib
Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus
Iulian Vlad Serban | Alberto García-Durán | Caglar Gulcehre | Sungjin Ahn | Sarath Chandar | Aaron Courville | Yoshua Bengio
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning
Janarthanan Rajendran | Mitesh M. Khapra | Sarath Chandar | Balaraman Ravindran
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Multilingual Multimodal Language Processing Using Neural Networks
Mitesh M Khapra | Sarath Chandar
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts

pdf bib
A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation
Amrita Saha | Mitesh M. Khapra | Sarath Chandar | Janarthanan Rajendran | Kyunghyun Cho
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Interlingua based Machine Translation (MT) aims to encode multiple languages into a common linguistic representation and then decode sentences in multiple target languages from this representation. In this work we explore this idea in the context of neural encoder decoder architectures, albeit on a smaller scale and without MT as the end goal. Specifically, we consider the case of three languages or modalities X, Z and Y wherein we are interested in generating sequences in Y starting from information available in X. However, there is no parallel training data available between X and Y but, training data is available between X & Z and Z & Y (as is often the case in many real world applications). Z thus acts as a pivot/bridge. An obvious solution, which is perhaps less elegant but works very well in practice is to train a two stage model which first converts from X to Z and then from Z to Y. Instead we explore an interlingua inspired solution which jointly learns to do the following (i) encode X and Z to a common representation and (ii) decode Y from this common representation. We evaluate our model on two tasks: (i) bridge transliteration and (ii) bridge captioning. We report promising results in both these applications and believe that this is a right step towards truly interlingua inspired encoder decoder architectures.