Pascal Poupart


pdf bib
RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation
Md Akmal Haidar | Nithin Anchuri | Mehdi Rezagholizadeh | Abbas Ghaddar | Philippe Langlais | Pascal Poupart
Findings of the Association for Computational Linguistics: NAACL 2022

Intermediate layer knowledge distillation (KD) can improve the standard KD technique (which only targets the output of teacher and student models) especially over large pre-trained language models. However, intermediate layer distillation suffers from excessive computational burdens and engineering efforts required for setting up a proper layer mapping. To address these problems, we propose a RAndom Intermediate Layer Knowledge Distillation (RAIL-KD) approach in which, intermediate layers from the teacher model are selected randomly to be distilled into the intermediate layers of the student model. This randomized selection enforces that all teacher layers are taken into account in the training process, while reducing the computational cost of intermediate layer distillation. Also, we show that it acts as a regularizer for improving the generalizability of the student model. We perform extensive experiments on GLUE tasks as well as on out-of-domain test sets. We show that our proposed RAIL-KD approach outperforms other state-of-the-art intermediate layer KD methods considerably in both performance and training-time.

pdf bib
WatClaimCheck: A new Dataset for Claim Entailment and Inference
Kashif Khan | Ruizhe Wang | Pascal Poupart
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We contribute a new dataset for the task of automated fact checking and an evaluation of state of the art algorithms. The dataset includes claims (from speeches, interviews, social media and news articles), review articles published by professional fact checkers and premise articles used by those professional fact checkers to support their review and verify the veracity of the claims. An important challenge in the use of premise articles is the identification of relevant passages that will help to infer the veracity of a claim. We show that transferring a dense passage retrieval model trained with review articles improves the retrieval quality of passages in premise articles. We report results for the prediction of claim veracity by inference from premise articles.


pdf bib
Variational Attention for Sequence-to-Sequence Models
Hareesh Bahuleyan | Lili Mou | Olga Vechtomova | Pascal Poupart
Proceedings of the 27th International Conference on Computational Linguistics

The variational encoder-decoder (VED) encodes source information as a set of random variables using a neural network, which in turn is decoded into target data using another neural network. In natural language processing, sequence-to-sequence (Seq2Seq) models typically serve as encoder-decoder networks. When combined with a traditional (deterministic) attention mechanism, the variational latent space may be bypassed by the attention model, and thus becomes ineffective. In this paper, we propose a variational attention mechanism for VED, where the attention vector is also modeled as Gaussian distributed random variables. Results on two experiments show that, without loss of quality, our proposed method alleviates the bypassing phenomenon as it increases the diversity of generated sentences.


pdf bib
Deep Active Learning for Dialogue Generation
Nabiha Asghar | Pascal Poupart | Xin Jiang | Hang Li
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

We propose an online, end-to-end, neural generative conversational model for open-domain dialogue. It is trained using a unique combination of offline two-phase supervised learning and online human-in-the-loop active learning. While most existing research proposes offline supervision or hand-crafted reward functions for online reinforcement, we devise a novel interactive learning mechanism based on hamming-diverse beam search for response generation and one-character user-feedback at each step. Experiments show that our model inherently promotes the generation of semantically relevant and interesting responses, and can be used to train agents with customized personas, moods and conversational styles.


pdf bib
Overfitting at SemEval-2016 Task 3: Detecting Semantically Similar Questions in Community Question Answering Forums with Word Embeddings
Hujie Wang | Pascal Poupart
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)


pdf bib
Generating Lexical Analogies Using Dependency Relations
Andy Chiu | Pascal Poupart | Chrysanne DiMarco
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)


pdf bib
Partially Observable Markov Decision Processes with Continuous Observations for Dialogue Management
Jason D. Williams | Pascal Poupart | Steve Young
Proceedings of the 6th SIGdial Workshop on Discourse and Dialogue