Yashar Mehdad


2021

FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation
Kushal Lakhotia | Bhargavi Paranjape | Asish Ghoshal | Scott Yih | Yashar Mehdad | Srini Iyer
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence to sequence (seq2seq) models have proven to be very effective in jointly making predictions, as well as generating NL explanations. However, these models have many shortcomings; they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires a large amount of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts, and 3) intermediate fine-tuning on re-structured open domain QA datasets to improve few-shot performance. FiD-Ex significantly improves over prior work in terms of explanation metrics and task accuracy on five tasks from the ERASER explainability benchmark in both fully supervised and few-shot settings.

Do Explanations Help Users Detect Errors in Open-Domain QA? An Evaluation of Spoken vs. Visual Explanations
Ana Valeria González | Gagan Bansal | Angela Fan | Yashar Mehdad | Robin Jia | Srinivasan Iyer
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

MTOP: A Comprehensive Multilingual Task-Oriented Semantic Parsing Benchmark
Haoran Li | Abhinav Arora | Shuohui Chen | Anchit Gupta | Sonal Gupta | Yashar Mehdad
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Scaling semantic parsing models for task-oriented dialog systems to new languages is often expensive and time-consuming due to the lack of available datasets. Available datasets suffer from several shortcomings: a) they contain few languages, b) they contain small amounts of labeled examples per language, and c) they are based on the simple intent and slot detection paradigm for non-compositional queries. In this paper, we present a new multilingual dataset, called MTOP, comprising 100k annotated utterances in 6 languages across 11 domains. We use this dataset and other publicly available datasets to conduct a comprehensive benchmarking study on using various state-of-the-art multilingual pre-trained models for task-oriented semantic parsing. We achieve an average improvement of +6.3 points on Slot F1 for the two existing multilingual datasets, over best results reported in their experiments. Furthermore, we demonstrate strong zero-shot performance using pre-trained models combined with automatic translation and alignment, and a proposed distant supervision method to reduce the noise in slot label projection.

EASE: Extractive-Abstractive Summarization End-to-End using the Information Bottleneck Principle
Haoran Li | Arash Einolghozati | Srinivasan Iyer | Bhargavi Paranjape | Yashar Mehdad | Sonal Gupta | Marjan Ghazvininejad
Proceedings of the Third Workshop on New Frontiers in Summarization

Current abstractive summarization systems outperform their extractive counterparts, but their widespread adoption is inhibited by the inherent lack of interpretability. Extractive summarization systems, though interpretable, suffer from redundancy and possible lack of coherence. To achieve the best of both worlds, we propose EASE, an extractive-abstractive framework that generates concise abstractive summaries that can be traced back to an extractive summary. Our framework can be applied to any evidence-based text generation problem and can accommodate various pretrained models in its simple architecture. We use the Information Bottleneck principle to jointly train the extraction and abstraction in an end-to-end fashion. Inspired by previous research showing that humans use a two-stage framework to summarize long documents (Jing and McKeown, 2000), our framework first extracts a pre-defined amount of evidence spans and then generates a summary using only the evidence. Using automatic and human evaluations, we show that the generated summaries are better than strong extractive and extractive-abstractive baselines.

Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation
Alexander Fabbri | Simeng Han | Haoyuan Li | Haoran Li | Marjan Ghazvininejad | Shafiq Joty | Dragomir Radev | Yashar Mehdad
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a novel and generalizable method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner. WikiTransfer fine-tunes pretrained models on pseudo-summaries, produced from generic Wikipedia data, which contain characteristics of the target dataset, such as the length and level of abstraction of the desired summaries. WikiTransfer models achieve state-of-the-art, zero-shot abstractive summarization performance on the CNN-DailyMail dataset and demonstrate the effectiveness of our approach on three additional diverse datasets. These models are more robust to noisy data and also achieve better or comparable few-shot performance using 10 and 100 training examples when compared to few-shot transfer from other summarization datasets. To further boost performance, we employ data augmentation via round-trip translation as well as introduce a regularization term for improved few-shot transfer. To understand the role of dataset aspects in transfer performance and the quality of the resulting output summaries, we further study the effect of the components of our unsupervised fine-tuning data and analyze few-shot performance using both automatic and human evaluation.

RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering
Srinivasan Iyer | Sewon Min | Yashar Mehdad | Wen-tau Yih
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples. This training scheme possibly explains empirical observations that these models achieve a high recall amongst their top few predictions, but a low overall accuracy, motivating the need for answer re-ranking. We develop a successful re-ranking approach (RECONSIDER) for span-extraction tasks that improves upon the performance of MRC models, even beyond large-scale pre-training. RECONSIDER is trained on positive and negative examples extracted from high confidence MRC model predictions, and uses in-passage span annotations to perform span-focused re-ranking over a smaller candidate set. As a result, RECONSIDER learns to eliminate close false positives, achieving a new extractive state of the art on four QA tasks, with 45.5% Exact Match accuracy on Natural Questions with real user questions, and 61.7% on TriviaQA. We will release all related data, models, and code.

Syntax-augmented Multilingual BERT for Cross-lingual Transfer
Wasi Ahmad | Haoran Li | Kai-Wei Chang | Yashar Mehdad
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In recent years, we have seen a colossal effort in pre-training multilingual text encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning. However, due to typological differences across languages, the cross-lingual transfer is challenging. Nevertheless, language syntax, e.g., syntactic dependencies, can bridge the typological gap. Previous works have shown that pre-trained multilingual encoders, such as mBERT (CITATION), capture language syntax, helping cross-lingual transfer. This work shows that explicitly providing language syntax and training mBERT using an auxiliary objective to encode the universal dependency tree structure helps cross-lingual transfer. We perform rigorous experiments on four NLP tasks, including text classification, question answering, named entity recognition, and task-oriented semantic parsing. The experiment results show that syntax-augmented mBERT improves cross-lingual transfer on popular benchmarks, such as PAWS-X and MLQA, by 1.4 and 1.6 points on average across all languages. In the generalized transfer setting, the performance is boosted significantly, by 3.9 and 3.1 points on average on PAWS-X and MLQA.

ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining
Alexander Fabbri | Faiaz Rahman | Imad Rizvi | Borui Wang | Haoran Li | Yashar Mehdad | Dragomir Radev
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

While online conversations can cover a vast amount of information in many different formats, abstractive text summarization has primarily focused on modeling solely news articles. This research gap is due, in part, to the lack of standardized datasets for summarizing online discussions. To address this gap, we design annotation protocols motivated by an issues–viewpoints–assertions framework to crowdsource four new datasets on diverse online conversation forms of news comments, discussion forums, community question answering forums, and email threads. We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data. To create a comprehensive benchmark, we also evaluate these models on widely-used conversation summarization datasets to establish strong baselines in this domain. Furthermore, we incorporate argument mining through graph construction to directly model the issues, viewpoints, and assertions present in a conversation and filter noisy input, showing comparable or improved results according to automatic and human evaluations.

2020

Conversational Semantic Parsing
Armen Aghajanyan | Jean Maillard | Akshat Shrivastava | Keith Diedrick | Michael Haeger | Haoran Li | Yashar Mehdad | Veselin Stoyanov | Anuj Kumar | Mike Lewis | Sonal Gupta
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

The structured representation for semantic parsing in task-oriented assistant systems is geared towards simple understanding of one-turn queries. Due to the limitations of the representation, the session-based properties such as co-reference resolution and context carryover are processed downstream in a pipelined system. In this paper, we propose a semantic representation for such task-oriented conversational systems that can represent concepts such as co-reference and context carryover, enabling comprehensive understanding of queries in a session. We release a new session-based, compositional task-oriented parsing dataset of 20k sessions consisting of 60k utterances. Unlike Dialog State Tracking Challenges, the queries in the dataset have compositional forms. We propose a new family of Seq2Seq models for the session-based parsing above, which also set a new state of the art on ATIS, SNIPS, TOP and DSTC2. Notably, we improve the best known results on DSTC2 by up to 5 points for slot-carryover.

Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing
Xilun Chen | Asish Ghoshal | Yashar Mehdad | Luke Zettlemoyer | Sonal Gupta
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Task-oriented semantic parsing is a critical component of virtual assistants, responsible for understanding the user’s intents (set reminder, play music, etc.). Recent advances in deep learning have enabled several approaches to successfully parse more complex queries (Gupta et al., 2018; Rongali et al., 2020), but these models require a large amount of annotated training data to parse queries on new domains (e.g. reminder, music). In this paper, we focus on adapting task-oriented semantic parsers to low-resource domains, and propose a novel method that outperforms a supervised neural model at a 10-fold data reduction. In particular, we identify two fundamental factors for low-resource domain adaptation: better representation learning and better training techniques. Our representation learning uses BART (Lewis et al., 2019) to initialize our model, which outperforms encoder-only pre-trained representations used in previous work. Furthermore, we train with optimization-based meta-learning (Finn et al., 2017) to improve generalization to low-resource domains. This approach significantly outperforms all baseline methods in the experiments on a newly collected multi-domain task-oriented semantic parsing dataset (TOPv2), which we release to the public.

Efficient One-Pass End-to-End Entity Linking for Questions
Belinda Z. Li | Sewon Min | Srinivasan Iyer | Yashar Mehdad | Wen-tau Yih
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass. Evaluated on WebQSP and GraphQuestions with extended annotations that cover multiple entities per question, ELQ outperforms the previous state of the art by a large margin of +12.7% and +19.6% F1, respectively. With a very fast inference time (1.57 examples/s on a single CPU), ELQ can be useful for downstream question answering systems. In a proof-of-concept experiment, we demonstrate that using ELQ significantly improves the downstream QA performance of GraphRetriever.

2017

DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging
Sheng Chen | Akshay Soni | Aasish Pappu | Yashar Mehdad
Proceedings of the 2nd Workshop on Representation Learning for NLP

Tagging news articles or blog posts with relevant tags from a collection of predefined ones is referred to as document tagging in this work. Accurate tagging of articles can benefit several downstream applications such as recommendation and search. In this work, we propose a novel yet simple approach called DocTag2Vec to accomplish this task. We substantially extend Word2Vec and Doc2Vec – two popular models for learning distributed representations of words and documents. In DocTag2Vec, we simultaneously learn the representation of words, documents, and tags in a joint vector space during training, and employ the simple k-nearest neighbor search to predict tags for unseen documents. In contrast to previous multi-label learning methods, DocTag2Vec directly deals with raw text instead of a provided feature vector, and in addition, enjoys advantages like the learning of tag representation, and the ability to handle newly created tags. To demonstrate the effectiveness of our approach, we conduct experiments on several datasets and show promising results against state-of-the-art methods.

2016

Do Characters Abuse More Than Words?
Yashar Mehdad | Joel Tetreault
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

A Low-Rank Approximation Approach to Learning Joint Embeddings of News Stories and Images for Timeline Summarization
William Yang Wang | Yashar Mehdad | Dragomir R. Radev | Amanda Stent
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Extractive Summarization under Strict Length Constraints
Yashar Mehdad | Amanda Stent | Kapil Thadani | Dragomir Radev | Youssef Billawala | Karolina Buchner
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper we report a comparison of various techniques for single-document extractive summarization under strict length budgets, which is a common commercial use case (e.g. summarization of news articles by news aggregators). We show that, evaluated using ROUGE, numerous algorithms from the literature fail to beat a simple lead-based baseline for this task. However, a supervised approach with lightweight and efficient features improves over the lead-based baseline. Additional human evaluation demonstrates that the supervised approach also performs competitively with a commercial system that uses more sophisticated features.

2014

A Template-based Abstractive Meeting Summarization: Leveraging Summary and Source Text Relationships
Tatsuro Oya | Yashar Mehdad | Giuseppe Carenini | Raymond Ng
Proceedings of the 8th International Natural Language Generation Conference (INLG)

Abstractive Summarization of Product Reviews Using Discourse Structure
Shima Gerani | Yashar Mehdad | Giuseppe Carenini | Raymond T. Ng | Bita Nejat
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Abstractive Summarization of Spoken and Written Conversations Based on Phrasal Queries
Yashar Mehdad | Giuseppe Carenini | Raymond T. Ng
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis
Shafiq Joty | Giuseppe Carenini | Raymond Ng | Yashar Mehdad
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Semeval-2013 Task 8: Cross-lingual Textual Entailment for Content Synchronization
Matteo Negri | Alessandro Marchetti | Yashar Mehdad | Luisa Bentivogli | Danilo Giampiccolo
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

Towards Topic Labeling with Phrase Entailment and Aggregation
Yashar Mehdad | Giuseppe Carenini | Raymond T. Ng | Shafiq Joty
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Abstractive Meeting Summarization with Entailment and Fusion
Yashar Mehdad | Giuseppe Carenini | Frank Tompa | Raymond T. Ng
Proceedings of the 14th European Workshop on Natural Language Generation

Dialogue Act Recognition in Synchronous and Asynchronous Conversations
Maryam Tavafi | Yashar Mehdad | Shafiq Joty | Giuseppe Carenini | Raymond Ng
Proceedings of the SIGDIAL 2013 Conference

2012

Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
Yashar Mehdad | Matteo Negri | Marcello Federico
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Semeval-2012 Task 8: Cross-lingual Textual Entailment for Content Synchronization
Matteo Negri | Alessandro Marchetti | Yashar Mehdad | Luisa Bentivogli | Danilo Giampiccolo
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

FBK: Machine Translation Evaluation and Word Similarity metrics for Semantic Textual Similarity
José Guilherme Camargo de Souza | Matteo Negri | Yashar Mehdad
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

FBK: Cross-Lingual Textual Entailment Without Translation
Yashar Mehdad | Matteo Negri | José Guilherme C. de Souza
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

Chinese Whispers: Cooperative Paraphrase Acquisition
Matteo Negri | Yashar Mehdad | Alessandro Marchetti | Danilo Giampiccolo | Luisa Bentivogli
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a framework for the acquisition of sentential paraphrases based on crowdsourcing. The proposed method maximizes the lexical divergence between an original sentence s and its valid paraphrases by running a sequence of paraphrasing jobs carried out by a crowd of non-expert workers. Instead of collecting direct paraphrases of s, at each step of the sequence workers manipulate semantically equivalent reformulations produced in the previous round. We applied this method to paraphrase English sentences extracted from Wikipedia. Our results show that, by keeping at each round n the most promising paraphrases (i.e. those most lexically dissimilar from the ones acquired at round n-1), the monotonic increase of divergence makes it possible to collect good-quality paraphrases in a cost-effective manner.

Match without a Referee: Evaluating MT Adequacy without Reference Translations
Yashar Mehdad | Matteo Negri | Marcello Federico
Proceedings of the Seventh Workshop on Statistical Machine Translation

2011

Divide and Conquer: Crowdsourcing the Creation of Cross-Lingual Textual Entailment Corpora
Matteo Negri | Luisa Bentivogli | Yashar Mehdad | Danilo Giampiccolo | Alessandro Marchetti
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment
Yashar Mehdad | Matteo Negri | Marcello Federico
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Is it Worth Submitting this Run? Assess your RTE System with a Good Sparring Partner
Milen Kouylekov | Yashar Mehdad | Matteo Negri
Proceedings of the TextInfer 2011 Workshop on Textual Entailment

2010

Towards Cross-Lingual Textual Entailment
Yashar Mehdad | Matteo Negri | Marcello Federico
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Syntactic/Semantic Structures for Textual Entailment Recognition
Yashar Mehdad | Alessandro Moschitti | Fabio Massimo Zanzotto
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Creating a Bi-lingual Entailment Corpus through Translations with Mechanical Turk: $100 for a 10-day Rush
Matteo Negri | Yashar Mehdad
Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk

Mining Wikipedia for Large-scale Repositories of Context-Sensitive Entailment Rules
Milen Kouylekov | Yashar Mehdad | Matteo Negri
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper focuses on the central role played by lexical information in the task of Recognizing Textual Entailment. In particular, the usefulness of lexical knowledge extracted from several widely used static resources, represented in the form of entailment rules, is compared with a method to extract lexical information from Wikipedia as a dynamic knowledge resource. The proposed acquisition method aims at maximizing two key features of the resulting entailment rules: coverage (i.e. the proportion of rules successfully applied over a dataset of TE pairs), and context sensitivity (i.e. the proportion of rules applied in appropriate contexts). Evaluation results show that Wikipedia can be effectively used as a source of lexical entailment rules, featuring both higher coverage and context sensitivity with respect to other resources.

2009

Optimizing Textual Entailment Recognition Using Particle Swarm Optimization
Yashar Mehdad | Bernardo Magnini
Proceedings of the 2009 Workshop on Applied Textual Inference (TextInfer)

Automatic Cost Estimation for Tree Edit Distance Using Particle Swarm Optimization
Yashar Mehdad
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers