Ryan McDonald


2021

Zero-shot Neural Passage Retrieval via Domain-targeted Synthetic Question Generation
Ji Ma | Ivan Korotkov | Yinfei Yang | Keith Hall | Ryan McDonald
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

A major obstacle to the wide-spread adoption of neural retrieval models is that they require large supervised training sets to surpass traditional term-based techniques, which are constructed from raw corpora. In this paper, we propose an approach to zero-shot learning for passage retrieval that uses synthetic question generation to close this gap. The question generation system is trained on general domain data, but is applied to documents in the targeted domain. This allows us to create arbitrarily large, yet noisy, question-passage relevance pairs that are domain specific. Furthermore, when this is coupled with a simple hybrid term-neural model, first-stage retrieval performance can be improved further. Empirically, we show that this is an effective strategy for building neural passage retrieval models in the absence of large training corpora. Depending on the domain, this technique can even approach the accuracy of supervised models.

Leveraging Type Descriptions for Zero-shot Named Entity Recognition and Classification
Rami Aly | Andreas Vlachos | Ryan McDonald
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

A common issue in real-world applications of named entity recognition and classification (NERC) is the absence of annotated data for the target entity classes during training. Zero-shot learning approaches address this issue by learning models from classes with training data that can predict classes without it. This paper presents the first approach for zero-shot NERC, introducing novel architectures that leverage the fact that textual descriptions for many entity classes occur naturally. We address the zero-shot NERC specific challenge that the not-an-entity class is not well defined as different entity classes are considered in training and testing. For evaluation, we adapt two datasets, OntoNotes and MedMentions, emulating the difficulty of real-world zero-shot learning by testing models on the rarest entity classes. Our proposed approach outperforms baselines adapted from machine reading comprehension and zero-shot text classification. Furthermore, we assess the effect of different class descriptions for this task.

Focus Attention: Promoting Faithfulness and Diversity in Summarization
Rahul Aralikatte | Shashi Narayan | Joshua Maynez | Sascha Rothe | Ryan McDonald
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Professional summaries are written with document-level information, such as the theme of the document, in mind. This is in contrast with most seq2seq decoders which simultaneously learn to focus on salient content, while deciding what to generate, at each decoding step. With the motivation to narrow this gap, we introduce Focus Attention Mechanism, a simple yet effective method to encourage decoders to proactively generate tokens that are similar or topical to the input document. Further, we propose a Focus Sampling method to enable generation of diverse summaries, an area currently understudied in summarization. When evaluated on the BBC extreme summarization task, two state-of-the-art models augmented with Focus Attention generate summaries that are closer to the target and more faithful to their input documents, outperforming their vanilla counterparts on ROUGE and multiple faithfulness measures. We also empirically demonstrate that Focus Sampling is more effective in generating diverse and faithful summaries than top-k or nucleus sampling-based decoding methods.

Proceedings of the Second Workshop on Domain Adaptation for NLP
Eyal Ben-David | Shay Cohen | Ryan McDonald | Barbara Plank | Roi Reichart | Guy Rotman | Yftah Ziser
Proceedings of the Second Workshop on Domain Adaptation for NLP

2020

BioMRC: A Dataset for Biomedical Machine Reading Comprehension
Dimitris Pappas | Petros Stavropoulos | Ion Androutsopoulos | Ryan McDonald
Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing

We introduce BIOMRC, a large-scale cloze-style biomedical MRC dataset. Care was taken to reduce noise, compared to the previous BIOREAD dataset of Pappas et al. (2018). Experiments show that simple heuristics do not perform well on the new dataset and that two neural MRC models that had been tested on BIOREAD perform much better on BIOMRC, indicating that the new dataset is indeed less noisy or at least that its task is more feasible. Non-expert human performance is also higher on the new dataset compared to BIOREAD, and biomedical experts perform even better. We also introduce a new BERT-based MRC model, the best version of which substantially outperforms all other methods tested, reaching or surpassing the accuracy of biomedical experts in some experiments. We make the new dataset available in three different sizes, also releasing our code, and providing a leaderboard.

On Faithfulness and Factuality in Abstractive Summarization
Joshua Maynez | Shashi Narayan | Bernd Bohnet | Ryan McDonald
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

It is well known that the standard likelihood training and approximate decoding objectives in neural text generation models lead to less human-like responses for open-ended tasks such as language modeling and story generation. In this paper we have analyzed limitations of these models for abstractive document summarization and found that these models are highly prone to hallucinate content that is unfaithful to the input document. We conducted a large scale human evaluation of several neural abstractive summarization systems to better understand the types of hallucinations they produce. Our human annotators found substantial amounts of hallucinated content in all model generated summaries. However, our analysis does show that pretrained models are better summarizers not only in terms of raw metrics, i.e., ROUGE, but also in generating faithful and factual summaries as evaluated by humans. Furthermore, we show that textual entailment measures better correlate with faithfulness than standard metrics, potentially leading the way to automatic evaluation metrics as well as training and decoding criteria.

Stepwise Extractive Summarization and Planning with Structured Transformers
Shashi Narayan | Joshua Maynez | Jakub Adamek | Daniele Pighin | Blaz Bratanic | Ryan McDonald
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose encoder-centric stepwise models for extractive summarization using structured transformers – HiBERT and Extended Transformers. We enable stepwise summarization by injecting the previously generated summary into the structured transformer as an auxiliary sub-structure. Our models are not only efficient in modeling the structure of long inputs, but they also do not rely on task-specific redundancy-aware modeling, making them a general purpose extractive content planner for different tasks. When evaluated on CNN/DailyMail extractive summarization, stepwise models achieve state-of-the-art performance in terms of Rouge without any redundancy aware modeling or sentence filtering. This also holds true for Rotowire table-to-text generation, where our models surpass previously reported metrics for content selection, planning and ordering, highlighting the strength of stepwise modeling. Amongst the two structured transformers we test, stepwise Extended Transformers provides the best performance across both datasets and sets a new standard for these challenges.

2019

Embedding Biomedical Ontologies by Jointly Encoding Network Structure and Textual Node Descriptors
Sotiris Kotitsas | Dimitris Pappas | Ion Androutsopoulos | Ryan McDonald | Marianna Apidianaki
Proceedings of the 18th BioNLP Workshop and Shared Task

Network Embedding (NE) methods, which map network nodes to low-dimensional feature vectors, have wide applications in network analysis and bioinformatics. Many existing NE methods rely only on network structure, overlooking other information associated with the nodes, e.g., text describing the nodes. Recent attempts to combine the two sources of information only consider local network structure. We extend NODE2VEC, a well-known NE method that considers broader network structure, to also consider textual node descriptors using recurrent neural encoders. Our method is evaluated on link prediction in two networks derived from UMLS. Experimental results demonstrate the effectiveness of the proposed approach compared to previous work.

2018

Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings
Bernd Bohnet | Ryan McDonald | Gonçalo Simões | Daniel Andor | Emily Pitler | Joshua Maynez
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The rise of neural networks, and particularly recurrent neural networks, has produced significant advances in part-of-speech tagging accuracy. One characteristic common among these models is the presence of rich initial word encodings. These encodings typically are composed of a recurrent character-based representation with dynamically and pre-trained word embeddings. However, these encodings do not consider a context wider than a single word and it is only through subsequent recurrent layers that word or sub-word information interacts. In this paper, we investigate models that use recurrent neural networks with sentence-level context for initial character and word-based representations. In particular we show that optimal results are obtained by integrating these context sensitive representations through synchronized training with a meta-model that learns to combine their states.

AUEB at BioASQ 6: Document and Snippet Retrieval
George Brokos | Polyvios Liosis | Ryan McDonald | Dimitris Pappas | Ion Androutsopoulos
Proceedings of the 6th BioASQ Workshop A challenge on large-scale biomedical semantic indexing and question answering

We present AUEB’s submissions to the BioASQ 6 document and snippet retrieval tasks (parts of Task 6b, Phase A). Our models use novel extensions to deep learning architectures that operate solely over the text of the query and candidate document/snippets. Our systems scored at the top or near the top for all batches of the challenge, highlighting the effectiveness of deep learning for these tasks.

Deep Relevance Ranking Using Enhanced Document-Query Interactions
Ryan McDonald | George Brokos | Ion Androutsopoulos
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We explore several new models for document relevance ranking, building upon the Deep Relevance Matching Model (DRMM) of Guo et al. (2016). Unlike DRMM, which uses context-insensitive encodings of terms and query-document term interactions, we inject rich context-sensitive encodings throughout our models, inspired by PACRR’s (Hui et al., 2017) convolutional n-gram matching features, but extended in several ways including multiple views of query and document inputs. We test our models on datasets from the BIOASQ question answering challenge (Tsatsaronis et al., 2015) and TREC ROBUST 2004 (Voorhees, 2005), showing they outperform BM25-based baselines, DRMM, and PACRR.

2017

Natural Language Processing with Small Feed-Forward Networks
Jan A. Botha | Emily Pitler | Ji Ma | Anton Bakalov | Alex Salcianu | David Weiss | Ryan McDonald | Slav Petrov
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models. Motivated by resource-constrained environments like mobile phones, we showcase simple techniques for obtaining such small neural network models, and investigate different tradeoffs when deciding how to allocate a small memory budget.

2016

Generalized Transition-based Dependency Parsing via Control Parameters
Bernd Bohnet | Ryan McDonald | Emily Pitler | Ji Ma
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Universal Dependencies v1: A Multilingual Treebank Collection
Joakim Nivre | Marie-Catherine de Marneffe | Filip Ginter | Yoav Goldberg | Jan Hajič | Christopher D. Manning | Ryan McDonald | Slav Petrov | Sampo Pyysalo | Natalia Silveira | Reut Tsarfaty | Daniel Zeman
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments. It is also useful for multilingual system development and comparative linguistic studies. Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. In this paper, we describe v1 of the universal guidelines, the underlying design principles, and the currently available treebanks for 33 languages.

Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning
Manaal Faruqui | Ryan McDonald | Radu Soricut
Transactions of the Association for Computational Linguistics, Volume 4

Morpho-syntactic lexicons provide information about the morphological and syntactic roles of words in a language. Such lexicons are not available for all languages and even when available, their coverage can be limited. We present a graph-based semi-supervised learning method that uses the morphological, syntactic and semantic relations between words to automatically construct wide coverage lexicons from small seed sets. Our method is language-independent, and we show that we can expand a 1000 word seed lexicon to more than 100 times its size with high quality for 11 languages. In addition, the automatically created lexicons provide features that improve performance in two downstream tasks: morphological tagging and dependency parsing.

2015

A Linear-Time Transition System for Crossing Interval Trees
Emily Pitler | Ryan McDonald
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Squibs: Constrained Arc-Eager Dependency Parsing
Joakim Nivre | Yoav Goldberg | Ryan McDonald
Computational Linguistics, Volume 40, Issue 2 - June 2014

Adapting taggers to Twitter with not-so-distant supervision
Barbara Plank | Dirk Hovy | Ryan McDonald | Anders Søgaard
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Enforcing Structural Diversity in Cube-pruned Dependency Parsing
Hao Zhang | Ryan McDonald
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Online Learning for Inexact Hypergraph Search
Hao Zhang | Liang Huang | Kai Zhao | Ryan McDonald
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Target Language Adaptation of Discriminative Transfer Parsers
Oscar Täckström | Ryan McDonald | Joakim Nivre
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging
Oscar Täckström | Dipanjan Das | Slav Petrov | Ryan McDonald | Joakim Nivre
Transactions of the Association for Computational Linguistics, Volume 1

We consider the construction of part-of-speech taggers for resource-poor languages. Recently, manually constructed tag dictionaries from Wiktionary and dictionaries projected via bitext have been used as type constraints to overcome the scarcity of annotated data in this setting. In this paper, we show that additional token constraints can be projected from a resource-rich source language to a resource-poor target language via word-aligned bitext. We present several models to this end; in particular a partially observed conditional random field model, where coupled token and type constraints provide a partial signal for training. Averaged across eight previously studied Indo-European languages, our model achieves a 25% relative error reduction over the prior state of the art. We further present successful results on seven additional languages from different families, empirically demonstrating the applicability of coupled token and type constraints across a diverse set of languages.

Universal Dependency Annotation for Multilingual Parsing
Ryan McDonald | Joakim Nivre | Yvonne Quirmbach-Brundage | Yoav Goldberg | Dipanjan Das | Kuzman Ganchev | Keith Hall | Slav Petrov | Hao Zhang | Oscar Täckström | Claudia Bedini | Núria Bertomeu Castelló | Jungmee Lee
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Generalized Higher-Order Dependency Parsing with Cube Pruning
Hao Zhang | Ryan McDonald
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

A Universal Part-of-Speech Tagset
Slav Petrov | Dipanjan Das | Ryan McDonald
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

To facilitate future research in unsupervised induction of syntactic structure and to standardize best-practices, we propose a tagset that consists of twelve universal part-of-speech categories. In addition to the tagset, we develop a mapping from 25 different treebank tagsets to this universal set. As a result, when combined with the original treebank data, this universal tagset and mapping produce a dataset consisting of common parts-of-speech for 22 different languages. We highlight the use of this resource via three experiments, that (1) compare tagging accuracies across languages, (2) present an unsupervised grammar induction approach that does not use gold standard part-of-speech tags, and (3) use the universal tags to transfer dependency parsers between languages, achieving state-of-the-art results.

Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
Oscar Täckström | Ryan McDonald | Jakob Uszkoreit
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Using Search-Logs to Improve Query Tagging
Kuzman Ganchev | Keith Hall | Ryan McDonald | Slav Petrov
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

Analyzing and Integrating Dependency Parsers
Ryan McDonald | Joakim Nivre
Computational Linguistics, Volume 37, Issue 1 - March 2011

Multi-Source Transfer of Delexicalized Dependency Parsers
Ryan McDonald | Slav Petrov | Keith Hall
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Training a Parser for Machine Translation Reordering
Jason Katz-Brown | Slav Petrov | Ryan McDonald | Franz Och | David Talbot | Hiroshi Ichikawa | Masakazu Seno | Hideto Kazawa
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Training dependency parsers by jointly optimizing multiple objectives
Keith Hall | Ryan McDonald | Jason Katz-Brown | Michael Ringgaard
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Semi-supervised latent variable models for sentence-level sentiment analysis
Oscar Täckström | Ryan McDonald
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Distributed Training Strategies for the Structured Perceptron
Ryan McDonald | Keith Hall | Gideon Mann
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

The viability of web-derived polarity lexicons
Leonid Velikovich | Sasha Blair-Goldensohn | Kerry Hannan | Ryan McDonald
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

What's great and what's not: learning to classify the scope of negation for improved sentiment analysis
Isaac Councill | Ryan McDonald | Leonid Velikovich
Proceedings of the Workshop on Negation and Speculation in Natural Language Processing

Evaluation of Dependency Parsers on Unbounded Dependencies
Joakim Nivre | Laura Rimell | Ryan McDonald | Carlos Gómez-Rodríguez
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

Sentiment Summarization: Evaluating and Learning User Preferences
Kevin Lerman | Sasha Blair-Goldensohn | Ryan McDonald
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Contrastive Summarization: An Experiment with Consumer Reviews
Kevin Lerman | Ryan McDonald
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

2008

A Joint Model of Text and Aspect Ratings for Sentiment Summarization
Ivan Titov | Ryan McDonald
Proceedings of ACL-08: HLT

Integrating Graph-Based and Transition-Based Dependency Parsers
Joakim Nivre | Ryan McDonald
Proceedings of ACL-08: HLT

2007

On the Complexity of Non-Projective Data-Driven Dependency Parsing
Ryan McDonald | Giorgio Satta
Proceedings of the Tenth International Conference on Parsing Technologies

Characterizing the Errors of Data-Driven Dependency Parsing Models
Ryan McDonald | Joakim Nivre
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

The CoNLL 2007 Shared Task on Dependency Parsing
Joakim Nivre | Johan Hall | Sandra Kübler | Ryan McDonald | Jens Nilsson | Sebastian Riedel | Deniz Yuret
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Structured Models for Fine-to-Coarse Sentiment Analysis
Ryan McDonald | Kerry Hannan | Tyler Neylon | Mike Wells | Jeff Reynar
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Domain Adaptation with Structural Correspondence Learning
John Blitzer | Ryan McDonald | Fernando Pereira
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

Multilingual Dependency Analysis with a Two-Stage Discriminative Parser
Ryan McDonald | Kevin Lerman | Fernando Pereira
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)

Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing
Ryan McDonald | Charles Sutton | Hal Daumé III | Andrew McCallum | Fernando Pereira | Jeff Bilmes
Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing

Online Learning of Approximate Dependency Parsing Algorithms
Ryan McDonald | Fernando Pereira
11th Conference of the European Chapter of the Association for Computational Linguistics

Discriminative Sentence Compression with Soft Syntactic Evidence
Ryan McDonald
11th Conference of the European Chapter of the Association for Computational Linguistics

2005

Non-Projective Dependency Parsing using Spanning Tree Algorithms
Ryan McDonald | Fernando Pereira | Kiril Ribarov | Jan Hajič
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

Flexible Text Segmentation with Structured Multilabel Classification
Ryan McDonald | Koby Crammer | Fernando Pereira
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

Online Large-Margin Training of Dependency Parsers
Ryan McDonald | Koby Crammer | Fernando Pereira
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

Simple Algorithms for Complex Relation Extraction with Applications to Biomedical IE
Ryan McDonald | Fernando Pereira | Seth Kulick | Scott Winters | Yang Jin | Pete White
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

Integrated Annotation for Biomedical Information Extraction
Seth Kulick | Ann Bies | Mark Liberman | Mark Mandel | Ryan McDonald | Martha Palmer | Andrew Schein | Lyle Ungar | Scott Winters | Pete White
HLT-NAACL 2004 Workshop: Linking Biological Literature, Ontologies and Databases
