Francis Ferraro


Learning a Reversible Embedding Mapping using Bi-Directional Manifold Alignment
Ashwinkumar Ganesan | Francis Ferraro | Tim Oates
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Event Representation with Sequential, Semi-Supervised Discrete Variables
Mehdi Rezaee | Francis Ferraro
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Within the context of event modeling and understanding, we propose a new method for neural sequence modeling that takes partially observed sequences of discrete, external knowledge into account. We construct a sequential neural variational autoencoder, which uses Gumbel-Softmax reparametrization within a carefully defined encoder, to allow for successful backpropagation during training. The core idea is to allow semi-supervised external discrete knowledge to guide, but not restrict, the variational latent parameters during training. Our experiments indicate that our approach not only outperforms multiple baselines and the state of the art in narrative script induction, but also converges more quickly.
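The Gumbel-Softmax reparametrization mentioned in the abstract above can be illustrated with a minimal, generic sketch. This is not the authors' encoder: the function name `gumbel_softmax_sample`, the `temperature` default, and the example logits are assumptions made purely for illustration. The sketch only shows the standard trick of adding Gumbel noise to logits and applying a temperature-scaled softmax, which yields a relaxed one-hot vector that is differentiable with respect to the logits.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample over len(logits) categories.

    Illustrative only: names and defaults are assumptions, not taken
    from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    noisy = []
    for logit in logits:
        u = rng.random()
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise via inverse transform sampling.
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)

    # Numerically stable softmax over the perturbed, scaled logits.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    random.seed(0)
    print(gumbel_softmax_sample([1.0, 2.0, 0.5], temperature=0.5))
```

Lower temperatures push the sample toward a hard one-hot vector, while higher temperatures make it closer to uniform; in a neural model the same computation would be written with an automatic differentiation library so that gradients flow back to the logits.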

Locality Preserving Loss: Neighbors that Live together, Align together
Ashwinkumar Ganesan | Francis Ferraro | Tim Oates
Proceedings of the Second Workshop on Domain Adaptation for NLP

We present a locality preserving loss (LPL) that improves the alignment between vector space embeddings while separating uncorrelated representations. Given two pretrained embedding manifolds, LPL optimizes a model to project an embedding and maintain its local neighborhood while aligning one manifold to another. This reduces the overall size of the dataset required to align the two in tasks such as cross-lingual word alignment. We show that the LPL-based alignment between input vector spaces acts as a regularizer, leading to better and more consistent accuracy than the baseline, especially when the size of the training set is small. We demonstrate the effectiveness of LPL-optimized alignment on semantic text similarity (STS), natural language inference (SNLI), multi-genre language inference (MNLI), and cross-lingual word alignment (CLA), showing consistent improvements of up to 16% over our baseline in lower-resource settings.


On the Complementary Nature of Knowledge Graph Embedding, Fine Grain Entity Types, and Language Modeling
Rajat Patel | Francis Ferraro
Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures

We demonstrate the complementary natures of neural knowledge graph embedding, fine-grain entity type prediction, and neural language modeling. We show that a language model-inspired knowledge graph embedding approach yields both improved knowledge graph embeddings and fine-grain entity type representations. Our work also shows that jointly modeling both structured knowledge tuples and language improves both.

The Universal Decompositional Semantics Dataset and Decomp Toolkit
Aaron Steven White | Elias Stengel-Eskin | Siddharth Vashishtha | Venkata Subrahmanyan Govindarajan | Dee Ann Reisinger | Tim Vieira | Keisuke Sakaguchi | Sheng Zhang | Francis Ferraro | Rachel Rudinger | Kyle Rawlins | Benjamin Van Durme
Proceedings of the 12th Language Resources and Evaluation Conference

We present the Universal Decompositional Semantics (UDS) dataset (v1.0), which is bundled with the Decomp toolkit (v0.1). UDS1.0 unifies five high-quality, decompositional semantics-aligned annotation sets within a single semantic graph specification—with graph structures defined by the predicative patterns produced by the PredPatt tool and real-valued node and edge attributes constructed using sophisticated normalization procedures. The Decomp toolkit provides a suite of Python 3 tools for querying UDS graphs using SPARQL. Both UDS1.0 and Decomp0.1 are publicly available at


¿Es un plátano? Exploring the Application of a Physically Grounded Language Acquisition System to Spanish
Caroline Kery | Francis Ferraro | Cynthia Matuszek
Proceedings of the Combined Workshop on Spatial Language Understanding (SpLU) and Grounded Communication for Robotics (RoboNLP)

In this paper we describe a multilingual grounded language learning system adapted from an English-only system. This system learns the meaning of words used in crowd-sourced descriptions by grounding them in the physical representations of the objects they are describing. Our work presents a framework to compare the performance of the system when applied to a new language and to identify modifications necessary to attain equal performance, with the goal of enhancing the ability of robots to learn language from a more diverse range of people. We then demonstrate this system with Spanish, through first analyzing the performance of translated Spanish, and then extending this analysis to a new corpus of crowd-sourced Spanish language data. We find that with small modifications, the system is able to learn color, object, and shape words with comparable performance between languages.

Proceedings of the Second Workshop on Storytelling
Francis Ferraro | Ting-Hao ‘Kenneth’ Huang | Stephanie M. Lukin | Margaret Mitchell
Proceedings of the Second Workshop on Storytelling


Proceedings of the First Workshop on Storytelling
Margaret Mitchell | Ting-Hao ‘Kenneth’ Huang | Francis Ferraro | Ishan Misra
Proceedings of the First Workshop on Storytelling

Team UMBC-FEVER: Claim verification using Semantic Lexical Resources
Ankur Padia | Francis Ferraro | Tim Finin
Proceedings of the First Workshop on Fact Extraction and VERification (FEVER)

We describe our system used in the 2018 FEVER shared task. The system employed a frame-based information retrieval approach to select Wikipedia sentences providing evidence and used a two-layer multilayer perceptron to classify a claim as correct or not. Our submission achieved a score of 0.3966 on the Evidence F1 metric, with an accuracy of 44.79% and a FEVER score of 0.2628 F1 points.

UMBC at SemEval-2018 Task 8: Understanding Text about Malware
Ankur Padia | Arpita Roy | Taneeya Satyapanich | Francis Ferraro | Shimei Pan | Youngja Park | Anupam Joshi | Tim Finin
Proceedings of The 12th International Workshop on Semantic Evaluation

We describe the systems developed by the UMBC team for 2018 SemEval Task 8, SecureNLP (Semantic Extraction from CybersecUrity REports using Natural Language Processing). We participated in three of the sub-tasks: (1) classifying sentences as being relevant or irrelevant to malware, (2) predicting token labels for sentences, and (4) predicting attribute labels from the Malware Attribute Enumeration and Characterization vocabulary for defining malware characteristics. We achieve F1 scores of 50.34/18.0 (dev/test), 22.23 (test-data), and 31.98 (test-data) for Task 1, Task 2, and Task 4, respectively. We also make our cybersecurity embeddings publicly available at


Frame-Based Continuous Lexical Semantics through Exponential Family Tensor Factorization and Semantic Proto-Roles
Francis Ferraro | Adam Poliak | Ryan Cotterell | Benjamin Van Durme
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

We study how different frame annotations complement one another when learning continuous lexical semantics. We learn the representations from a tensorized skip-gram model that consistently encodes syntactic-semantic content better, with multiple 10% gains over baselines.


Visual Storytelling
Ting-Hao Kenneth Huang | Francis Ferraro | Nasrin Mostafazadeh | Ishan Misra | Aishwarya Agrawal | Jacob Devlin | Ross Girshick | Xiaodong He | Pushmeet Kohli | Dhruv Batra | C. Lawrence Zitnick | Devi Parikh | Lucy Vanderwende | Michel Galley | Margaret Mitchell
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies


Semantic Proto-Roles
Drew Reisinger | Rachel Rudinger | Francis Ferraro | Craig Harman | Kyle Rawlins | Benjamin Van Durme
Transactions of the Association for Computational Linguistics, Volume 3

We present the first large-scale, corpus-based verification of Dowty’s seminal theory of proto-roles. Our results demonstrate both the need for and the feasibility of a property-based annotation scheme of semantic relationships, as opposed to the currently dominant notion of categorical roles.

A Survey of Current Datasets for Vision and Language Research
Francis Ferraro | Nasrin Mostafazadeh | Ting-Hao Huang | Lucy Vanderwende | Jacob Devlin | Michel Galley | Margaret Mitchell
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Script Induction as Language Modeling
Rachel Rudinger | Pushpendre Rastogi | Francis Ferraro | Benjamin Van Durme
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Topic Identification and Discovery on Text and Speech
Chandler May | Francis Ferraro | Alan McCree | Jonathan Wintrode | Daniel Garcia-Romero | Benjamin Van Durme
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

A Concrete Chinese NLP Pipeline
Nanyun Peng | Francis Ferraro | Mo Yu | Nicholas Andrews | Jay DeYoung | Max Thomas | Matthew R. Gormley | Travis Wolfe | Craig Harman | Benjamin Van Durme | Mark Dredze
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations


A Virtual Manipulative for Learning Log-Linear Models
Francis Ferraro | Jason Eisner
Proceedings of the Fourth Workshop on Teaching NLP and CL


Toward Tree Substitution Grammars with Latent Annotations
Francis Ferraro | Benjamin Van Durme | Matt Post
Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure

Judging Grammaticality with Count-Induced Tree Substitution Grammars
Francis Ferraro | Matt Post | Benjamin Van Durme
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP