Pradeep Ravikumar


Improving Compositional Generalization in Classification Tasks via Structure Annotations
Juyong Kim | Pradeep Ravikumar | Joshua Ainslie | Santiago Ontanon
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Compositional generalization is the ability to generalize systematically to a new data distribution by combining known components. Although humans seem to have a great ability to generalize compositionally, state-of-the-art neural models struggle to do so. In this work, we study compositional generalization in classification tasks and present two main contributions. First, we study ways to convert a natural language sequence-to-sequence dataset to a classification dataset that also requires compositional generalization. Second, we show that providing structural hints (specifically, providing parse trees and entity links as attention masks for a Transformer model) helps compositional generalization.
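The abstract describes feeding structural hints to a Transformer as attention masks. Below is a minimal sketch of one plausible way to encode a parse tree plus entity links as a boolean attention mask; the function name, the True-means-attend convention, and the symmetric parent/child connectivity are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def structure_attention_mask(parents, entity_links, seq_len):
    """Build a boolean attention mask from structural annotations.

    parents: parents[i] is the parse-tree parent of token i (-1 for the root).
    entity_links: (i, j) pairs linking mentions of the same entity.
    Returns a (seq_len, seq_len) array where True allows token i to attend to j.
    """
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        mask[i, i] = True                       # every token attends to itself
        p = parents[i]
        if p >= 0:
            mask[i, p] = mask[p, i] = True      # parent <-> child tree edges
    for i, j in entity_links:
        mask[i, j] = mask[j, i] = True          # entity-link edges
    return mask

# Toy example: "the cat sat" with "sat" (index 2) as root of both other tokens.
mask = structure_attention_mask(parents=[2, 2, -1], entity_links=[], seq_len=3)
```

In practice such a mask would be passed to the Transformer's self-attention so that disallowed positions are set to a large negative value before the softmax.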


Word Mover’s Embedding: From Word2Vec to Document Embedding
Lingfei Wu | Ian En-Hsu Yen | Kun Xu | Fangli Xu | Avinash Balakrishnan | Pin-Yu Chen | Pradeep Ravikumar | Michael J. Witbrock
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively less success in extending it to produce unsupervised sentence or document embeddings. Recent work has demonstrated that Word Mover’s Distance (WMD), a distance measure between documents that aligns semantically similar words, yields unprecedented KNN classification accuracy. However, WMD is expensive to compute, and it is hard to extend its use beyond a KNN classifier. In this paper, we propose the Word Mover’s Embedding (WME), a novel approach to building an unsupervised document (or sentence) embedding from pre-trained word embeddings. In our experiments on 9 benchmark text classification datasets and 22 textual similarity tasks, the proposed technique consistently matches or outperforms state-of-the-art techniques, with significantly higher accuracy on short texts.
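A hedged sketch of the random-feature idea the abstract hints at: each embedding dimension measures how far a document is from one randomly drawn "document" of word vectors, via a feature of the form exp(-gamma * WMD). For brevity this sketch substitutes the relaxed WMD lower bound (each word moves to its nearest counterpart) for exact optimal transport; the function names, the gamma value, and the Gaussian random-document sampler are illustrative assumptions rather than the paper's exact recipe.

```python
import numpy as np

def relaxed_wmd(x_vecs, y_vecs):
    """Relaxed Word Mover's Distance: each word in x moves to its nearest
    word in y, giving a cheap lower bound on the exact transport cost."""
    # pairwise Euclidean distances between the two sets of word vectors
    d = np.linalg.norm(x_vecs[:, None, :] - y_vecs[None, :, :], axis=-1)
    return d.min(axis=1).mean()

def wme_embedding(doc_vecs, random_docs, gamma=1.0):
    """One random feature per random document omega_j:
    f_j(x) = exp(-gamma * WMD(x, omega_j)), scaled by 1/sqrt(R)."""
    feats = [np.exp(-gamma * relaxed_wmd(doc_vecs, w)) for w in random_docs]
    return np.array(feats) / np.sqrt(len(random_docs))

# Toy usage with random vectors standing in for pre-trained word embeddings
# (in practice these would come from Word2Vec or GloVe).
rng = np.random.default_rng(0)
doc = rng.normal(size=(5, 50))                        # 5 words, 50-dim vectors
omegas = [rng.normal(size=(rng.integers(2, 6), 50))   # R = 16 random documents
          for _ in range(16)]
emb = wme_embedding(doc, omegas)                      # 16-dim document embedding
```

Documents embedded this way can be compared with ordinary inner products, which is what lets the method move beyond the KNN classifier that raw WMD is tied to.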