Siddhartha Mishra
2022
Word2Box: Capturing Set-Theoretic Semantics of Words using Box Embeddings
Shib Sankar Dasgupta | Michael Boratko | Siddhartha Mishra | Shriya Atmakuri | Dhruvesh Patel | Xiang Lorraine Li | Andrew McCallum
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Learning representations of words in a continuous space is perhaps the most fundamental task in NLP; however, words interact in ways much richer than vector dot product similarity can provide. Many relationships between words can be expressed set-theoretically; for example, adjective-noun compounds (e.g. “red cars” ⊆ “cars”) and homographs (e.g. “tongue” ∩ “body” should be similar to “mouth”, while “tongue” ∩ “language” should be similar to “dialect”) have natural set-theoretic interpretations. Box embeddings are a novel region-based representation which provides the capability to perform these set-theoretic operations. In this work, we provide a fuzzy-set interpretation of box embeddings, and learn box representations of words using a set-theoretic training objective. We demonstrate improved performance on various word similarity tasks, particularly on less common words, and perform a quantitative and qualitative analysis exploring the additional unique expressivity provided by Word2Box.
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang | Swaroop Mishra | Pegah Alipoormolabashi | Yeganeh Kordi | Amirreza Mirzaei | Atharva Naik | Arjun Ashok | Arut Selvan Dhanasekaran | Anjana Arunkumar | David Stap | Eshaan Pathak | Giannis Karamanolakis | Haizhi Lai | Ishan Purohit | Ishani Mondal | Jacob Anderson | Kirby Kuznia | Krima Doshi | Kuntal Kumar Pal | Maitreya Patel | Mehrad Moradshahi | Mihir Parmar | Mirali Purohit | Neeraj Varshney | Phani Rohitha Kaza | Pulkit Verma | Ravsehaj Singh Puri | Rushang Karia | Savan Doshi | Shailaja Keyur Sampat | Siddhartha Mishra | Sujan Reddy A | Sumanta Patro | Tanay Dixit | Xudong Shen
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting, and text composition. This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions: training models to follow instructions on a subset of tasks and evaluating them on the remaining unseen ones. Furthermore, we build Tk-Instruct, a transformer model trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples). Our experiments show that Tk-Instruct outperforms existing instruction-following models such as InstructGPT by over 9% on our benchmark despite being an order of magnitude smaller. We further analyze generalization as a function of various scaling parameters, such as the number of observed tasks, the number of instances per task, and model sizes. We hope our dataset and model facilitate future progress towards more general-purpose NLP models.
Co-authors
- Sujan Reddy A 1
- Pegah Alipoormolabashi 1
- Jacob Anderson 1
- Anjana Arunkumar 1
- Arjun Ashok 1
- Shriya Atmakuri 1
- Michael Boratko 1
- Shib Sankar Dasgupta 1
- Arut Selvan Dhanasekaran 1
- Tanay Dixit 1
- Krima Doshi 1
- Savan Doshi 1
- Giannis Karamanolakis 1
- Rushang Karia 1
- Phani Rohitha Kaza 1
- Yeganeh Kordi 1
- Kirby Kuznia 1
- Haizhi Lai 1
- Xiang Lorraine Li 1
- Andrew McCallum 1
- Amirreza Mirzaei 1
- Swaroop Mishra 1
- Ishani Mondal 1
- Mehrad Moradshahi 1
- Atharva Naik 1
- Kuntal Kumar Pal 1
- Mihir Parmar 1
- Dhruvesh Patel 1
- Maitreya Patel 1
- Eshaan Pathak 1
- Sumanta Patro 1
- Ravsehaj Singh Puri 1
- Ishan Purohit 1
- Mirali Purohit 1
- Shailaja Keyur Sampat 1
- Xudong Shen 1
- David Stap 1
- Neeraj Varshney 1
- Pulkit Verma 1
- Yizhong Wang 1