Sameer Pradhan

Also published as: S. Pradhan, Sameer S. Pradhan


2023

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts
Fabio Massimo Zanzotto | Sameer Pradhan
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts

Incorporating Singletons and Mention-based Features in Coreference Resolution via Multi-task Learning for Better Generalization
Yilun Zhu | Siyao Peng | Sameer Pradhan | Amir Zeldes
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)

Proceedings of The Sixth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2023)
Maciej Ogrodniczuk | Vincent Ng | Sameer Pradhan | Massimo Poesio
Proceedings of The Sixth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2023)

The Universal Anaphora Scorer 2.0
Juntao Yu | Michal Novák | Abdulrahman Aloraini | Nafise Sadat Moosavi | Silviu Paun | Sameer Pradhan | Massimo Poesio
Proceedings of the 15th International Conference on Computational Semantics

The aim of the Universal Anaphora initiative is to push forward the state of the art both in anaphora (coreference) annotation and in the evaluation of models for anaphora resolution. The first release of the Universal Anaphora Scorer (Yu et al., 2022b) supported the scoring not only of identity anaphora as in the Reference Coreference Scorer (Pradhan et al., 2014) but also of split antecedent anaphoric reference, bridging references, and discourse deixis. That scorer was used in the CODI-CRAC 2021/2022 Shared Tasks on Anaphora Resolution in Dialogues (Khosla et al., 2021; Yu et al., 2022a). A modified version of the scorer supporting discontinuous markables and the COREFUD markup format was also used in the CRAC 2022 Shared Task on Multilingual Coreference Resolution (Zabokrtsky et al., 2022). In this paper, we introduce the second release of the scorer, merging the two previous versions, which can score reference with discontinuous markables and zero anaphora resolution.

2022

The Universal Anaphora Scorer
Juntao Yu | Sopan Khosla | Nafise Sadat Moosavi | Silviu Paun | Sameer Pradhan | Massimo Poesio
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The aim of the Universal Anaphora initiative is to push forward the state of the art in anaphora and anaphora resolution by expanding the aspects of anaphoric interpretation which are or can be reliably annotated in anaphoric corpora, producing unified standards to annotate and encode these annotations, delivering datasets encoded according to these standards, and developing methods for evaluating models carrying out this type of interpretation. Such expansion of the scope of anaphora resolution requires a comparable expansion of the scope of the scorers used to evaluate this work. In this paper, we introduce an extended version of the Reference Coreference Scorer (Pradhan et al., 2014) that can be used to evaluate the extended range of anaphoric interpretation included in the current Universal Anaphora proposal. The UA scorer supports the evaluation of identity anaphora resolution and of bridging reference resolution, for which scorers already existed but were not integrated in a single package. It also supports the evaluation of split antecedent anaphora and discourse deixis, for which no tools existed. The proposed approach to the evaluation of split antecedent anaphora is entirely novel; the proposed approach to the evaluation of discourse deixis leverages the encoding of discourse deixis proposed in Universal Anaphora to enable the use for discourse deixis of the same metrics already used for identity anaphora. The scorer was tested in the recent CODI-CRAC 2021 Shared Task on Anaphora Resolution in Dialogues.

Joint Coreference Resolution for Zeros and non-Zeros in Arabic
Abdulrahman Aloraini | Sameer Pradhan | Massimo Poesio
Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)

Most existing proposals about anaphoric zero pronoun (AZP) resolution regard full mention coreference and AZP resolution as two independent tasks, even though the two tasks are clearly related. The main issues that need tackling to develop a joint model for zero and non-zero mentions are the difference between the two types of arguments (zero pronouns, being null, provide no nominal information) and the lack of annotated datasets of a suitable size in which both types of arguments are annotated for languages other than Chinese and Japanese. In this paper, we introduce two architectures for jointly resolving AZPs and non-AZPs, and evaluate them on Arabic, a language for which, as far as we know, there has been no prior work on joint resolution. Doing this also required creating a new version of the Arabic subset of the standard coreference resolution dataset used for the CoNLL-2012 shared task (Pradhan et al., 2012) in which both zeros and non-zeros are included in a single dataset.

Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022
Sameer Pradhan | Sandra Kuebler
Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022

GRAIL: Generalized Representation and Aggregation of Information Layers
Sameer Pradhan | Mark Liberman
Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022

This paper identifies novel characteristics necessary to successfully represent multiple streams of natural language information from speech and text simultaneously, and proposes a multi-tiered system that implements these characteristics centered around a declarative configuration. The system facilitates easy incremental extension by allowing the creation of composable workflows of loosely coupled extensions, or plugins, allowing simple initial systems to be extended to accommodate rich representations while maintaining high data integrity. Key to this is leveraging established tools and technologies. We demonstrate using a small example.

Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference
Maciej Ogrodniczuk | Sameer Pradhan | Anna Nedoluzhko | Vincent Ng | Massimo Poesio
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference

PropBank Comes of Age—Larger, Smarter, and more Diverse
Sameer Pradhan | Julia Bonn | Skatje Myers | Kathryn Conger | Tim O’Gorman | James Gung | Kristin Wright-Bettner | Martha Palmer
Proceedings of the 11th Joint Conference on Lexical and Computational Semantics

This paper describes the evolution of the PropBank approach to semantic role labeling over the last two decades. During this time the PropBank frame files have been expanded to include non-verbal predicates such as adjectives, prepositions and multi-word expressions. The number of domains, genres and languages that have been PropBanked has also expanded greatly, creating an opportunity for much more challenging and robust testing of the generalization capabilities of PropBank semantic role labeling systems. We also describe the substantial effort that has gone into ensuring the consistency and reliability of the various annotated datasets and resources, to better support the training and evaluation of such systems.

2021

Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference
Maciej Ogrodniczuk | Sameer Pradhan | Massimo Poesio | Yulia Grishina | Vincent Ng
Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference

Anatomy of OntoGUM: Adapting GUM to the OntoNotes Scheme to Evaluate Robustness of SOTA Coreference Algorithms
Yilun Zhu | Sameer Pradhan | Amir Zeldes
Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference

SOTA coreference resolution produces increasingly impressive scores on the OntoNotes benchmark. However, the lack of comparable data following the same scheme for more genres makes it difficult to evaluate generalizability to open domain data. Zhu et al. (2021) introduced the creation of the OntoGUM corpus for evaluating generalizability of the latest neural LM-based end-to-end systems. This paper covers details of the mapping process, which is a set of deterministic rules applied to the rich syntactic and discourse annotations manually annotated in the GUM corpus. Out-of-domain evaluation across 12 genres shows nearly 15-20% degradation for both deterministic and deep learning systems, indicating a lack of generalizability or covert overfitting in existing coreference resolution models.

OntoGUM: Evaluating Contextualized SOTA Coreference Resolution on 12 More Genres
Yilun Zhu | Sameer Pradhan | Amir Zeldes
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

SOTA coreference resolution produces increasingly impressive scores on the OntoNotes benchmark. However, the lack of comparable data following the same scheme for more genres makes it difficult to evaluate generalizability to open domain data. This paper provides a dataset and comprehensive evaluation showing that the latest neural LM-based end-to-end systems degrade very substantially out of domain. We make an OntoNotes-like coreference dataset called OntoGUM publicly available, converted from GUM, an English corpus covering 12 genres, using deterministic rules, which we evaluate. Thanks to the rich syntactic and discourse annotations in GUM, we are able to create the largest human-annotated coreference corpus following the OntoNotes guidelines, and the first to be evaluated for consistency with the OntoNotes scheme. Out-of-domain evaluation across 12 genres shows nearly 15-20% degradation for both deterministic and deep learning systems, indicating a lack of generalizability or covert overfitting in existing coreference resolution models.

2020

Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference
Maciej Ogrodniczuk | Vincent Ng | Yulia Grishina | Sameer Pradhan
Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference

2019

Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference
Maciej Ogrodniczuk | Sameer Pradhan | Yulia Grishina | Vincent Ng
Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference

2018

Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
Agata Savary | Carlos Ramisch | Jena D. Hwang | Nathan Schneider | Melanie Andresen | Sameer Pradhan | Miriam R. L. Petruck
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

The New Propbank: Aligning Propbank with AMR through POS Unification
Tim O’Gorman | Sameer Pradhan | Martha Palmer | Julia Bonn | Katie Conger | James Gung
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

CAMR at SemEval-2016 Task 8: An Extended Transition-based AMR Parser
Chuan Wang | Sameer Pradhan | Xiaoman Pan | Heng Ji | Nianwen Xue
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

CoNLL 2016 Shared Task on Multilingual Shallow Discourse Parsing
Nianwen Xue | Hwee Tou Ng | Sameer Pradhan | Attapol Rutherford | Bonnie Webber | Chuan Wang | Hongmin Wang
Proceedings of the CoNLL-16 shared task

Proceedings of ACL-2016 System Demonstrations
Sameer Pradhan | Marianna Apidianaki
Proceedings of ACL-2016 System Demonstrations

My Science Tutor—Learning Science with a Conversational Virtual Tutor
Sameer Pradhan | Ron Cole | Wayne Ward
Proceedings of ACL-2016 System Demonstrations

2015

SemEval-2015 Task 14: Analysis of Clinical Text
Noémie Elhadad | Sameer Pradhan | Sharon Gorman | Suresh Manandhar | Wendy Chapman | Guergana Savova
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

The CoNLL-2015 Shared Task on Shallow Discourse Parsing
Nianwen Xue | Hwee Tou Ng | Sameer Pradhan | Rashmi Prasad | Christopher Bryant | Attapol Rutherford
Proceedings of the Nineteenth Conference on Computational Natural Language Learning - Shared Task

Boosting Transition-based AMR Parsing with Refined Actions and Auxiliary Analyzers
Chuan Wang | Nianwen Xue | Sameer Pradhan
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

A Transition-based Algorithm for AMR Parsing
Chuan Wang | Nianwen Xue | Sameer Pradhan
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Bridging Sentential and Discourse-level Semantics through Clausal Adjuncts
Rashmi Prasad | Bonnie Webber | Alan Lee | Sameer Pradhan | Aravind Joshi
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

2014

Temporal Annotation in the Clinical Domain
William F. Styler IV | Steven Bethard | Sean Finan | Martha Palmer | Sameer Pradhan | Piet C de Groen | Brad Erickson | Timothy Miller | Chen Lin | Guergana Savova | James Pustejovsky
Transactions of the Association for Computational Linguistics, Volume 2

This article discusses the requirements of a formal specification for the annotation of temporal information in clinical narratives. We discuss the implementation and extension of ISO-TimeML for annotating a corpus of clinical notes, known as the THYME corpus. To reflect the information task and the heavily inference-based reasoning demands in the domain, a new annotation guideline has been developed, “the THYME Guidelines to ISO-TimeML (THYME-TimeML)”. To clarify what relations merit annotation, we distinguish between linguistically-derived and inferentially-derived temporal orderings in the text. We also apply a top performing TempEval 2013 system against this new resource to measure the difficulty of adapting systems to the clinical domain. The corpus is available to the community and has been proposed for use in a SemEval 2015 task.

An Extension of BLANC to System Mentions
Xiaoqiang Luo | Sameer Pradhan | Marta Recasens | Eduard Hovy
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Scoring Coreference Partitions of Predicted Mentions: A Reference Implementation
Sameer Pradhan | Xiaoqiang Luo | Marta Recasens | Eduard Hovy | Vincent Ng | Michael Strube
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Descending-Path Convolution Kernel for Syntactic Structures
Chen Lin | Timothy Miller | Alvin Kho | Steven Bethard | Dmitriy Dligach | Sameer Pradhan | Guergana Savova
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

SemEval-2014 Task 7: Analysis of Clinical Text
Sameer Pradhan | Noémie Elhadad | Wendy Chapman | Suresh Manandhar | Guergana Savova
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2013

Discovering Temporal Narrative Containers in Clinical Text
Timothy Miller | Steven Bethard | Dmitriy Dligach | Sameer Pradhan | Chen Lin | Guergana Savova
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing

Towards Robust Linguistic Analysis using OntoNotes
Sameer Pradhan | Alessandro Moschitti | Nianwen Xue | Hwee Tou Ng | Anders Björkelund | Olga Uryupina | Yuchen Zhang | Zhi Zhong
Proceedings of the Seventeenth Conference on Computational Natural Language Learning

2012

Joint Conference on EMNLP and CoNLL - Shared Task
Sameer Pradhan | Alessandro Moschitti | Nianwen Xue
Joint Conference on EMNLP and CoNLL - Shared Task

CoNLL-2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes
Sameer Pradhan | Alessandro Moschitti | Nianwen Xue | Olga Uryupina | Yuchen Zhang
Joint Conference on EMNLP and CoNLL - Shared Task

2011

Proceedings of the 5th Linguistic Annotation Workshop
Nancy Ide | Adam Meyers | Sameer Pradhan | Katrin Tomanek
Proceedings of the 5th Linguistic Annotation Workshop

Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task
Sameer Pradhan
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

CoNLL-2011 Shared Task: Modeling Unrestricted Coreference in OntoNotes
Sameer Pradhan | Lance Ramshaw | Mitchell Marcus | Martha Palmer | Ralph Weischedel | Nianwen Xue
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

2010

The Revised Arabic PropBank
Wajdi Zaghouani | Mona Diab | Aous Mansouri | Sameer Pradhan | Martha Palmer
Proceedings of the Fourth Linguistic Annotation Workshop

SemEval-2010 Task 14: Word Sense Induction & Disambiguation
Suresh Manandhar | Ioannis Klapaftis | Dmitriy Dligach | Sameer Pradhan
Proceedings of the 5th International Workshop on Semantic Evaluation

2009

OntoNotes: The 90% Solution
Sameer S. Pradhan | Nianwen Xue
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts

2008

Towards Robust Semantic Role Labeling
Sameer S. Pradhan | Wayne Ward | James H. Martin
Computational Linguistics, Volume 34, Number 2, June 2008 - Special Issue on Semantic Role Labeling

2007

SemEval-2007 Task-17: English Lexical Sample, SRL and All Words
Sameer Pradhan | Edward Loper | Dmitriy Dligach | Martha Palmer
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

Towards Robust Semantic Role Labeling
Sameer Pradhan | Wayne Ward | James Martin
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

2005

Semantic Role Labeling Using Different Syntactic Views
Sameer Pradhan | Wayne Ward | Kadri Hacioglu | James Martin | Daniel Jurafsky
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

Semantic Role Chunking Combining Complementary Syntactic Views
Sameer Pradhan | Kadri Hacioglu | Wayne Ward | James H. Martin | Daniel Jurafsky
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)

2004

Semantic Role Labeling by Tagging Syntactic Chunks
Kadri Hacioglu | Sameer Pradhan | Wayne Ward | James H. Martin | Daniel Jurafsky
Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL-2004) at HLT-NAACL 2004

Mixing Weak Learners in Semantic Parsing
Rodney D. Nielsen | Sameer Pradhan
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

Shallow Semantic Parsing using Support Vector Machines
Sameer S. Pradhan | Wayne H. Ward | Kadri Hacioglu | James H. Martin | Dan Jurafsky
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

Parsing Arguments of Nominalizations in English and Chinese
Sameer Pradhan | Honglin Sun | Wayne Ward | James H. Martin | Daniel Jurafsky
Proceedings of HLT-NAACL 2004: Short Papers

2001

University of Colorado Dialogue Systems for Travel and Navigation
B. Pellom | W. Ward | J. Hansen | R. Cole | K. Hacioglu | J. Zhang | X. Yu | S. Pradhan
Proceedings of the First International Conference on Human Language Technology Research
