Philip Ogren

Also published as: Philip V. Ogren


2024

Adaptive Question Answering: Enhancing Language Model Proficiency for Addressing Knowledge Conflicts with Source Citations
Sagi Shaier | Ari Kobren | Philip V. Ogren
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Resolving knowledge conflicts is a crucial challenge in Question Answering (QA) tasks, as the internet contains numerous conflicting facts and opinions. While some research has made progress in tackling ambiguous settings where multiple valid answers exist, these approaches often neglect to provide source citations, leaving users to evaluate the factuality of each answer. On the other hand, existing work on citation generation has focused on unambiguous settings with single answers, failing to address the complexity of real-world scenarios. Despite the importance of both aspects, no prior research has combined them, leaving a significant gap in the development of QA systems. In this work, we bridge this gap by proposing the novel task of QA with source citation in ambiguous settings, where multiple valid answers exist. To facilitate research in this area, we create a comprehensive framework consisting of: (1) five novel datasets, obtained by augmenting three existing reading comprehension datasets with citation metadata across various ambiguous settings, such as distractors and paraphrasing; (2) the first ambiguous multi-hop QA dataset featuring real-world, naturally occurring contexts; (3) two new metrics to evaluate models’ performance; and (4) several strong baselines using rule-based, prompting, and finetuning approaches over five large language models. We hope that this new task, datasets, metrics, and baselines will inspire the community to push the boundaries of QA research and develop more trustworthy and interpretable systems.

2014

ClearTK 2.0: Design Patterns for Machine Learning in UIMA
Steven Bethard | Philip Ogren | Lee Becker
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework.

2010

Improving Syntactic Coordination Resolution using Language Modeling
Philip Ogren
Proceedings of the NAACL HLT 2010 Student Research Workshop

2009

High-precision biological event extraction with a concept recognizer
K. Bretonnel Cohen | Karin Verspoor | Helen Johnson | Chris Roeder | Philip Ogren | William Baumgartner | Elizabeth White | Lawrence Hunter
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

Building Test Suites for UIMA Components
Philip Ogren | Steven Bethard
Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)

2008

System Evaluation on a Named Entity Corpus from Clinical Notes
Karin Schuler | Vinod Kaggal | James Masanz | Philip Ogren | Guergana Savova
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper presents the evaluation of the dictionary look-up component of Mayo Clinic’s Information Extraction system. The component was tested on a corpus of 160 free-text clinical notes which were manually annotated with the named entity disease. This kind of clinical text presents many language challenges, such as fragmented sentences and heavy use of abbreviations and acronyms. The dictionary used for this evaluation was a subset of SNOMED-CT with semantic types corresponding to diseases/disorders, without any augmentation. The algorithm achieves an F-score of 0.56 for exact matches and F-scores of 0.76 and 0.62 for right- and left-partial matches respectively. Machine learning techniques are currently under investigation to improve this task.
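For readers unfamiliar with the F-scores reported above, the sketch below shows the standard F1 computation (harmonic mean of precision and recall). The formula is standard; the counts are purely illustrative and are not taken from the paper.

```python
def f_score(true_positives, false_positives, false_negatives):
    """F1: harmonic mean of precision and recall."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only: if the system proposed 100 spans and the gold
# standard contained 100 spans, with 56 exact matches, F1 = 0.56.
print(round(f_score(56, 44, 44), 2))  # → 0.56
```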

Constructing Evaluation Corpora for Automated Clinical Named Entity Recognition
Philip Ogren | Guergana Savova | Christopher Chute
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We report on the construction of a gold-standard dataset consisting of annotated clinical notes suitable for evaluating our biomedical named entity recognition system. The dataset is the result of consensus between four human annotators and contains 1,556 annotations on 160 clinical notes using 658 unique concept codes from SNOMED-CT corresponding to human disorders. Inter-annotator agreement was calculated on annotations from 100 of the documents for span (90.9%), concept code (81.7%), context (84.8%), and status (86.0%) agreement. Complete agreement for span, concept code, context, and status was 74.6%. We found that creating a consensus set based on annotations from two independently-created annotation sets can reduce inter-annotator disagreement by 32.3%. We found little benefit to pre-annotating the corpus with a third-party named entity recognizer.
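The span-agreement figure above is a percentage of matching annotations between annotators. As a minimal sketch of one common way to compute such a number (the function name and the span data below are hypothetical, not the paper's actual procedure), pairwise agreement can be taken as the matched annotations over the union of both annotators' sets:

```python
def pairwise_agreement(annotations_a, annotations_b):
    """Fraction of annotations two annotators agree on, measured
    against the union of both annotation sets."""
    a, b = set(annotations_a), set(annotations_b)
    return len(a & b) / len(a | b)

# Hypothetical (start, end) character spans from two annotators:
spans_a = [(0, 4), (10, 17), (25, 30)]
spans_b = [(0, 4), (10, 17), (40, 45)]
print(round(pairwise_agreement(spans_a, spans_b), 2))  # → 0.5
```

This is the same quantity as a Jaccard overlap between the two annotation sets; other studies instead report agreement relative to one annotator's set or use chance-corrected statistics such as kappa, so the exact definition matters when comparing numbers across papers.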

2006

Knowtator: A Protégé plug-in for annotated corpus construction
Philip V. Ogren
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Demonstrations

2005

Corpus Design for Biomedical Natural Language Processing
K. Bretonnel Cohen | Lynne Fox | Philip V. Ogren | Lawrence Hunter
Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics