W. John Wilbur
Also published as: W John Wilbur
2021
Measuring the relative importance of full text sections for information retrieval from scientific literature.
Lana Yeganova | Won Gyu Kim | Donald Comeau | W John Wilbur | Zhiyong Lu
Proceedings of the 20th Workshop on Biomedical Language Processing
With the growing availability of full-text articles, integrating abstracts and full texts of documents into a unified representation is essential for comprehensive search of scientific literature. However, previous studies have shown that naïvely merging abstracts with full texts of articles does not consistently yield better performance. Balancing the contribution of query terms appearing in the abstract and in sections of different importance in full-text articles remains a challenge for both traditional bag-of-words IR approaches and neural retrieval methods. In this work we establish the connection between the BM25 score of a query term appearing in a section of a full-text document and the probability of that document being clicked or identified as relevant. The probability is computed using Pool Adjacent Violators (PAV), an isotonic regression algorithm that provides a maximum likelihood estimate based on the observed data. Using this probabilistic transformation of BM25 scores, we show improved performance on the PubMed Click dataset developed and presented in this study, as well as on the 2007 TREC Genomics collection.
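The abstract above describes mapping BM25 scores to click probabilities with Pool Adjacent Violators. As a rough illustration only, and not the authors' implementation, the following minimal sketch fits a non-decreasing probability curve to binary click labels. The function name `pav_click_probability` and the toy inputs are hypothetical, and the scores are assumed to be precomputed BM25 values.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple


def pav_click_probability(
    scores: Sequence[float], clicked: Sequence[int]
) -> List[Tuple[float, float]]:
    """Fit a non-decreasing click probability as a function of score.

    Returns a list of (upper_score, probability) steps: every score up to
    and including upper_score (and above the previous step) gets that
    probability.
    """
    # Pool observations that share the same score before running PAV.
    totals = defaultdict(lambda: [0.0, 0])
    for score, label in zip(scores, clicked):
        totals[score][0] += label
        totals[score][1] += 1

    # Each block holds [sum of labels, observation count, highest score].
    blocks: List[list] = []
    for score in sorted(totals):
        label_sum, count = totals[score]
        blocks.append([label_sum, count, score])
        # Merge backwards while the previous block's mean exceeds the
        # current block's mean (a monotonicity violation).
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, count, upper = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
            blocks[-1][2] = upper

    return [(upper, label_sum / count) for label_sum, count, upper in blocks]


# Hypothetical toy data: four scored documents and whether each was clicked.
steps = pav_click_probability([1.0, 2.0, 3.0, 4.0], [0, 1, 0, 1])
```

In this toy run the click at score 2.0 and the non-click at score 3.0 violate monotonicity, so they are pooled into a single step whose probability is the mean of their labels.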
2018
SingleCite: Towards an improved Single Citation Search in PubMed
Lana Yeganova | Donald C Comeau | Won Kim | W John Wilbur | Zhiyong Lu
Proceedings of the BioNLP 2018 workshop
A search that is targeted at finding a specific document in databases is called a Single Citation search. Single citation searches are particularly important for scholarly databases, such as PubMed, because users are frequently searching for a specific publication. In this work we describe SingleCite, a single citation matching system designed to facilitate users' search for a specific document. We report on the progress that has been achieved towards building that functionality.
MeSH-based dataset for measuring the relevance of text retrieval
Won Gyu Kim | Lana Yeganova | Donald Comeau | W John Wilbur | Zhiyong Lu
Proceedings of the BioNLP 2018 workshop
Creating simulated search environments has been of significant interest in information retrieval, in both general and biomedical search domains. Existing collections include a modest number of queries and are constructed by manually evaluating retrieval results. In this work we propose leveraging MeSH term assignments for creating synthetic test beds. We select a suitable subset of MeSH terms as queries, and utilize MeSH term assignments as pseudo-relevance rankings for retrieval evaluation. Using well-studied retrieval functions, we show that their performance on the proposed data is consistent with similar findings in previous work. We further use the proposed retrieval evaluation framework to better understand how to combine heterogeneous sources of textual information.
2016
PubTermVariants: biomedical term variants and their use for PubMed search
Lana Yeganova | Won Kim | Sun Kim | Rezarta Islamaj Doğan | Wanli Liu | Donald C Comeau | Zhiyong Lu | W John Wilbur
Proceedings of the 15th Workshop on Biomedical Natural Language Processing
2015
Summarizing Topical Contents from PubMed Documents Using a Thematic Analysis
Sun Kim | Lana Yeganova | W. John Wilbur
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
2013
Extracting Biomedical Events and Modifications Using Subgraph Matching with Noisy Training Data
Andrew MacKinlay | David Martinez | Antonio Jimeno Yepes | Haibin Liu | W. John Wilbur | Karin Verspoor
Proceedings of the BioNLP Shared Task 2013 Workshop
Generalizing an Approximate Subgraph Matching-based System to Extract Events in Molecular Biology and Cancer Genetics
Haibin Liu | Karin Verspoor | Donald C. Comeau | Andrew MacKinlay | W. John Wilbur
Proceedings of the BioNLP Shared Task 2013 Workshop
BioNLP Shared Task 2013: Supporting Resources
Pontus Stenetorp | Wiktoria Golik | Thierry Hamon | Donald C. Comeau | Rezarta Islamaj Doğan | Haibin Liu | W. John Wilbur
Proceedings of the BioNLP Shared Task 2013 Workshop
2012
Classifying Gene Sentences in Biomedical Literature by Combining High-Precision Gene Identifiers
Sun Kim | Won Kim | Don Comeau | W. John Wilbur
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
2011
Automatic extraction of data deposition statements: where do the research results go?
Aurélie Névéol | W. John Wilbur | Zhiyong Lu
Proceedings of BioNLP 2011 Workshop
Text Mining Techniques for Leveraging Positively Labeled Data
Lana Yeganova | Donald C. Comeau | Won Kim | W. John Wilbur
Proceedings of BioNLP 2011 Workshop
2009
Exploring Two Biomedical Text Genres for Disease Recognition
Aurélie Névéol | Won Kim | W. John Wilbur | Zhiyong Lu
Proceedings of the BioNLP 2009 Workshop
2007
Unsupervised Learning of the Morpho-Semantic Relationship in MEDLINE
W. John Wilbur
Biological, translational, and clinical language processing
2006
A Priority Model for Named Entities
Lorraine Tanabe | W. John Wilbur
Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology
2005
MedTag: A Collection of Biomedical Annotations
Lawrence H. Smith | Lorraine Tanabe | Thomas Rindflesch | W. John Wilbur
Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics