Albert M. Lai

Also published as: Albert Lai, Albert M Lai


2024

Document-level Clinical Entity and Relation extraction via Knowledge Base-Guided Generation
Kriti Bhattarai | Inez Y. Oh | Zachary B. Abrams | Albert M. Lai
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing

Generative pre-trained transformer (GPT) models have shown promise in clinical entity and relation extraction tasks because of their precise extraction and contextual understanding capabilities. In this work, we further leverage the Unified Medical Language System (UMLS) knowledge base to accurately identify medical concepts and improve clinical entity and relation extraction at the document level. Our framework selects UMLS concepts relevant to the text and combines them with prompts to guide language models in extracting entities. Our experiments demonstrate that this initial concept mapping and the inclusion of the mapped concepts in the prompts improve extraction results compared to few-shot extraction with generic language models that do not leverage UMLS. Further, our results show that this approach is more effective than the standard Retrieval-Augmented Generation (RAG) technique, where retrieved data is compared with prompt embeddings to generate results. Overall, we find that integrating UMLS concepts with GPT models significantly improves entity and relation identification, outperforming the baseline and RAG models. By combining the precise concept mapping of knowledge-based approaches like UMLS with the contextual understanding of GPT, our method highlights the potential of these approaches in specialized domains like healthcare.
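A minimal sketch of the knowledge base-guided prompting idea described in the abstract, assuming a toy concept table in place of a real UMLS lookup; the function names, prompt wording, and concept identifiers are illustrative placeholders, not the paper's actual implementation:

```python
# Sketch: fold knowledge-base concept hints into an extraction prompt.
# The concept table, CUIs, and prompt wording are illustrative only.

from typing import Dict, List

# Toy stand-in for a UMLS concept lookup (a real system would query the
# UMLS Metathesaurus via a concept-mapping tool).
TOY_UMLS: Dict[str, str] = {
    "metformin": "CUI-A (Pharmacologic Substance)",
    "type 2 diabetes": "CUI-B (Disease or Syndrome)",
}

def map_umls_concepts(text: str) -> List[str]:
    """Return toy concepts whose surface forms appear in the text."""
    lowered = text.lower()
    return [f"{surface} -> {concept}" for surface, concept in TOY_UMLS.items()
            if surface in lowered]

def build_prompt(document: str) -> str:
    """Combine the document with its mapped concepts to guide extraction."""
    concepts = map_umls_concepts(document)
    concept_block = "\n".join(concepts) if concepts else "(none found)"
    return (
        "Extract clinical entities and their relations from the document.\n"
        f"Relevant UMLS concepts:\n{concept_block}\n\n"
        f"Document:\n{document}\n"
        "Answer as (entity, relation, entity) triples."
    )

if __name__ == "__main__":
    doc = "The patient was started on metformin for type 2 diabetes."
    print(build_prompt(doc))  # This prompt would then be sent to a GPT-style model.
```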

2018

Jointly Embedding Entities and Text with Distant Supervision
Denis Newman-Griffis | Albert M Lai | Eric Fosler-Lussier
Proceedings of the Third Workshop on Representation Learning for NLP

Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications. However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora. We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of mappings between entities and surface forms. We learn embeddings from open-domain and biomedical corpora, and compare against prior methods that rely on human-annotated text or large knowledge graph structure. Our embeddings capture entity similarity and relatedness better than prior work, both in existing biomedical datasets and a new Wikipedia-based dataset that we release to the community. Results on analogy completion and entity sense disambiguation indicate that entities and words capture complementary information that can be effectively combined for downstream use.
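A minimal sketch of the distant-supervision step described above, assuming only a surface-form-to-entity mapping; the entity identifiers and sentences are invented, and the tagged output would then be passed to a skip-gram-style trainer (not necessarily the paper's exact training objective):

```python
# Sketch: tag an unannotated corpus with entity identifiers using only a
# surface-form -> entity mapping, so that entity tokens and ordinary words
# co-occur in the same sentences for joint embedding training.
# Entity IDs and sentences are invented for illustration.

import re
from typing import Dict, List

SURFACE_TO_ENTITY: Dict[str, str] = {
    "heart attack": "ENT:myocardial_infarction",
    "aspirin": "ENT:aspirin",
}

def tag_sentence(sentence: str, mapping: Dict[str, str]) -> List[str]:
    """Replace known surface forms with entity tokens; keep other words as-is."""
    tagged = sentence.lower()
    # Replace longer surface forms first so multi-word mentions win.
    for surface in sorted(mapping, key=len, reverse=True):
        tagged = tagged.replace(surface, mapping[surface])
    return re.findall(r"\S+", tagged)

corpus = [
    "Aspirin is often prescribed after a heart attack .",
    "A heart attack damages the heart muscle .",
]

tagged_corpus = [tag_sentence(s, SURFACE_TO_ENTITY) for s in corpus]
for sent in tagged_corpus:
    print(sent)
# The tagged sentences can now be fed to any skip-gram-style trainer so that
# entity tokens and word tokens share a single embedding space.
```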

2017

Insights into Analogy Completion from the Biomedical Domain
Denis Newman-Griffis | Albert Lai | Eric Fosler-Lussier
BioNLP 2017

Analogy completion has been a popular task in recent years for evaluating the semantic properties of word embeddings, but the standard methodology makes a number of assumptions about analogies that do not always hold, either in recent benchmark datasets or when expanding into other domains. Through an analysis of analogies in the biomedical domain, we identify three assumptions: that of a Single Answer for any given analogy, that the pairs involved describe the Same Relationship, and that each pair is Informative with respect to the other. We propose modifying the standard methodology to relax these assumptions by allowing for multiple correct answers, reporting MAP and MRR in addition to accuracy, and using multiple example pairs. We further present BMASS, a novel dataset for evaluating linguistic regularities in biomedical embeddings, and demonstrate that the relationships described in the dataset pose significant semantic challenges to current word embedding methods.
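A self-contained toy sketch of the relaxed evaluation protocol described in the abstract: candidates are ranked with the standard vector-offset method and scored with accuracy@1, MRR, and MAP against a set of multiple correct answers. The embeddings and the analogy itself are invented for illustration:

```python
# Toy sketch: analogy completion with multiple correct answers, scored with
# accuracy@1, reciprocal rank, and average precision.

import numpy as np

EMB = {
    "aspirin": np.array([1.0, 0.1, 0.0]),
    "pain": np.array([0.9, 0.2, 0.1]),
    "metformin": np.array([0.1, 1.0, 0.0]),
    "diabetes": np.array([0.2, 0.9, 0.1]),
    "hyperglycemia": np.array([0.15, 0.85, 0.2]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_candidates(a, b, c):
    """Rank all vocabulary words except the query words by b - a + c."""
    target = EMB[b] - EMB[a] + EMB[c]
    candidates = [w for w in EMB if w not in (a, b, c)]
    return sorted(candidates, key=lambda w: cosine(EMB[w], target), reverse=True)

def score(ranking, gold):
    """Accuracy@1, reciprocal rank, and average precision for one analogy."""
    acc = 1.0 if ranking[0] in gold else 0.0
    rr = next((1.0 / (i + 1) for i, w in enumerate(ranking) if w in gold), 0.0)
    hits, precisions = 0, []
    for i, w in enumerate(ranking):
        if w in gold:
            hits += 1
            precisions.append(hits / (i + 1))
    ap = sum(precisions) / len(gold) if gold else 0.0
    return acc, rr, ap

# aspirin : pain :: metformin : {diabetes, hyperglycemia}  (multiple answers)
ranking = rank_candidates("aspirin", "pain", "metformin")
print(ranking, score(ranking, {"diabetes", "hyperglycemia"}))
```

Averaging the per-analogy reciprocal ranks and average precisions over a dataset gives MRR and MAP, which reward partially correct rankings that strict single-answer accuracy would count as failures.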

2016

Identification, characterization, and grounding of gradable terms in clinical text
Chaitanya Shivade | Marie-Catherine de Marneffe | Eric Fosler-Lussier | Albert M. Lai
Proceedings of the 15th Workshop on Biomedical Natural Language Processing

2015

Corpus-based discovery of semantic intensity scales
Chaitanya Shivade | Marie-Catherine de Marneffe | Eric Fosler-Lussier | Albert M. Lai
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Extending NegEx with Kernel Methods for Negation Detection in Clinical Text
Chaitanya Shivade | Marie-Catherine de Marneffe | Eric Fosler-Lussier | Albert M. Lai
Proceedings of the Second Workshop on Extra-Propositional Aspects of Meaning in Computational Semantics (ExProM 2015)

2014

Cross-narrative Temporal Ordering of Medical Events
Preethi Raghavan | Eric Fosler-Lussier | Noémie Elhadad | Albert M. Lai
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Exploring Semi-Supervised Coreference Resolution of Medical Concepts using Semantic and Temporal Features
Preethi Raghavan | Eric Fosler-Lussier | Albert Lai
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Learning to Temporally Order Medical Events in Clinical Text
Preethi Raghavan | Albert Lai | Eric Fosler-Lussier
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Temporal Classification of Medical Events
Preethi Raghavan | Eric Fosler-Lussier | Albert Lai
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing