Bogdan Sacaleanu

2023

pdf bib abs
CHARD: Clinical Health-Aware Reasoning Across Dimensions for Text Generation Models
Steven Y. Feng | Vivek Khetan | Bogdan Sacaleanu | Anatole Gershman | Eduard Hovy
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

We motivate and introduce CHARD: Clinical Health-Aware Reasoning across Dimensions, to investigate the capability of text generation models to act as implicit clinical knowledge bases and generate free-flow textual explanations about various health-related conditions across several dimensions. We collect and present an associated dataset, CHARDat, consisting of explanations about 52 health conditions across three clinical dimensions. We conduct extensive experiments using BART and T5 along with data augmentation, and perform automatic, human, and qualitative analyses. We show that while our models can perform decently, CHARD is very challenging with strong potential for further exploration.

pdf bib
Template Filling for Controllable Commonsense Reasoning
Dheeraj Rajagopal | Vivek Khetan | Bogdan Sacaleanu | Anatole Gershman | Andrew E. Fano Fano | Eduard Hovy
Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023 (Findings)

2022

pdf bib abs
MIMICause: Representation and automatic extraction of causal relation types from clinical notes
Vivek Khetan | Md Imbesat Rizvi | Jessica Huber | Paige Bartusiak | Bogdan Sacaleanu | Andrew Fano
Findings of the Association for Computational Linguistics: ACL 2022

Understanding causal narratives communicated in clinical notes can help make strides towards personalized healthcare. Extracted causal information from clinical notes can be combined with structured EHR data such as patients’ demographics, diagnoses, and medications. This will enhance healthcare providers’ ability to identify aspects of a patient’s story communicated in the clinical notes and help make more informed decisions. In this work, we propose annotation guidelines, develop an annotated corpus and provide baseline scores to identify types and direction of causal relations between a pair of biomedical concepts in clinical notes; communicated implicitly or explicitly, identified either in a single sentence or across multiple sentences. We annotate a total of 2714 de-identified examples sampled from the 2018 n2c2 shared task dataset and train four different language model based architectures. Annotation based on our guidelines achieved a high inter-annotator agreement i.e. Fleiss’ kappa (𝜅) score of 0.72, and our model for identification of causal relations achieved a macro F1 score of 0.56 on the test data. The high inter-annotator agreement for clinical text shows the quality of our annotation guidelines while the provided baseline F1 score sets the direction for future research towards understanding narratives in clinical texts.

2012

pdf bib abs
An Adaptive Framework for Named Entity Combination
Bogdan Sacaleanu | Günter Neumann
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We have developed a new OSGi-based platform for Named Entity Recognition (NER) which uses a voting strategy to combine the results produced by several existing NER systems (currently OpenNLP, LingPipe and Stanford). The different NER systems have been systematically decomposed and modularized into the same pipeline of preprocessing components in order to support a flexible selection and ordering of the NER processing flow. This high modular and component-based design supports the possibility to setup different constellations of chained processing steps including alternative voting strategies for combining the results of parallel running components.

2010

pdf bib abs
Speech Grammars for Textual Entailment Patterns in Multimodal Question Answering
Daniel Sonntag | Bogdan Sacaleanu
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Over the last several years, speech-based question answering (QA) has become very popular in contrast to pure search engine based approaches on a desktop. Open-domain QA systems are now much more powerful and precise, and they can be used in speech applications. Speech-based question answering systems often rely on predefined grammars for speech understanding. In order to improve the coverage of such complex AI systems, we reused speech patterns used to generate textual entailment patterns. These can make multimodal question understanding more robust. We exemplify this in the context of a domain-specific dialogue scenario. As a result, written text input components (e.g., in a textual input field) can deal with more flexible input according to the derived textual entailment patterns. A multimodal QA dialogue spanning over several domains of interest, i.e., personal address book entries, questions about the music domain and politicians and other celebrities, demonstrates how the textual input mode can be used in a multimodal dialogue shell.

2008

2006

This paper presents an overview of the Multilingual Question Answering evaluation campaigns which have been organized at CLEF (Cross Language Evaluation Forum) since 2003. Over the years, the competition has registered a steady increment in the number of participants and languages involved. In fact, from the original eight groups which participated in 2003 QA track, the number of competitors in 2005 rose to twenty-four. Also, the performances of the systems have steadily improved, and the average of the best performances in the 2005 saw an increase of 10% with respect to the previous year.

pdf bib
Cross-Cutting Aspects of Cross-Language Question Answering Systems
Bogdan Sacaleanu | Günter Neumann
Proceedings of the Workshop on Multilingual Question Answering - MLQA ‘06

2004