Wojciech Kusa

2023

“Dr LLM, what do I have?”: The Impact of User Beliefs and Prompt Formulation on Health Diagnoses
Wojciech Kusa | Edoardo Mosca | Aldo Lipani
Proceedings of the Third Workshop on NLP for Medical Conversations

HEVS-TUW at SemEval-2023 Task 8: Ensemble of Language Models and Rule-based Classifiers for Claims Identification and PICO Extraction
Anjani Dhrangadhariya | Wojciech Kusa | Henning Müller | Allan Hanbury
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes the HEVS-TUW team submission to the SemEval-2023 Task 8: Causal Claims. We participated in two subtasks: (1) causal claims detection and (2) PIO identification. For subtask 1, we experimented with an ensemble of weakly supervised question detection and fine-tuned Transformer-based models. For subtask 2 of PIO frame extraction, we used a combination of deep representation learning and a rule-based approach. Our best model for subtask 1 ranks fourth with an F1-score of 65.77%. It shows moderate benefit from ensembling models pre-trained on independent categories. The results for subtask 2 warrant further investigation for improvement.

2022

DoSSIER at MedVidQA 2022: Text-based Approaches to Medical Video Answer Localization Problem
Wojciech Kusa | Georgios Peikos | Óscar Espitia | Allan Hanbury | Gabriella Pasi
Proceedings of the 21st Workshop on Biomedical Language Processing

This paper describes our contribution to the Answer Localization track of the MedVidQA 2022 Shared Task. We propose two answer localization approaches that use only textual information extracted from the video. In particular, our approaches exploit the text extracted from the video’s transcripts along with the text displayed in the video’s frames to create a set of features. Having created a set of features that represents a video’s textual information, we employ four different models to measure the similarity between a video’s segment and a corresponding question. Then, we employ two different methods to obtain the start and end times of the identified answer. One of them is based on a random forest regressor, whereas the other one uses an unsupervised peak detection model to detect the answer’s start time. Our findings suggest that, for this task, leveraging only text-related features (transmitted either verbally or visually) and using a small amount of training data lead to significant improvements over the benchmark Video Span Localization model that is based on deep neural networks.

We present a new gold-standard dataset and a benchmark for the Research Theme Identification task, a sub-task of the Scholarly Knowledge Graph Generation shared task, at the 3rd Workshop on Scholarly Document Processing. The objective of the shared task was to label given research papers with research themes from a total of 36 themes. The benchmark was compiled using data drawn from the largest overall assessment of university research output ever undertaken globally (the Research Excellence Framework - 2014). We provide a performance comparison of a transformer-based ensemble, which obtains multiple predictions for a research paper, given its multiple textual fields (e.g. title, abstract, reference), with traditional machine learning models. The ensemble involves enriching the initial data with additional information from open-access digital libraries and Argumentative Zoning techniques (CITATION). It uses a weighted sum aggregation for the multiple predictions to obtain a final single prediction for the given research paper. Both data and the ensemble are publicly available on https://www.kaggle.com/competitions/sdp2022-scholarly-knowledge-graph-generation/data?select=task1_test_no_label.csv and https://github.com/ProjectDoSSIER/sdp2022, respectively.

Large-scale language modeling and natural language prompting have demonstrated exciting capabilities for few- and zero-shot learning in NLP. However, translating these successes to specialized domains such as biomedicine remains challenging, due in part to biomedical NLP’s significant dataset debt – the technical costs associated with data that are not consistently documented or easily incorporated into popular machine learning frameworks at scale. To assess this debt, we crowdsourced curation of datasheets for 167 biomedical datasets. We find that only 13% of datasets are available via programmatic access and 30% lack any documentation on licensing and permitted reuse. Our dataset catalog is available at: https://tinyurl.com/bigbio22.

2017

External Evaluation of Event Extraction Classifiers for Automatic Pathway Curation: An extended study of the mTOR pathway
Wojciech Kusa | Michael Spranger
BioNLP 2017

This paper evaluates the impact of various event extraction systems on automatic pathway curation using the popular mTOR pathway. We quantify the impact of training data sets as well as different machine learning classifiers and show that some improve the quality of automatically extracted pathways.

Venues

bionlp (2)
nlpmc (1)
ws (1)
semeval (1)
sdp (1)
bigscience (1)