Hwanmun Kim
2024
Saama Technologies at BioLaySumm: Abstract based fine-tuned models with LoRA
Hwanmun Kim
|
Kamal raj Kanakarajan
|
Malaikannan Sankarasubbu
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Lay summarization of biomedical research articles is a challenging problem due to their use of technical terms and background knowledge requirements, despite the potential benefits of these research articles to the public. We worked on this problem as participating in BioLaySumm 2024. We experimented with various fine-tuning approaches to generate better lay summaries for biomedical research articles. After several experiments, we built a LoRA model with unsupervised fine-tuning based on the abstracts of the given articles, followed by a post-processing unit to take off repeated sentences. Our model was ranked 3rd overall in the BioLaySumm 2024 leaderboard. We analyzed the different approaches we experimented with and suggested several ideas to improve our model further.
Saama Technologies at SemEval-2024 Task 2: Three-module System for NLI4CT Enhanced by LLM-generated Intermediate Labels
Hwanmun Kim
|
Kamal Raj Kanakarajan
|
Malaikannan Sankarasubbu
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Participating in SemEval 2024 Task 2, we built a three-module system to predict entailment labels for NLI4CT, which consists of a sequence of the query generation module, the query answering module, and the aggregation module. We fine-tuned or prompted each module with the intermediate labels we generated with LLMs, and we optimized the combinations of different modules through experiments. Our system is ranked 19th ~ 24th in the SemEval 2024 Task 2 leaderboard in different metrics. We made several interesting observations regarding the correlation between different metrics and the sensitivity of our system on the aggregation module. We performed the error analysis on our system which can potentially help to improve our system further.