2025
ELAINE-medLLM: Lightweight English Japanese Chinese Trilingual Large Language Model for Bio-medical Domain
Ken Yano | Zheheng Luo | Jimin Huang | Qianqian Xie | Masaki Asada | Chenhan Yuan | Kailai Yang | Makoto Miwa | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the 31st International Conference on Computational Linguistics
We propose ELAINE (EngLish-jApanese-chINesE)-medLLM, a trilingual (English, Japanese, Chinese) large language model adapted to the biomedical domain, based on Llama-3-8B. The training dataset was carefully curated in terms of volume and diversity to adapt the model to the biomedical domain and endow it with trilingual capability while preserving the knowledge and abilities of the base model. Training follows a two-stage path: continued pre-training followed by supervised fine-tuning (SFT). Our results demonstrate that ELAINE-medLLM exhibits superior trilingual capabilities compared to existing bilingual or multilingual medical LLMs without severely sacrificing the base model’s capabilities.
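The two-stage path can be illustrated with a minimal sketch using Hugging Face Transformers: both stages optimize the same causal-LM objective, and only the data changes from raw trilingual biomedical text to instruction-response pairs. The corpora, instruction format, and hyperparameters below are illustrative placeholders, not the authors' curated data or configuration.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "meta-llama/Meta-Llama-3-8B"   # base model named in the abstract
tok = AutoTokenizer.from_pretrained(BASE)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE)

def tokenize(example):
    return tok(example["text"], truncation=True, max_length=1024)

collator = DataCollatorForLanguageModeling(tok, mlm=False)  # causal-LM loss

# Stage 1: continued pre-training on raw trilingual biomedical text
# (toy three-sentence corpus standing in for the curated dataset).
cpt_data = Dataset.from_dict({"text": [
    "Aspirin irreversibly inhibits cyclooxygenase-1.",   # English
    "アスピリンはシクロオキシゲナーゼを阻害する。",        # Japanese
    "阿司匹林抑制环氧化酶。",                              # Chinese
]}).map(tokenize, remove_columns=["text"])
Trainer(model=model, args=TrainingArguments("stage1_cpt", num_train_epochs=1),
        train_dataset=cpt_data, data_collator=collator).train()

# Stage 2: supervised fine-tuning (SFT) on instruction-response pairs
# (the prompt template here is an assumed format).
sft_data = Dataset.from_dict({"text": [
    "### Question: Which enzyme does aspirin inhibit?\n"
    "### Answer: Cyclooxygenase.",
]}).map(tokenize, remove_columns=["text"])
Trainer(model=model, args=TrainingArguments("stage2_sft", num_train_epochs=2),
        train_dataset=sft_data, data_collator=collator).train()
```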
Improving Relation Extraction by Sequence-to-sequence-based Dependency Parsing Pre-training
Masaki Asada | Makoto Miwa
Proceedings of the 31st International Conference on Computational Linguistics
Relation extraction is a crucial natural language processing task that extracts relational triplets from raw text. Syntactic dependency information has proven effective for relation extraction. However, most existing studies use dependency information only for traditional encoder-only relation extraction, not for generative sequence-to-sequence (seq2seq) relation extraction. In this study, we propose a syntax-aware seq2seq pre-trained model for seq2seq-based relation extraction. The model incorporates dependency information into a seq2seq pre-trained language model through continual pre-training on a seq2seq-based dependency parsing task. Experimental results on two widely used relation extraction benchmark datasets show that dependency parsing pre-training improves relation extraction performance.
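The core idea, continual pre-training on dependency parsing cast as a seq2seq task, can be sketched in a few lines. The arc linearization format below (one "token head-index relation" triple per word) is an assumption for illustration, not necessarily the paper's exact scheme, and BART stands in for a generic seq2seq pre-trained backbone.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

sentence = "Aspirin inhibits cyclooxygenase"
# Linearized dependency tree: one (token, head index, relation) triple
# per word, emitted left to right; head index 0 marks the root.
target = "Aspirin 2 nsubj | inhibits 0 root | cyclooxygenase 2 obj"

batch = tok(sentence, return_tensors="pt")
labels = tok(text_target=target, return_tensors="pt").input_ids
loss = model(**batch, labels=labels).loss  # standard seq2seq cross-entropy
loss.backward()                            # one continual pre-training update
```

After this pre-training phase, the same model would be fine-tuned to generate relational triplets instead of parse trees.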
Addressing the Training-Inference Discrepancy in Discrete Diffusion for Text Generation
Masaki Asada | Makoto Miwa
Proceedings of the 31st International Conference on Computational Linguistics
This study addresses the discrepancy between training and inference in discrete diffusion models for text generation. We propose two novel strategies: (1) a training scheme that considers two-step diffusion processes, allowing the model to use its own predicted output as input for the subsequent step during training, and (2) a scheduling technique that gradually increases the probability of using self-generated text as training progresses. Experiments conducted on four widely used text generation benchmark datasets demonstrate that both proposed strategies improve the performance of discrete diffusion models in text generation.
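A toy sketch may make the two strategies concrete: the model first produces its own prediction for step t-1, and a schedule decides whether the second reverse step sees that prediction (as at inference) or the ground-truth corrupted sequence (as in standard training). The stand-in denoiser and the linear ramp below are assumptions for illustration; the abstract only requires that the self-conditioning probability grow over training.

```python
import torch
import torch.nn as nn

VOCAB, SEQ = 100, 16

# Stand-in denoiser: predicts token logits at every position in parallel.
model = nn.Sequential(nn.Embedding(VOCAB, 32), nn.Linear(32, VOCAB))

def self_cond_prob(step, total_steps):
    # Strategy (2): ramp the probability of using self-generated text
    # from 0 to 1 as training progresses (linear ramp assumed here).
    return min(1.0, step / total_steps)

def two_step_loss(x_t, x_tm1, x_0, step, total_steps):
    # Strategy (1): run two consecutive reverse steps during training.
    pred_tm1 = model(x_t).argmax(-1)          # the model's own t-1 output
    use_self = torch.rand(()) < self_cond_prob(step, total_steps)
    x_in = pred_tm1.detach() if use_self else x_tm1  # inference-like input
    logits = model(x_in)                      # second reverse step
    return nn.functional.cross_entropy(logits.view(-1, VOCAB), x_0.view(-1))

x_t = torch.randint(0, VOCAB, (2, SEQ))    # corrupted tokens at step t
x_tm1 = torch.randint(0, VOCAB, (2, SEQ))  # corrupted tokens at step t-1
x_0 = torch.randint(0, VOCAB, (2, SEQ))    # clean target tokens
loss = two_step_loss(x_t, x_tm1, x_0, step=500, total_steps=1000)
loss.backward()
```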
2023
BioNART: A Biomedical Non-AutoRegressive Transformer for Natural Language Generation
Masaki Asada | Makoto Miwa
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
We propose BioNART, a novel biomedical domain-specific Non-AutoRegressive Transformer model for natural language generation. BioNART is based on an encoder-decoder model whose encoder and decoder are both compatible with the widely used BERT architecture, which allows it to benefit from publicly available pre-trained biomedical language model checkpoints. We performed additional pre-training and fine-tuned BioNART on biomedical summarization and doctor-patient dialogue tasks. Experimental results show that BioNART achieves about 94% of the ROUGE score of the pre-trained autoregressive model while realizing 18-times faster inference on the iCliniq dataset.
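The speedup comes from decoding every output position in one parallel forward pass instead of token by token. Below is a minimal sketch of such single-pass decoding with BERT-compatible components; the wiring (randomly initialized cross-attention, fixed target length, greedy argmax) is illustrative, not BioNART's exact architecture.

```python
import torch
from transformers import BertModel, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")
# BERT-compatible decoder with cross-attention to the encoder output.
# Caveat: is_decoder=True makes BertModel's self-attention causal; a true
# non-autoregressive decoder would use bidirectional self-attention, but
# the single parallel pass shown below is the essential point.
decoder = BertModel.from_pretrained(
    "bert-base-uncased", is_decoder=True, add_cross_attention=True)
lm_head = torch.nn.Linear(decoder.config.hidden_size, tok.vocab_size)

src = tok("patient reports persistent dry cough", return_tensors="pt")
memory = encoder(**src).last_hidden_state

# Non-autoregressive decoding: feed a fixed-length block of [MASK] tokens
# and predict every output token in a single forward pass.
tgt_len = 8
mask_ids = torch.full((1, tgt_len), tok.mask_token_id)
hidden = decoder(input_ids=mask_ids,
                 encoder_hidden_states=memory).last_hidden_state
print(tok.decode(lm_head(hidden).argmax(-1)[0]))  # all positions at once
```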
2018
Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information
Masaki Asada | Makoto Miwa | Yutaka Sasaki
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
We propose a novel neural method to extract drug-drug interactions (DDIs) from texts using external drug molecular structure information. We encode textual drug pairs with convolutional neural networks and their molecular pairs with graph convolutional networks (GCNs), and then concatenate the outputs of the two networks. In our experiments, we show that GCNs can predict DDIs from the molecular structures of drugs with high accuracy and that the molecular information improves text-based DDI extraction by 2.39 percentage points in F-score on the DDIExtraction 2013 shared task dataset.
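The two-branch design can be sketched compactly: a CNN encodes the sentence containing the drug pair, a GCN encodes each drug's molecular graph, and the pooled vectors are concatenated for classification. The layer sizes, the simple GCN form, and the random inputs below are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """CNN branch over the sentence containing the drug pair."""
    def __init__(self, vocab=5000, dim=100, filters=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.conv = nn.Conv1d(dim, filters, kernel_size=3, padding=1)

    def forward(self, ids):                        # ids: (batch, seq_len)
        h = self.emb(ids).transpose(1, 2)          # (batch, dim, seq_len)
        return torch.relu(self.conv(h)).max(2).values  # max-pool over time

class GCNLayer(nn.Module):
    """One graph-convolution step over a drug's molecular graph."""
    def __init__(self, dim=64):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, adj):    # x: (atoms, dim), adj: (atoms, atoms)
        return torch.relu(self.lin(adj @ x))       # neighborhood aggregation

text_cnn, gcn = TextCNN(), GCNLayer()
classifier = nn.Linear(64 + 2 * 64, 5)  # text + two molecule vectors -> DDI classes

ids = torch.randint(0, 5000, (1, 30))             # tokenized sentence
atoms1, adj1 = torch.randn(9, 64), torch.eye(9)   # drug 1 graph (self-loops only here)
atoms2, adj2 = torch.randn(12, 64), torch.eye(12) # drug 2 graph

# Concatenate the CNN sentence vector with the mean-pooled GCN vectors.
pair = torch.cat([text_cnn(ids),
                  gcn(atoms1, adj1).mean(0, keepdim=True),
                  gcn(atoms2, adj2).mean(0, keepdim=True)], dim=1)
logits = classifier(pair)                          # (1, 5) DDI class scores
```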
2017
Extracting Drug-Drug Interactions with Attention CNNs
Masaki Asada | Makoto Miwa | Yutaka Sasaki
BioNLP 2017
We propose a novel attention mechanism for a Convolutional Neural Network (CNN)-based Drug-Drug Interaction (DDI) extraction model. CNNs have shown great potential for DDI extraction; however, attention mechanisms, which emphasize important words in the sentence containing a target entity pair, have not been investigated with CNNs, even though attention has proven effective for general-domain relation classification. We evaluated our model on Task 9.2 of the DDIExtraction 2013 shared task. Our attention mechanism improved the performance of our base CNN-based DDI model, and the model achieved an F-score of 69.12%, which is competitive with state-of-the-art models.
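A toy sketch of entity-centered word attention feeding a CNN classifier follows; the dot-product scoring against the two drug mentions is an assumed simple form, not necessarily the paper's exact mechanism.

```python
import torch
import torch.nn as nn

emb = nn.Embedding(5000, 100)
conv = nn.Conv1d(100, 64, kernel_size=3, padding=1)
out = nn.Linear(64, 5)                   # DDI classes incl. "no interaction"

ids = torch.randint(0, 5000, (1, 20))    # tokenized sentence
e1, e2 = 3, 11                           # positions of the two drug mentions

x = emb(ids)                                          # (1, 20, 100)
# Attention: each word's similarity to the two entities, averaged and
# softmax-normalized into per-word weights.
score1 = (x @ x[:, e1, :].unsqueeze(2)).squeeze(2)    # (1, 20)
score2 = (x @ x[:, e2, :].unsqueeze(2)).squeeze(2)
alpha = torch.softmax((score1 + score2) / 2, dim=1)
x = x * alpha.unsqueeze(2)               # emphasize words relevant to the pair

h = torch.relu(conv(x.transpose(1, 2))).max(2).values  # CNN + max-pooling
logits = out(h)                                         # (1, 5) class scores
```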