NLP4Health: Multilingual Clinical Dialogue Summarization and QA with mT5 and LoRA

Moutushi Roy; Dipankar Das

doi:10.18653/v1/2025.nlpai4health-main.10

Abstract

In this work, we present NLP4Health, a unified and reproducible pipeline to accomplish the tasks of multilingual clinical dialogue summarization and question answering (QA). Our system fine-tunes the multilingual sequence-to-sequence model google/mt5-base along with parameter-efficient Low-Rank Adaptation (LoRA) modules to support ten Indian languages. For each clinical dialogue, the model produces (1) a free-text English summary, (2) an English structured key–value (KnV) JSON summary, and (3) QA responses in the dialogue’s original language. We conducted preprocessing, fine-tuning, and inference, and evaluated across QA, textual, and structured metrics, analyzing performance in low-resource settings. The adapter weights, tokenizer, and inference scripts are publicly released to promote transparency and reproducibility.
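
The parameter-efficient idea behind LoRA, as named in the abstract, can be illustrated with a minimal sketch. It assumes the standard LoRA formulation, in which a frozen weight matrix W is adapted as W + (alpha / r) * B A and only the low-rank factors B and A are trained. The function names, the rank, the scaling factor, and the 768 x 768 layer shape below are illustrative assumptions, not settings reported by the authors.

```python
# Minimal pure-Python sketch of the standard LoRA update (not the authors' code).
# All names, shapes, rank, and alpha values here are hypothetical illustrations.

def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Count parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
        for row in X
    ]


def lora_effective_weight(W, B, A, alpha: float, rank: int):
    """Return W + (alpha / rank) * (B @ A); W stays frozen, only B and A would be trained."""
    scale = alpha / rank
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


if __name__ == "__main__":
    # Assumed example layer: a 768 x 768 projection adapted with rank 8.
    full = 768 * 768
    adapted = lora_trainable_params(768, 768, 8)
    print(f"full fine-tuning: {full} parameters, LoRA factors: {adapted} parameters")

    # Tiny numeric example with rank 1.
    W = [[1.0, 0.0], [0.0, 1.0]]
    B = [[1.0], [2.0]]
    A = [[3.0, 4.0]]
    print(lora_effective_weight(W, B, A, alpha=2.0, rank=1))
```

In this assumed configuration, the two factors hold 12,288 trainable values against 589,824 in the full matrix, which is why adapter weights can be released separately from the base model.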

Anthology ID: 2025.nlpai4health-main.10
Volume: NLP-AI4Health
Month: December
Year: 2025
Address: Mumbai, India
Editors: Arun Zechariah, Balu Krishna S, Dipti Misra Sharma, Hannah Mary Thomas, Joy Mammen, Parameswari Krishnamurthy, Vandan Mujadia
Venues: NLP-AI4Health | WS
Publisher: Association for Computational Linguistics
Pages: 93–97
URL: https://aclanthology.org/2025.nlpai4health-main.10/
DOI: 10.18653/v1/2025.nlpai4health-main.10
Cite (ACL): Moutushi Roy and Dipankar Das. 2025. NLP4Health: Multilingual Clinical Dialogue Summarization and QA with mT5 and LoRA. In NLP-AI4Health, pages 93–97, Mumbai, India. Association for Computational Linguistics.
Cite (Informal): NLP4Health: Multilingual Clinical Dialogue Summarization and QA with mT5 and LoRA (Roy & Das, NLP-AI4Health 2025)
PDF: https://aclanthology.org/2025.nlpai4health-main.10.pdf

@inproceedings{roy-das-2025-nlp4health,
    title = "{NLP}4{H}ealth: Multilingual Clinical Dialogue Summarization and {QA} with m{T}5 and {L}o{RA}",
    author = "Roy, Moutushi  and
      Das, Dipankar",
    editor = "Zechariah, Arun  and
      Krishna S, Balu  and
      Misra Sharma, Dipti  and
      Mary Thomas, Hannah  and
      Mammen, Joy  and
      Krishnamurthy, Parameswari  and
      Mujadia, Vandan",
    booktitle = "NLP-AI4Health",
    month = dec,
    year = "2025",
    address = "Mumbai, India",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.nlpai4health-main.10/",
    doi = "10.18653/v1/2025.nlpai4health-main.10",
    pages = "93--97",
    ISBN = "979-8-89176-315-9",
    abstract = "In this work, we present NLP4Health, a unified and reproducible pipeline to accomplish the tasks of multilingual clinical dialogue summarization and question answering (QA). Our system fine-tunes the multilingual sequence-to-sequence model google/mt5-base along with parameter-efficient Low-Rank Adaptation (LoRA) modules to support ten Indian languages. For each clinical dialogue, the model produces (1) a free-text English summary, (2) an English structured key{--}value (KnV) JSON summary, and (3) QA responses in the dialogue{'}s original language. We conducted preprocessing, fine-tuning, and inference, and evaluated across QA, textual, and structured metrics, analyzing performance in low-resource settings. The adapter weights, tokenizer, and inference scripts are publicly released to promote transparency and reproducibility."
}

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="roy-das-2025-nlp4health">
    <titleInfo>
        <title>NLP4Health: Multilingual Clinical Dialogue Summarization and QA with mT5 and LoRA</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Moutushi</namePart>
        <namePart type="family">Roy</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Dipankar</namePart>
        <namePart type="family">Das</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2025-12</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>NLP-AI4Health</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Arun</namePart>
            <namePart type="family">Zechariah</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Balu</namePart>
            <namePart type="family">Krishna S</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Dipti</namePart>
            <namePart type="family">Misra Sharma</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Hannah</namePart>
            <namePart type="family">Mary Thomas</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Joy</namePart>
            <namePart type="family">Mammen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Parameswari</namePart>
            <namePart type="family">Krishnamurthy</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Vandan</namePart>
            <namePart type="family">Mujadia</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Mumbai, India</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">979-8-89176-315-9</identifier>
    </relatedItem>
    <abstract>In this work, we present NLP4Health, a unified and reproducible pipeline to accomplish the tasks of multilingual clinical dialogue summarization and question answering (QA). Our system fine-tunes the multilingual sequence-to-sequence model google/mt5-base along with parameter-efficient Low-Rank Adaptation (LoRA) modules to support ten Indian languages. For each clinical dialogue, the model produces (1) a free-text English summary, (2) an English structured key–value (KnV) JSON summary, and (3) QA responses in the dialogue’s original language. We conducted preprocessing, fine-tuning, and inference, and evaluated across QA, textual, and structured metrics, analyzing performance in low-resource settings. The adapter weights, tokenizer, and inference scripts are publicly released to promote transparency and reproducibility.</abstract>
    <identifier type="citekey">roy-das-2025-nlp4health</identifier>
    <identifier type="doi">10.18653/v1/2025.nlpai4health-main.10</identifier>
    <location>
        <url>https://aclanthology.org/2025.nlpai4health-main.10/</url>
    </location>
    <part>
        <date>2025-12</date>
        <extent unit="page">
            <start>93</start>
            <end>97</end>
        </extent>
    </part>
</mods>
</modsCollection>

%0 Conference Proceedings
%T NLP4Health: Multilingual Clinical Dialogue Summarization and QA with mT5 and LoRA
%A Roy, Moutushi
%A Das, Dipankar
%Y Zechariah, Arun
%Y Krishna S, Balu
%Y Misra Sharma, Dipti
%Y Mary Thomas, Hannah
%Y Mammen, Joy
%Y Krishnamurthy, Parameswari
%Y Mujadia, Vandan
%S NLP-AI4Health
%D 2025
%8 December
%I Association for Computational Linguistics
%C Mumbai, India
%@ 979-8-89176-315-9
%F roy-das-2025-nlp4health
%X In this work, we present NLP4Health, a unified and reproducible pipeline to accomplish the tasks of multilingual clinical dialogue summarization and question answering (QA). Our system fine-tunes the multilingual sequence-to-sequence model google/mt5-base along with parameter-efficient Low-Rank Adaptation (LoRA) modules to support ten Indian languages. For each clinical dialogue, the model produces (1) a free-text English summary, (2) an English structured key–value (KnV) JSON summary, and (3) QA responses in the dialogue’s original language. We conducted preprocessing, fine-tuning, and inference, and evaluated across QA, textual, and structured metrics, analyzing performance in low-resource settings. The adapter weights, tokenizer, and inference scripts are publicly released to promote transparency and reproducibility.
%R 10.18653/v1/2025.nlpai4health-main.10
%U https://aclanthology.org/2025.nlpai4health-main.10/
%U https://doi.org/10.18653/v1/2025.nlpai4health-main.10
%P 93-97
