ASR Under Noise: Exploring Robustness for Sundanese and Javanese

Salsabila Zahirah Pranida; Rifo Ahmad Genadi; Muhammad Cendekia Airlangga; Shady Shehata

Abstract

We investigate the robustness of Whisper-based automatic speech recognition (ASR) models for two major Indonesian regional languages: Javanese and Sundanese. While recent work has demonstrated strong ASR performance under clean conditions, their effectiveness in noisy environments remains unclear. To address this, we experiment with multiple training strategies, including synthetic noise augmentation and SpecAugment, and evaluate performance across a range of signal-to-noise ratios (SNRs). Our results show that noise-aware training substantially improves robustness, particularly for larger Whisper models. A detailed error analysis further reveals language-specific challenges, highlighting avenues for future improvements.
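The abstract refers to synthetic noise augmentation evaluated across a range of SNRs. As an illustration only, the sketch below shows one common way to mix a noise signal into speech at a chosen SNR: scale the noise so that the ratio of speech power to noise power matches the target. This is not taken from the paper. The function names `mix_at_snr` and `measured_snr_db`, the sine-wave stand-in for speech, the Gaussian noise, and the SNR values are all assumptions made for this example.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` to reach the requested SNR (in dB) and add it to `speech`."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / P_noise), so the noise power we need is
    # P_speech / 10^(SNR_dB / 10); the amplitude gain is the square root of
    # the ratio between that target and the current noise power.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """Recover the SNR of `noisy` relative to `clean`, as a sanity check."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical inputs: one second of a 220 Hz tone at 16 kHz and white noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for snr_db in (20, 10, 0, -5):
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

In a real pipeline the two lists would be replaced by audio arrays loaded from recordings, and the noise clip would typically be cropped or looped to match the utterance length before mixing.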

Anthology ID: 2025.winlp-main.16
Volume: Proceedings of the 9th Widening NLP Workshop
Month: November
Year: 2025
Address: Suzhou, China
Editors: Chen Zhang, Emily Allaway, Hua Shen, Lesly Miculicich, Yinqiao Li, Meryem M'hamdi, Peerat Limkonchotiwat, Richard He Bai, Santosh T.y.s.s., Sophia Simeng Han, Surendrabikram Thapa, Wiem Ben Rim
Venues: WiNLP | WS
Publisher: Association for Computational Linguistics
Pages: 87–99
URL: https://aclanthology.org/2025.winlp-main.16/
Cite (ACL): Salsabila Zahirah Pranida, Rifo Ahmad Genadi, Muhammad Cendekia Airlangga, and Shady Shehata. 2025. ASR Under Noise: Exploring Robustness for Sundanese and Javanese. In Proceedings of the 9th Widening NLP Workshop, pages 87–99, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): ASR Under Noise: Exploring Robustness for Sundanese and Javanese (Pranida et al., WiNLP 2025)
PDF: https://aclanthology.org/2025.winlp-main.16.pdf

BibTeX

@inproceedings{pranida-etal-2025-asr,
    title = "{ASR} Under Noise: Exploring Robustness for {S}undanese and {J}avanese",
    author = "Pranida, Salsabila Zahirah  and
      Genadi, Rifo Ahmad  and
      Airlangga, Muhammad Cendekia  and
      Shehata, Shady",
    editor = "Zhang, Chen  and
      Allaway, Emily  and
      Shen, Hua  and
      Miculicich, Lesly  and
      Li, Yinqiao  and
      M'hamdi, Meryem  and
      Limkonchotiwat, Peerat  and
      Bai, Richard He  and
      T.y.s.s., Santosh  and
      Han, Sophia Simeng  and
      Thapa, Surendrabikram  and
      Rim, Wiem Ben",
    booktitle = "Proceedings of the 9th Widening NLP Workshop",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.winlp-main.16/",
    pages = "87--99",
    ISBN = "979-8-89176-351-7",
    abstract = "We investigate the robustness of Whisper-based automatic speech recognition (ASR) models for two major Indonesian regional languages: Javanese and Sundanese. While recent work has demonstrated strong ASR performance under clean conditions, their effectiveness in noisy environments remains unclear. To address this, we experiment with multiple training strategies, including synthetic noise augmentation and SpecAugment, and evaluate performance across a range of signal-to-noise ratios (SNRs). Our results show that noise-aware training substantially improves robustness, particularly for larger Whisper models. A detailed error analysis further reveals language-specific challenges, highlighting avenues for future improvements."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="pranida-etal-2025-asr">
    <titleInfo>
        <title>ASR Under Noise: Exploring Robustness for Sundanese and Javanese</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Salsabila</namePart>
        <namePart type="given">Zahirah</namePart>
        <namePart type="family">Pranida</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Rifo</namePart>
        <namePart type="given">Ahmad</namePart>
        <namePart type="family">Genadi</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Muhammad</namePart>
        <namePart type="given">Cendekia</namePart>
        <namePart type="family">Airlangga</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Shady</namePart>
        <namePart type="family">Shehata</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2025-11</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 9th Widening NLP Workshop</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Chen</namePart>
            <namePart type="family">Zhang</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Emily</namePart>
            <namePart type="family">Allaway</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Hua</namePart>
            <namePart type="family">Shen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Lesly</namePart>
            <namePart type="family">Miculicich</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Yinqiao</namePart>
            <namePart type="family">Li</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Meryem</namePart>
            <namePart type="family">M’hamdi</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Peerat</namePart>
            <namePart type="family">Limkonchotiwat</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Richard</namePart>
            <namePart type="given">He</namePart>
            <namePart type="family">Bai</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Santosh</namePart>
            <namePart type="family">T.y.s.s.</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sophia</namePart>
            <namePart type="given">Simeng</namePart>
            <namePart type="family">Han</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Surendrabikram</namePart>
            <namePart type="family">Thapa</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Wiem</namePart>
            <namePart type="given">Ben</namePart>
            <namePart type="family">Rim</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Suzhou, China</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">979-8-89176-351-7</identifier>
    </relatedItem>
    <abstract>We investigate the robustness of Whisper-based automatic speech recognition (ASR) models for two major Indonesian regional languages: Javanese and Sundanese. While recent work has demonstrated strong ASR performance under clean conditions, their effectiveness in noisy environments remains unclear. To address this, we experiment with multiple training strategies, including synthetic noise augmentation and SpecAugment, and evaluate performance across a range of signal-to-noise ratios (SNRs). Our results show that noise-aware training substantially improves robustness, particularly for larger Whisper models. A detailed error analysis further reveals language-specific challenges, highlighting avenues for future improvements.</abstract>
    <identifier type="citekey">pranida-etal-2025-asr</identifier>
    <location>
        <url>https://aclanthology.org/2025.winlp-main.16/</url>
    </location>
    <part>
        <date>2025-11</date>
        <extent unit="page">
            <start>87</start>
            <end>99</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T ASR Under Noise: Exploring Robustness for Sundanese and Javanese
%A Pranida, Salsabila Zahirah
%A Genadi, Rifo Ahmad
%A Airlangga, Muhammad Cendekia
%A Shehata, Shady
%Y Zhang, Chen
%Y Allaway, Emily
%Y Shen, Hua
%Y Miculicich, Lesly
%Y Li, Yinqiao
%Y M’hamdi, Meryem
%Y Limkonchotiwat, Peerat
%Y Bai, Richard He
%Y T.y.s.s., Santosh
%Y Han, Sophia Simeng
%Y Thapa, Surendrabikram
%Y Rim, Wiem Ben
%S Proceedings of the 9th Widening NLP Workshop
%D 2025
%8 November
%I Association for Computational Linguistics
%C Suzhou, China
%@ 979-8-89176-351-7
%F pranida-etal-2025-asr
%X We investigate the robustness of Whisper-based automatic speech recognition (ASR) models for two major Indonesian regional languages: Javanese and Sundanese. While recent work has demonstrated strong ASR performance under clean conditions, their effectiveness in noisy environments remains unclear. To address this, we experiment with multiple training strategies, including synthetic noise augmentation and SpecAugment, and evaluate performance across a range of signal-to-noise ratios (SNRs). Our results show that noise-aware training substantially improves robustness, particularly for larger Whisper models. A detailed error analysis further reveals language-specific challenges, highlighting avenues for future improvements.
%U https://aclanthology.org/2025.winlp-main.16/
%P 87-99

Markdown (Informal)

[ASR Under Noise: Exploring Robustness for Sundanese and Javanese](https://aclanthology.org/2025.winlp-main.16/) (Pranida et al., WiNLP 2025)