Bi-dialectal ASR of Armenian from Naturalistic and Read Speech

Malajyan Arthur; Victoria Khurshudyan; Karen Avetisyan; Hossep Dolatian; Damien Nouvel

Abstract

The paper explores the development of Automatic Speech Recognition (ASR) models for Armenian, by using data from two standard dialects (Eastern Armenian and Western Armenian). The goal is to develop a joint bi-variational model. We achieve state-of-the-art results. Results from our ASR experiments demonstrate the impact of dataset selection and data volume on model performance. The study reveals limited transferability between dialects, although integrating datasets from both dialects enhances overall performance. The paper underscores the importance of dataset diversity and volume in ASR model training for under-resourced languages like Armenian.

Anthology ID: 2024.sigul-1.27
Volume: Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024
Month: May
Year: 2024
Address: Torino, Italia
Editors: Maite Melero, Sakriani Sakti, Claudia Soria
Venues: SIGUL | WS
Publisher: ELRA and ICCL
Pages: 227–236
URL: https://aclanthology.org/2024.sigul-1.27/
Cite (ACL): Malajyan Arthur, Victoria Khurshudyan, Karen Avetisyan, Hossep Dolatian, and Damien Nouvel. 2024. Bi-dialectal ASR of Armenian from Naturalistic and Read Speech. In Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024, pages 227–236, Torino, Italia. ELRA and ICCL.
Cite (Informal): Bi-dialectal ASR of Armenian from Naturalistic and Read Speech (Arthur et al., SIGUL 2024)
PDF: https://aclanthology.org/2024.sigul-1.27.pdf

BibTeX

@inproceedings{arthur-etal-2024-multi,
    title = "Bi-dialectal {ASR} of {A}rmenian from Naturalistic and Read Speech",
    author = "Arthur, Malajyan  and
      Khurshudyan, Victoria  and
      Avetisyan, Karen  and
      Dolatian, Hossep  and
      Nouvel, Damien",
    editor = "Melero, Maite  and
      Sakti, Sakriani  and
      Soria, Claudia",
    booktitle = "Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.sigul-1.27/",
    pages = "227--236",
    abstract = "The paper explores the development of Automatic Speech Recognition (ASR) models for Armenian, by using data from two standard dialects (Eastern Armenian and Western Armenian). The goal is to develop a joint bi-variational model. We achieve state-of-the-art results. Results from our ASR experiments demonstrate the impact of dataset selection and data volume on model performance. The study reveals limited transferability between dialects, although integrating datasets from both dialects enhances overall performance. The paper underscores the importance of dataset diversity and volume in ASR model training for under-resourced languages like Armenian."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="arthur-etal-2024-multi">
    <titleInfo>
        <title>Bi-dialectal ASR of Armenian from Naturalistic and Read Speech</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Malajyan</namePart>
        <namePart type="family">Arthur</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Victoria</namePart>
        <namePart type="family">Khurshudyan</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Karen</namePart>
        <namePart type="family">Avetisyan</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Hossep</namePart>
        <namePart type="family">Dolatian</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Damien</namePart>
        <namePart type="family">Nouvel</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2024-05</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Maite</namePart>
            <namePart type="family">Melero</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sakriani</namePart>
            <namePart type="family">Sakti</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Claudia</namePart>
            <namePart type="family">Soria</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>ELRA and ICCL</publisher>
            <place>
                <placeTerm type="text">Torino, Italia</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>The paper explores the development of Automatic Speech Recognition (ASR) models for Armenian, by using data from two standard dialects (Eastern Armenian and Western Armenian). The goal is to develop a joint bi-variational model. We achieve state-of-the-art results. Results from our ASR experiments demonstrate the impact of dataset selection and data volume on model performance. The study reveals limited transferability between dialects, although integrating datasets from both dialects enhances overall performance. The paper underscores the importance of dataset diversity and volume in ASR model training for under-resourced languages like Armenian.</abstract>
    <identifier type="citekey">arthur-etal-2024-multi</identifier>
    <location>
        <url>https://aclanthology.org/2024.sigul-1.27/</url>
    </location>
    <part>
        <date>2024-05</date>
        <extent unit="page">
            <start>227</start>
            <end>236</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Bi-dialectal ASR of Armenian from Naturalistic and Read Speech
%A Arthur, Malajyan
%A Khurshudyan, Victoria
%A Avetisyan, Karen
%A Dolatian, Hossep
%A Nouvel, Damien
%Y Melero, Maite
%Y Sakti, Sakriani
%Y Soria, Claudia
%S Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024
%D 2024
%8 May
%I ELRA and ICCL
%C Torino, Italia
%F arthur-etal-2024-multi
%X The paper explores the development of Automatic Speech Recognition (ASR) models for Armenian, by using data from two standard dialects (Eastern Armenian and Western Armenian). The goal is to develop a joint bi-variational model. We achieve state-of-the-art results. Results from our ASR experiments demonstrate the impact of dataset selection and data volume on model performance. The study reveals limited transferability between dialects, although integrating datasets from both dialects enhances overall performance. The paper underscores the importance of dataset diversity and volume in ASR model training for under-resourced languages like Armenian.
%U https://aclanthology.org/2024.sigul-1.27/
%P 227-236

Markdown (Informal)

[Bi-dialectal ASR of Armenian from Naturalistic and Read Speech](https://aclanthology.org/2024.sigul-1.27/) (Arthur et al., SIGUL 2024)