How Lexical is Bilingual Lexicon Induction?

Harsh Kohli; Helian Feng; Nicholas Dronen; Calvin McCarter; Sina Moeini; Ali Kebarighotbi

doi:10.18653/v1/2024.findings-naacl.273

Abstract

In contemporary machine learning approaches to bilingual lexicon induction (BLI), a model learns a mapping between the embedding spaces of a language pair. Recently, the retrieve-and-rank approach to BLI has achieved state-of-the-art results on the task. However, the problem remains challenging in low-resource settings due to the paucity of data. The task is further complicated by factors such as lexical variation across languages. We argue that incorporating additional lexical information into the recent retrieve-and-rank approach should improve lexicon induction. We demonstrate the efficacy of our proposed approach on XLING, improving over the previous state of the art by an average of 2% across all language pairs.
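The retrieve-and-rank idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say which lexical information is used, so the surface-form similarity from `difflib.SequenceMatcher`, the mixing weight `alpha`, the toy two-dimensional vectors, and the helper names `retrieve` and `rerank` are all assumptions made for illustration. The sketch also assumes the source and target embeddings already live in a shared space.

```python
import math
from difflib import SequenceMatcher


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(src_vec, tgt_vecs, k):
    """Stage 1 (retrieve): top-k target words by embedding similarity."""
    scored = [(word, cosine(src_vec, vec)) for word, vec in tgt_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def rerank(src_word, candidates, alpha=0.5):
    """Stage 2 (rank): mix embedding similarity with a lexical signal.

    The lexical signal here is a character-level similarity ratio,
    a hypothetical stand-in for "additional lexical information".
    """
    def score(pair):
        word, sim = pair
        lexical = SequenceMatcher(None, src_word, word).ratio()
        return (1 - alpha) * sim + alpha * lexical

    return [word for word, _ in sorted(candidates, key=score, reverse=True)]


# Toy English -> Italian example with made-up vectors (assumed data).
tgt_vecs = {
    "nazione": [0.90, 0.10],
    "paese": [0.95, 0.05],
    "cane": [0.10, 0.90],
}
src_word, src_vec = "nation", [1.0, 0.0]

candidates = retrieve(src_vec, tgt_vecs, k=2)
ranked = rerank(src_word, candidates)
print(candidates[0][0], ranked[0])
```

In this toy setup, embedding similarity alone places "paese" first, while the reranker promotes "nazione" because its spelling is closer to "nation". Whether such a signal helps in practice depends on the language pair, since surface similarity is only informative when the two languages share cognates or a script.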

Anthology ID: 2024.findings-naacl.273
Volume: Findings of the Association for Computational Linguistics: NAACL 2024
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4381–4386
URL: https://aclanthology.org/2024.findings-naacl.273/
DOI: 10.18653/v1/2024.findings-naacl.273
Bibkey: kohli-etal-2024-lexical
Cite (ACL): Harsh Kohli, Helian Feng, Nicholas Dronen, Calvin McCarter, Sina Moeini, and Ali Kebarighotbi. 2024. How Lexical is Bilingual Lexicon Induction? In Findings of the Association for Computational Linguistics: NAACL 2024, pages 4381–4386, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): How Lexical is Bilingual Lexicon Induction? (Kohli et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-naacl.273.pdf
Video: https://aclanthology.org/2024.findings-naacl.273.mp4

Export citation

BibTeX

@inproceedings{kohli-etal-2024-lexical,
    title = "How Lexical is Bilingual Lexicon Induction?",
    author = "Kohli, Harsh  and
      Feng, Helian  and
      Dronen, Nicholas  and
      McCarter, Calvin  and
      Moeini, Sina  and
      Kebarighotbi, Ali",
    editor = "Duh, Kevin  and
      Gomez, Helena  and
      Bethard, Steven",
    booktitle = "Findings of the Association for Computational Linguistics: NAACL 2024",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-naacl.273/",
    doi = "10.18653/v1/2024.findings-naacl.273",
    pages = "4381--4386",
    abstract = "In contemporary machine learning approaches to bilingual lexicon induction (BLI), a model learns a mapping between the embedding spaces of a language pair. Recently, the retrieve-and-rank approach to BLI has achieved state-of-the-art results on the task. However, the problem remains challenging in low-resource settings due to the paucity of data. The task is further complicated by factors such as lexical variation across languages. We argue that incorporating additional lexical information into the recent retrieve-and-rank approach should improve lexicon induction. We demonstrate the efficacy of our proposed approach on XLING, improving over the previous state of the art by an average of 2{\%} across all language pairs."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="kohli-etal-2024-lexical">
    <titleInfo>
        <title>How Lexical is Bilingual Lexicon Induction?</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Harsh</namePart>
        <namePart type="family">Kohli</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Helian</namePart>
        <namePart type="family">Feng</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Nicholas</namePart>
        <namePart type="family">Dronen</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Calvin</namePart>
        <namePart type="family">McCarter</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Sina</namePart>
        <namePart type="family">Moeini</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Ali</namePart>
        <namePart type="family">Kebarighotbi</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2024-06</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Findings of the Association for Computational Linguistics: NAACL 2024</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Kevin</namePart>
            <namePart type="family">Duh</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Helena</namePart>
            <namePart type="family">Gomez</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Steven</namePart>
            <namePart type="family">Bethard</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Mexico City, Mexico</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>In contemporary machine learning approaches to bilingual lexicon induction (BLI), a model learns a mapping between the embedding spaces of a language pair. Recently, the retrieve-and-rank approach to BLI has achieved state-of-the-art results on the task. However, the problem remains challenging in low-resource settings due to the paucity of data. The task is further complicated by factors such as lexical variation across languages. We argue that incorporating additional lexical information into the recent retrieve-and-rank approach should improve lexicon induction. We demonstrate the efficacy of our proposed approach on XLING, improving over the previous state of the art by an average of 2% across all language pairs.</abstract>
    <identifier type="citekey">kohli-etal-2024-lexical</identifier>
    <identifier type="doi">10.18653/v1/2024.findings-naacl.273</identifier>
    <location>
        <url>https://aclanthology.org/2024.findings-naacl.273/</url>
    </location>
    <part>
        <date>2024-06</date>
        <extent unit="page">
            <start>4381</start>
            <end>4386</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T How Lexical is Bilingual Lexicon Induction?
%A Kohli, Harsh
%A Feng, Helian
%A Dronen, Nicholas
%A McCarter, Calvin
%A Moeini, Sina
%A Kebarighotbi, Ali
%Y Duh, Kevin
%Y Gomez, Helena
%Y Bethard, Steven
%S Findings of the Association for Computational Linguistics: NAACL 2024
%D 2024
%8 June
%I Association for Computational Linguistics
%C Mexico City, Mexico
%F kohli-etal-2024-lexical
%X In contemporary machine learning approaches to bilingual lexicon induction (BLI), a model learns a mapping between the embedding spaces of a language pair. Recently, the retrieve-and-rank approach to BLI has achieved state-of-the-art results on the task. However, the problem remains challenging in low-resource settings due to the paucity of data. The task is further complicated by factors such as lexical variation across languages. We argue that incorporating additional lexical information into the recent retrieve-and-rank approach should improve lexicon induction. We demonstrate the efficacy of our proposed approach on XLING, improving over the previous state of the art by an average of 2% across all language pairs.
%R 10.18653/v1/2024.findings-naacl.273
%U https://aclanthology.org/2024.findings-naacl.273/
%U https://doi.org/10.18653/v1/2024.findings-naacl.273
%P 4381-4386

Markdown (Informal)

[How Lexical is Bilingual Lexicon Induction?](https://aclanthology.org/2024.findings-naacl.273/) (Kohli et al., Findings 2024)