@article{gerz-etal-2018-language,
    title = "Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction",
    author = "Gerz, Daniela  and
      Vuli{\'c}, Ivan  and
      Ponti, Edoardo  and
      Naradowsky, Jason  and
      Reichart, Roi  and
      Korhonen, Anna",
    editor = "Lee, Lillian  and
      Johnson, Mark  and
      Toutanova, Kristina  and
      Roark, Brian",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "6",
    year = "2018",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q18-1032/",
    doi = "10.1162/tacl_a_00032",
    pages = "451--465",
    abstract = "Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models should be particularly effective for morphologically-rich languages (MRLs) that exhibit high type-to-token ratios. In this work, we present a large-scale LM study on 50 typologically diverse languages covering a wide variety of morphological systems, and offer new LM benchmarks to the community, while considering subword-level information. The main technical contribution of our work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction. We conduct experiments in the LM setting where the number of infrequent words is large, and demonstrate strong perplexity gains across our 50 languages, especially for morphologically-rich languages. Our code and data sets are publicly available."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="gerz-etal-2018-language">
    <titleInfo>
        <title>Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Daniela</namePart>
        <namePart type="family">Gerz</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Ivan</namePart>
        <namePart type="family">Vulić</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Edoardo</namePart>
        <namePart type="family">Ponti</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Jason</namePart>
        <namePart type="family">Naradowsky</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Roi</namePart>
        <namePart type="family">Reichart</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Anna</namePart>
        <namePart type="family">Korhonen</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2018</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <genre authority="bibutilsgt">journal article</genre>
    <relatedItem type="host">
        <titleInfo>
            <title>Transactions of the Association for Computational Linguistics</title>
        </titleInfo>
        <originInfo>
            <issuance>continuing</issuance>
            <publisher>MIT Press</publisher>
            <place>
                <placeTerm type="text">Cambridge, MA</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">periodical</genre>
        <genre authority="bibutilsgt">academic journal</genre>
    </relatedItem>
    <abstract>Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models should be particularly effective for morphologically-rich languages (MRLs) that exhibit high type-to-token ratios. In this work, we present a large-scale LM study on 50 typologically diverse languages covering a wide variety of morphological systems, and offer new LM benchmarks to the community, while considering subword-level information. The main technical contribution of our work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction. We conduct experiments in the LM setting where the number of infrequent words is large, and demonstrate strong perplexity gains across our 50 languages, especially for morphologically-rich languages. Our code and data sets are publicly available.</abstract>
    <identifier type="citekey">gerz-etal-2018-language</identifier>
    <identifier type="doi">10.1162/tacl_a_00032</identifier>
    <location>
        <url>https://aclanthology.org/Q18-1032/</url>
    </location>
    <part>
        <date>2018</date>
        <detail type="volume"><number>6</number></detail>
        <extent unit="page">
            <start>451</start>
            <end>465</end>
        </extent>
    </part>
</mods>
</modsCollection>
%0 Journal Article
%T Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction
%A Gerz, Daniela
%A Vulić, Ivan
%A Ponti, Edoardo
%A Naradowsky, Jason
%A Reichart, Roi
%A Korhonen, Anna
%J Transactions of the Association for Computational Linguistics
%D 2018
%V 6
%I MIT Press
%C Cambridge, MA
%F gerz-etal-2018-language
%X Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models should be particularly effective for morphologically-rich languages (MRLs) that exhibit high type-to-token ratios. In this work, we present a large-scale LM study on 50 typologically diverse languages covering a wide variety of morphological systems, and offer new LM benchmarks to the community, while considering subword-level information. The main technical contribution of our work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction. We conduct experiments in the LM setting where the number of infrequent words is large, and demonstrate strong perplexity gains across our 50 languages, especially for morphologically-rich languages. Our code and data sets are publicly available.
%R 10.1162/tacl_a_00032
%U https://aclanthology.org/Q18-1032/
%U https://doi.org/10.1162/tacl_a_00032
%P 451-465
Markdown (Informal)
[Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction](https://aclanthology.org/Q18-1032/) (Gerz et al., TACL 2018)