Profiling neural grammar induction on morphemically tokenised child-directed speech

Mila Marcheva; Theresa Biberauer; Weiwei Sun

doi:10.18653/v1/2025.cmcl-1.7

Abstract

We investigate the performance of state-of-the-art (SotA) neural grammar induction (GI) models on a morphemically tokenised English dataset based on the CHILDES treebank (Pearl and Sprouse, 2013). Using implementations from Yang et al. (2021a), we train models and evaluate them with the standard F1 score. We introduce novel evaluation metrics—depth-of-morpheme and sibling-of-morpheme—which measure phenomena around bound morpheme attachment. Our results reveal that models with the highest F1 scores do not necessarily induce linguistically plausible structures for bound morpheme attachment, highlighting a key challenge for cognitively plausible GI.
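The "standard F1 score" mentioned in the abstract is commonly computed in grammar induction as unlabelled bracketing F1 over constituent spans. The following is a minimal sketch under that assumption, not the authors' evaluation code: the function name `bracket_f1`, the span representation, and the toy morpheme sequence are all hypothetical, and details such as whether trivial spans are excluded or whether scores are averaged per sentence or over the corpus are not specified here.

```python
from typing import Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def bracket_f1(predicted: Set[Span], gold: Set[Span]) -> float:
    """Unlabelled bracketing F1 between two sets of constituent spans."""
    if not predicted or not gold:
        return 0.0
    overlap = len(predicted & gold)  # spans present in both trees
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical morpheme sequence (not from the paper): the, dog, -s, bark
gold = {(0, 4), (0, 3), (1, 3)}       # [[the [dog -s]] bark]
predicted = {(0, 4), (0, 2), (2, 4)}  # [[the dog] [-s bark]]
score = bracket_f1(predicted, gold)   # only the full-sentence span matches
```

In this toy case the predicted tree attaches the bound morpheme `-s` to the following verb rather than to its stem. A span-overlap score registers that only as a lower number and does not say which attachment went wrong, which is the kind of gap the abstract's morpheme-specific metrics are meant to address.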

Anthology ID: 2025.cmcl-1.7
Volume: Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Month: May
Year: 2025
Address: Albuquerque, New Mexico, USA
Editors: Tatsuki Kuribayashi, Giulia Rambelli, Ece Takmaz, Philipp Wicke, Jixing Li, Byung-Doh Oh
Venues: CMCL | WS
Publisher: Association for Computational Linguistics
Pages: 47–54
URL: https://aclanthology.org/2025.cmcl-1.7/
DOI: 10.18653/v1/2025.cmcl-1.7
Bibkey: marcheva-etal-2025-profiling
Cite (ACL): Mila Marcheva, Theresa Biberauer, and Weiwei Sun. 2025. Profiling neural grammar induction on morphemically tokenised child-directed speech. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pages 47–54, Albuquerque, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal): Profiling neural grammar induction on morphemically tokenised child-directed speech (Marcheva et al., CMCL 2025)
PDF: https://aclanthology.org/2025.cmcl-1.7.pdf

Export citation

BibTeX

@inproceedings{marcheva-etal-2025-profiling,
    title = "Profiling neural grammar induction on morphemically tokenised child-directed speech",
    author = "Marcheva, Mila  and
      Biberauer, Theresa  and
      Sun, Weiwei",
    editor = "Kuribayashi, Tatsuki  and
      Rambelli, Giulia  and
      Takmaz, Ece  and
      Wicke, Philipp  and
      Li, Jixing  and
      Oh, Byung-Doh",
    booktitle = "Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics",
    month = may,
    year = "2025",
    address = "Albuquerque, New Mexico, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.cmcl-1.7/",
    doi = "10.18653/v1/2025.cmcl-1.7",
    pages = "47--54",
    ISBN = "979-8-89176-227-5",
    abstract = "We investigate the performance of state-of-the-art (SotA) neural grammar induction (GI) models on a morphemically tokenised English dataset based on the CHILDES treebank (Pearl and Sprouse, 2013). Using implementations from Yang et al. (2021a), we train models and evaluate them with the standard F1 score. We introduce novel evaluation metrics{---}depth-of-morpheme and sibling-of-morpheme{---}which measure phenomena around bound morpheme attachment. Our results reveal that models with the highest F1 scores do not necessarily induce linguistically plausible structures for bound morpheme attachment, highlighting a key challenge for cognitively plausible GI."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="marcheva-etal-2025-profiling">
    <titleInfo>
        <title>Profiling neural grammar induction on morphemically tokenised child-directed speech</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Mila</namePart>
        <namePart type="family">Marcheva</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Theresa</namePart>
        <namePart type="family">Biberauer</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Weiwei</namePart>
        <namePart type="family">Sun</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2025-05</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Tatsuki</namePart>
            <namePart type="family">Kuribayashi</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Giulia</namePart>
            <namePart type="family">Rambelli</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Ece</namePart>
            <namePart type="family">Takmaz</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Philipp</namePart>
            <namePart type="family">Wicke</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jixing</namePart>
            <namePart type="family">Li</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Byung-Doh</namePart>
            <namePart type="family">Oh</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Albuquerque, New Mexico, USA</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">979-8-89176-227-5</identifier>
    </relatedItem>
    <abstract>We investigate the performance of state-of-the-art (SotA) neural grammar induction (GI) models on a morphemically tokenised English dataset based on the CHILDES treebank (Pearl and Sprouse, 2013). Using implementations from Yang et al. (2021a), we train models and evaluate them with the standard F1 score. We introduce novel evaluation metrics—depth-of-morpheme and sibling-of-morpheme—which measure phenomena around bound morpheme attachment. Our results reveal that models with the highest F1 scores do not necessarily induce linguistically plausible structures for bound morpheme attachment, highlighting a key challenge for cognitively plausible GI.</abstract>
    <identifier type="citekey">marcheva-etal-2025-profiling</identifier>
    <identifier type="doi">10.18653/v1/2025.cmcl-1.7</identifier>
    <location>
        <url>https://aclanthology.org/2025.cmcl-1.7/</url>
    </location>
    <part>
        <date>2025-05</date>
        <extent unit="page">
            <start>47</start>
            <end>54</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Profiling neural grammar induction on morphemically tokenised child-directed speech
%A Marcheva, Mila
%A Biberauer, Theresa
%A Sun, Weiwei
%Y Kuribayashi, Tatsuki
%Y Rambelli, Giulia
%Y Takmaz, Ece
%Y Wicke, Philipp
%Y Li, Jixing
%Y Oh, Byung-Doh
%S Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
%D 2025
%8 May
%I Association for Computational Linguistics
%C Albuquerque, New Mexico, USA
%@ 979-8-89176-227-5
%F marcheva-etal-2025-profiling
%X We investigate the performance of state-of-the-art (SotA) neural grammar induction (GI) models on a morphemically tokenised English dataset based on the CHILDES treebank (Pearl and Sprouse, 2013). Using implementations from Yang et al. (2021a), we train models and evaluate them with the standard F1 score. We introduce novel evaluation metrics—depth-of-morpheme and sibling-of-morpheme—which measure phenomena around bound morpheme attachment. Our results reveal that models with the highest F1 scores do not necessarily induce linguistically plausible structures for bound morpheme attachment, highlighting a key challenge for cognitively plausible GI.
%R 10.18653/v1/2025.cmcl-1.7
%U https://aclanthology.org/2025.cmcl-1.7/
%U https://doi.org/10.18653/v1/2025.cmcl-1.7
%P 47-54

Markdown (Informal)

[Profiling neural grammar induction on morphemically tokenised child-directed speech](https://aclanthology.org/2025.cmcl-1.7/) (Marcheva et al., CMCL 2025)
