CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way

Greta Smolenska; Peter Kolb; Sinan Tang; Mironas Bitinis; Héctor Hernández; Elin Asklöv

doi:10.18653/v1/2021.semeval-1.81

Abstract

This paper presents the system we submitted to the first Lexical Complexity Prediction (LCP) Shared Task 2021. The Shared Task provides participants with a new English dataset that includes context of the target word. We participate in the single-word complexity prediction sub-task and focus on feature engineering. Our best system is trained on linguistic features and word embeddings (Pearson’s score of 0.7942). We demonstrate, however, that a simpler feature set achieves comparable results and submit a model trained on 36 linguistic features (Pearson’s score of 0.7925).

Anthology ID: 2021.semeval-1.81
Volume: Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month: August
Year: 2021
Address: Online
Editors: Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 632–639
URL: https://aclanthology.org/2021.semeval-1.81/
DOI: 10.18653/v1/2021.semeval-1.81
Bibkey: smolenska-etal-2021-clulex
Cite (ACL): Greta Smolenska, Peter Kolb, Sinan Tang, Mironas Bitinis, Héctor Hernández, and Elin Asklöv. 2021. CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 632–639, Online. Association for Computational Linguistics.
Cite (Informal): CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way (Smolenska et al., SemEval 2021)
PDF: https://aclanthology.org/2021.semeval-1.81.pdf

BibTeX

@inproceedings{smolenska-etal-2021-clulex,
    title = "{CLULEX} at {S}em{E}val-2021 Task 1: A Simple System Goes a Long Way",
    author = {Smolenska, Greta  and
      Kolb, Peter  and
      Tang, Sinan  and
      Bitinis, Mironas  and
      Hern{\'a}ndez, H{\'e}ctor  and
      Askl{\"o}v, Elin},
    editor = "Palmer, Alexis  and
      Schneider, Nathan  and
      Schluter, Natalie  and
      Emerson, Guy  and
      Herbelot, Aurelie  and
      Zhu, Xiaodan",
    booktitle = "Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.semeval-1.81/",
    doi = "10.18653/v1/2021.semeval-1.81",
    pages = "632--639",
    abstract = "This paper presents the system we submitted to the first Lexical Complexity Prediction (LCP) Shared Task 2021. The Shared Task provides participants with a new English dataset that includes context of the target word. We participate in the single-word complexity prediction sub-task and focus on feature engineering. Our best system is trained on linguistic features and word embeddings (Pearson{'}s score of 0.7942). We demonstrate, however, that a simpler feature set achieves comparable results and submit a model trained on 36 linguistic features (Pearson{'}s score of 0.7925)."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="smolenska-etal-2021-clulex">
    <titleInfo>
        <title>CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Greta</namePart>
        <namePart type="family">Smolenska</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Peter</namePart>
        <namePart type="family">Kolb</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Sinan</namePart>
        <namePart type="family">Tang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Mironas</namePart>
        <namePart type="family">Bitinis</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Héctor</namePart>
        <namePart type="family">Hernández</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Elin</namePart>
        <namePart type="family">Asklöv</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2021-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Alexis</namePart>
            <namePart type="family">Palmer</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Nathan</namePart>
            <namePart type="family">Schneider</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Natalie</namePart>
            <namePart type="family">Schluter</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Guy</namePart>
            <namePart type="family">Emerson</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Aurelie</namePart>
            <namePart type="family">Herbelot</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Xiaodan</namePart>
            <namePart type="family">Zhu</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Online</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>This paper presents the system we submitted to the first Lexical Complexity Prediction (LCP) Shared Task 2021. The Shared Task provides participants with a new English dataset that includes context of the target word. We participate in the single-word complexity prediction sub-task and focus on feature engineering. Our best system is trained on linguistic features and word embeddings (Pearson’s score of 0.7942). We demonstrate, however, that a simpler feature set achieves comparable results and submit a model trained on 36 linguistic features (Pearson’s score of 0.7925).</abstract>
    <identifier type="citekey">smolenska-etal-2021-clulex</identifier>
    <identifier type="doi">10.18653/v1/2021.semeval-1.81</identifier>
    <location>
        <url>https://aclanthology.org/2021.semeval-1.81/</url>
    </location>
    <part>
        <date>2021-08</date>
        <extent unit="page">
            <start>632</start>
            <end>639</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way
%A Smolenska, Greta
%A Kolb, Peter
%A Tang, Sinan
%A Bitinis, Mironas
%A Hernández, Héctor
%A Asklöv, Elin
%Y Palmer, Alexis
%Y Schneider, Nathan
%Y Schluter, Natalie
%Y Emerson, Guy
%Y Herbelot, Aurelie
%Y Zhu, Xiaodan
%S Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
%D 2021
%8 August
%I Association for Computational Linguistics
%C Online
%F smolenska-etal-2021-clulex
%X This paper presents the system we submitted to the first Lexical Complexity Prediction (LCP) Shared Task 2021. The Shared Task provides participants with a new English dataset that includes context of the target word. We participate in the single-word complexity prediction sub-task and focus on feature engineering. Our best system is trained on linguistic features and word embeddings (Pearson’s score of 0.7942). We demonstrate, however, that a simpler feature set achieves comparable results and submit a model trained on 36 linguistic features (Pearson’s score of 0.7925).
%R 10.18653/v1/2021.semeval-1.81
%U https://aclanthology.org/2021.semeval-1.81/
%U https://doi.org/10.18653/v1/2021.semeval-1.81
%P 632-639

Markdown (Informal)

[CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way](https://aclanthology.org/2021.semeval-1.81/) (Smolenska et al., SemEval 2021)

ACL

Greta Smolenska, Peter Kolb, Sinan Tang, Mironas Bitinis, Héctor Hernández, and Elin Asklöv. 2021. CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 632–639, Online. Association for Computational Linguistics.