Feature Optimization for Predicting Readability of Arabic L1 and L2

Hind Saddiki; Nizar Habash; Violetta Cavalli-Sforza; Muhamed Al Khalil

doi:10.18653/v1/W18-3703

Abstract

Advances in automatic readability assessment can impact the way people consume information in a number of domains. Arabic, being a low-resource and morphologically complex language, presents numerous challenges to the task of automatic readability assessment. In this paper, we present the largest and most in-depth computational readability study for Arabic to date. We study a large set of features with varying depths, from shallow words to syntactic trees, for both L1 and L2 readability tasks. Our best L1 readability accuracy result is 94.8% (75% error reduction from a commonly used baseline). The comparable results for L2 are 72.4% (45% error reduction). We also demonstrate the added value of leveraging L1 features for L2 readability prediction.
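The error-reduction figures in the abstract can be related to accuracy with a little arithmetic. The sketch below assumes the common definition of relative error reduction, (baseline error − system error) / baseline error; the abstract does not state which definition the authors used, and the function names are illustrative only. Because the reported reductions are rounded, the implied baseline accuracies are approximations, not values taken from the paper.

```python
def relative_error_reduction(baseline_acc: float, system_acc: float) -> float:
    """Fraction of the baseline's errors removed by the system.

    Accuracies are fractions in [0, 1]. Assumed definition:
    (baseline_error - system_error) / baseline_error.
    """
    baseline_err = 1.0 - baseline_acc
    system_err = 1.0 - system_acc
    return (baseline_err - system_err) / baseline_err


def implied_baseline_accuracy(system_acc: float, reduction: float) -> float:
    """Invert the formula above to recover the baseline accuracy
    implied by a system accuracy and a relative error reduction."""
    system_err = 1.0 - system_acc
    return 1.0 - system_err / (1.0 - reduction)


# Figures quoted in the abstract (the reductions are rounded in the source).
l1_baseline = implied_baseline_accuracy(0.948, 0.75)  # L1: 94.8%, 75% reduction
l2_baseline = implied_baseline_accuracy(0.724, 0.45)  # L2: 72.4%, 45% reduction

print(f"Implied L1 baseline accuracy: {l1_baseline:.1%}")
print(f"Implied L2 baseline accuracy: {l2_baseline:.1%}")
```

Under this assumption the L1 baseline would sit near 79% accuracy and the L2 baseline near 50%, which shows why the same absolute accuracy means something different on each task.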

Anthology ID: W18-3703
Volume: Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications
Month: July
Year: 2018
Address: Melbourne, Australia
Editors: Yuen-Hsien Tseng, Hsin-Hsi Chen, Vincent Ng, Mamoru Komachi
Venue: NLP-TEA
Publisher: Association for Computational Linguistics
Pages: 20–29
URL: https://aclanthology.org/W18-3703/
DOI: 10.18653/v1/W18-3703
Cite (ACL): Hind Saddiki, Nizar Habash, Violetta Cavalli-Sforza, and Muhamed Al Khalil. 2018. Feature Optimization for Predicting Readability of Arabic L1 and L2. In Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications, pages 20–29, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal): Feature Optimization for Predicting Readability of Arabic L1 and L2 (Saddiki et al., NLP-TEA 2018)
PDF: https://aclanthology.org/W18-3703.pdf

Export citation

BibTeX

@inproceedings{saddiki-etal-2018-feature,
    title = "Feature Optimization for Predicting Readability of {A}rabic {L}1 and {L}2",
    author = "Saddiki, Hind  and
      Habash, Nizar  and
      Cavalli-Sforza, Violetta  and
      Al Khalil, Muhamed",
    editor = "Tseng, Yuen-Hsien  and
      Chen, Hsin-Hsi  and
      Ng, Vincent  and
      Komachi, Mamoru",
    booktitle = "Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications",
    month = jul,
    year = "2018",
    address = "Melbourne, Australia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W18-3703/",
    doi = "10.18653/v1/W18-3703",
    pages = "20--29",
    abstract = "Advances in automatic readability assessment can impact the way people consume information in a number of domains. Arabic, being a low-resource and morphologically complex language, presents numerous challenges to the task of automatic readability assessment. In this paper, we present the largest and most in-depth computational readability study for Arabic to date. We study a large set of features with varying depths, from shallow words to syntactic trees, for both L1 and L2 readability tasks. Our best L1 readability accuracy result is 94.8{\%} (75{\%} error reduction from a commonly used baseline). The comparable results for L2 are 72.4{\%} (45{\%} error reduction). We also demonstrate the added value of leveraging L1 features for L2 readability prediction."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="saddiki-etal-2018-feature">
    <titleInfo>
        <title>Feature Optimization for Predicting Readability of Arabic L1 and L2</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Hind</namePart>
        <namePart type="family">Saddiki</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Nizar</namePart>
        <namePart type="family">Habash</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Violetta</namePart>
        <namePart type="family">Cavalli-Sforza</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Muhamed</namePart>
        <namePart type="family">Al Khalil</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2018-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Yuen-Hsien</namePart>
            <namePart type="family">Tseng</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Hsin-Hsi</namePart>
            <namePart type="family">Chen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Vincent</namePart>
            <namePart type="family">Ng</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Mamoru</namePart>
            <namePart type="family">Komachi</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Melbourne, Australia</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Advances in automatic readability assessment can impact the way people consume information in a number of domains. Arabic, being a low-resource and morphologically complex language, presents numerous challenges to the task of automatic readability assessment. In this paper, we present the largest and most in-depth computational readability study for Arabic to date. We study a large set of features with varying depths, from shallow words to syntactic trees, for both L1 and L2 readability tasks. Our best L1 readability accuracy result is 94.8% (75% error reduction from a commonly used baseline). The comparable results for L2 are 72.4% (45% error reduction). We also demonstrate the added value of leveraging L1 features for L2 readability prediction.</abstract>
    <identifier type="citekey">saddiki-etal-2018-feature</identifier>
    <identifier type="doi">10.18653/v1/W18-3703</identifier>
    <location>
        <url>https://aclanthology.org/W18-3703/</url>
    </location>
    <part>
        <date>2018-07</date>
        <extent unit="page">
            <start>20</start>
            <end>29</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Feature Optimization for Predicting Readability of Arabic L1 and L2
%A Saddiki, Hind
%A Habash, Nizar
%A Cavalli-Sforza, Violetta
%A Al Khalil, Muhamed
%Y Tseng, Yuen-Hsien
%Y Chen, Hsin-Hsi
%Y Ng, Vincent
%Y Komachi, Mamoru
%S Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications
%D 2018
%8 July
%I Association for Computational Linguistics
%C Melbourne, Australia
%F saddiki-etal-2018-feature
%X Advances in automatic readability assessment can impact the way people consume information in a number of domains. Arabic, being a low-resource and morphologically complex language, presents numerous challenges to the task of automatic readability assessment. In this paper, we present the largest and most in-depth computational readability study for Arabic to date. We study a large set of features with varying depths, from shallow words to syntactic trees, for both L1 and L2 readability tasks. Our best L1 readability accuracy result is 94.8% (75% error reduction from a commonly used baseline). The comparable results for L2 are 72.4% (45% error reduction). We also demonstrate the added value of leveraging L1 features for L2 readability prediction.
%R 10.18653/v1/W18-3703
%U https://aclanthology.org/W18-3703/
%U https://doi.org/10.18653/v1/W18-3703
%P 20-29

Markdown (Informal)

[Feature Optimization for Predicting Readability of Arabic L1 and L2](https://aclanthology.org/W18-3703/) (Saddiki et al., NLP-TEA 2018)
