Neural Models for Detecting Binary Semantic Textual Similarity for Algerian and MSA

Wafia Adouane; Jean-Philippe Bernardy; Simon Dobnik

doi:10.18653/v1/W19-4609

Abstract

We explore the extent to which neural networks can learn to identify semantically equivalent sentences from a small variable dataset using an end-to-end training. We collect a new noisy non-standardised user-generated Algerian (ALG) dataset and also translate it to Modern Standard Arabic (MSA) which serves as its regularised counterpart. We compare the performance of various models on both datasets and report the best performing configurations. The results show that relatively simple models composed of 2 LSTM layers outperform by far other more sophisticated attention-based architectures, for both ALG and MSA datasets.

Anthology ID: W19-4609
Volume: Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month: August
Year: 2019
Address: Florence, Italy
Editors: Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
Venue: WANLP
Publisher: Association for Computational Linguistics
Pages: 78–87
URL: https://aclanthology.org/W19-4609/
DOI: 10.18653/v1/W19-4609
Cite (ACL): Wafia Adouane, Jean-Philippe Bernardy, and Simon Dobnik. 2019. Neural Models for Detecting Binary Semantic Textual Similarity for Algerian and MSA. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 78–87, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): Neural Models for Detecting Binary Semantic Textual Similarity for Algerian and MSA (Adouane et al., WANLP 2019)
PDF: https://aclanthology.org/W19-4609.pdf

BibTeX

@inproceedings{adouane-etal-2019-neural,
    title = "Neural Models for Detecting Binary Semantic Textual Similarity for {A}lgerian and {MSA}",
    author = "Adouane, Wafia  and
      Bernardy, Jean-Philippe  and
      Dobnik, Simon",
    editor = "El-Hajj, Wassim  and
      Belguith, Lamia Hadrich  and
      Bougares, Fethi  and
      Magdy, Walid  and
      Zitouni, Imed  and
      Tomeh, Nadi  and
      El-Haj, Mahmoud  and
      Zaghouani, Wajdi",
    booktitle = "Proceedings of the Fourth Arabic Natural Language Processing Workshop",
    month = aug,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W19-4609/",
    doi = "10.18653/v1/W19-4609",
    pages = "78--87",
    abstract = "We explore the extent to which neural networks can learn to identify semantically equivalent sentences from a small variable dataset using an end-to-end training. We collect a new noisy non-standardised user-generated Algerian (ALG) dataset and also translate it to Modern Standard Arabic (MSA) which serves as its regularised counterpart. We compare the performance of various models on both datasets and report the best performing configurations. The results show that relatively simple models composed of 2 LSTM layers outperform by far other more sophisticated attention-based architectures, for both ALG and MSA datasets."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="adouane-etal-2019-neural">
    <titleInfo>
        <title>Neural Models for Detecting Binary Semantic Textual Similarity for Algerian and MSA</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Wafia</namePart>
        <namePart type="family">Adouane</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Jean-Philippe</namePart>
        <namePart type="family">Bernardy</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Simon</namePart>
        <namePart type="family">Dobnik</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Fourth Arabic Natural Language Processing Workshop</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Wassim</namePart>
            <namePart type="family">El-Hajj</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Lamia</namePart>
            <namePart type="given">Hadrich</namePart>
            <namePart type="family">Belguith</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Fethi</namePart>
            <namePart type="family">Bougares</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Walid</namePart>
            <namePart type="family">Magdy</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Imed</namePart>
            <namePart type="family">Zitouni</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Nadi</namePart>
            <namePart type="family">Tomeh</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Mahmoud</namePart>
            <namePart type="family">El-Haj</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Wajdi</namePart>
            <namePart type="family">Zaghouani</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Florence, Italy</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>We explore the extent to which neural networks can learn to identify semantically equivalent sentences from a small variable dataset using an end-to-end training. We collect a new noisy non-standardised user-generated Algerian (ALG) dataset and also translate it to Modern Standard Arabic (MSA) which serves as its regularised counterpart. We compare the performance of various models on both datasets and report the best performing configurations. The results show that relatively simple models composed of 2 LSTM layers outperform by far other more sophisticated attention-based architectures, for both ALG and MSA datasets.</abstract>
    <identifier type="citekey">adouane-etal-2019-neural</identifier>
    <identifier type="doi">10.18653/v1/W19-4609</identifier>
    <location>
        <url>https://aclanthology.org/W19-4609/</url>
    </location>
    <part>
        <date>2019-08</date>
        <extent unit="page">
            <start>78</start>
            <end>87</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Neural Models for Detecting Binary Semantic Textual Similarity for Algerian and MSA
%A Adouane, Wafia
%A Bernardy, Jean-Philippe
%A Dobnik, Simon
%Y El-Hajj, Wassim
%Y Belguith, Lamia Hadrich
%Y Bougares, Fethi
%Y Magdy, Walid
%Y Zitouni, Imed
%Y Tomeh, Nadi
%Y El-Haj, Mahmoud
%Y Zaghouani, Wajdi
%S Proceedings of the Fourth Arabic Natural Language Processing Workshop
%D 2019
%8 August
%I Association for Computational Linguistics
%C Florence, Italy
%F adouane-etal-2019-neural
%X We explore the extent to which neural networks can learn to identify semantically equivalent sentences from a small variable dataset using an end-to-end training. We collect a new noisy non-standardised user-generated Algerian (ALG) dataset and also translate it to Modern Standard Arabic (MSA) which serves as its regularised counterpart. We compare the performance of various models on both datasets and report the best performing configurations. The results show that relatively simple models composed of 2 LSTM layers outperform by far other more sophisticated attention-based architectures, for both ALG and MSA datasets.
%R 10.18653/v1/W19-4609
%U https://aclanthology.org/W19-4609/
%U https://doi.org/10.18653/v1/W19-4609
%P 78-87

Markdown (Informal)

[Neural Models for Detecting Binary Semantic Textual Similarity for Algerian and MSA](https://aclanthology.org/W19-4609/) (Adouane et al., WANLP 2019)
