Comparison of Short-Text Sentiment Analysis Methods for Croatian

Leon Rotim; Jan Šnajder

doi:10.18653/v1/W17-1411

Abstract

We focus on the task of supervised sentiment classification of short and informal texts in Croatian, using two simple yet effective methods: word embeddings and string kernels. We investigate whether word embeddings offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. We conduct a comparison on three different datasets, using different preprocessing methods and kernel functions. Results show that, on two out of three datasets, word embeddings outperform string kernels, which in turn outperform word and n-gram bag-of-words baselines.
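
The abstract does not say which string kernel the authors used. Purely as an illustration of why string kernels need no external corpus or preprocessing, here is a minimal Python sketch of a character n-gram spectrum kernel, one common string kernel. The function names, the choice of n = 3, and the cosine normalization are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int) -> Counter:
    """Count every contiguous character n-gram in the raw text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def spectrum_kernel(a: str, b: str, n: int = 3) -> float:
    """Normalized dot product of the character n-gram counts of two texts.

    Hypothetical helper for illustration; n = 3 is an arbitrary choice.
    Returns 0.0 when either text is shorter than n characters.
    """
    counts_a, counts_b = char_ngrams(a, n), char_ngrams(b, n)
    # A Counter returns 0 for n-grams it has not seen, so only shared
    # n-grams contribute to the dot product.
    dot = sum(count * counts_b[gram] for gram, count in counts_a.items())
    norm = sqrt(
        sum(v * v for v in counts_a.values())
        * sum(v * v for v in counts_b.values())
    )
    return dot / norm if norm else 0.0
```

The kernel works directly on raw characters, so inflected forms that share a stem (for example "film" and "filmovi") still overlap without lemmatization. A matrix of such pairwise values could be handed to any kernel-based classifier that accepts a precomputed kernel.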

Anthology ID: W17-1411
Volume: Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
Month: April
Year: 2017
Address: Valencia, Spain
Editors: Tomaž Erjavec, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Venue: BSNLP
SIG: SIGSLAV
Publisher: Association for Computational Linguistics
Pages: 69–75
URL: https://aclanthology.org/W17-1411/
DOI: 10.18653/v1/W17-1411
Bibkey: rotim-snajder-2017-comparison
Cite (ACL): Leon Rotim and Jan Šnajder. 2017. Comparison of Short-Text Sentiment Analysis Methods for Croatian. In Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing, pages 69–75, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal): Comparison of Short-Text Sentiment Analysis Methods for Croatian (Rotim & Šnajder, BSNLP 2017)
PDF: https://aclanthology.org/W17-1411.pdf

Export citation

BibTeX

@inproceedings{rotim-snajder-2017-comparison,
    title = "Comparison of Short-Text Sentiment Analysis Methods for {C}roatian",
    author = "Rotim, Leon  and
      {\v{S}}najder, Jan",
    editor = "Erjavec, Toma{\v{z}}  and
      Piskorski, Jakub  and
      Pivovarova, Lidia  and
      {\v{S}}najder, Jan  and
      Steinberger, Josef  and
      Yangarber, Roman",
    booktitle = "Proceedings of the 6th Workshop on {B}alto-{S}lavic Natural Language Processing",
    month = apr,
    year = "2017",
    address = "Valencia, Spain",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W17-1411/",
    doi = "10.18653/v1/W17-1411",
    pages = "69--75",
    abstract = "We focus on the task of supervised sentiment classification of short and informal texts in Croatian, using two simple yet effective methods: word embeddings and string kernels. We investigate whether word embeddings offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. We conduct a comparison on three different datasets, using different preprocessing methods and kernel functions. Results show that, on two out of three datasets, word embeddings outperform string kernels, which in turn outperform word and n-gram bag-of-words baselines."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="rotim-snajder-2017-comparison">
    <titleInfo>
        <title>Comparison of Short-Text Sentiment Analysis Methods for Croatian</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Leon</namePart>
        <namePart type="family">Rotim</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Jan</namePart>
        <namePart type="family">Šnajder</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2017-04</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Tomaž</namePart>
            <namePart type="family">Erjavec</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jakub</namePart>
            <namePart type="family">Piskorski</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Lidia</namePart>
            <namePart type="family">Pivovarova</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jan</namePart>
            <namePart type="family">Šnajder</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Josef</namePart>
            <namePart type="family">Steinberger</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Roman</namePart>
            <namePart type="family">Yangarber</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Valencia, Spain</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>We focus on the task of supervised sentiment classification of short and informal texts in Croatian, using two simple yet effective methods: word embeddings and string kernels. We investigate whether word embeddings offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. We conduct a comparison on three different datasets, using different preprocessing methods and kernel functions. Results show that, on two out of three datasets, word embeddings outperform string kernels, which in turn outperform word and n-gram bag-of-words baselines.</abstract>
    <identifier type="citekey">rotim-snajder-2017-comparison</identifier>
    <identifier type="doi">10.18653/v1/W17-1411</identifier>
    <location>
        <url>https://aclanthology.org/W17-1411/</url>
    </location>
    <part>
        <date>2017-04</date>
        <extent unit="page">
            <start>69</start>
            <end>75</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Comparison of Short-Text Sentiment Analysis Methods for Croatian
%A Rotim, Leon
%A Šnajder, Jan
%Y Erjavec, Tomaž
%Y Piskorski, Jakub
%Y Pivovarova, Lidia
%Y Šnajder, Jan
%Y Steinberger, Josef
%Y Yangarber, Roman
%S Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
%D 2017
%8 April
%I Association for Computational Linguistics
%C Valencia, Spain
%F rotim-snajder-2017-comparison
%X We focus on the task of supervised sentiment classification of short and informal texts in Croatian, using two simple yet effective methods: word embeddings and string kernels. We investigate whether word embeddings offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. We conduct a comparison on three different datasets, using different preprocessing methods and kernel functions. Results show that, on two out of three datasets, word embeddings outperform string kernels, which in turn outperform word and n-gram bag-of-words baselines.
%R 10.18653/v1/W17-1411
%U https://aclanthology.org/W17-1411/
%U https://doi.org/10.18653/v1/W17-1411
%P 69-75

Markdown (Informal)

[Comparison of Short-Text Sentiment Analysis Methods for Croatian](https://aclanthology.org/W17-1411/) (Rotim & Šnajder, BSNLP 2017)