ReproHum #0669-08: Reproducing Sentiment Transfer Evaluation

Kristýna Onderková; Mateusz Lango; Patrícia Schmidtová; Ondřej Dušek

Abstract

We describe a reproduction of a human annotation experiment that was performed to evaluate the effectiveness of text style transfer systems (Reif et al., 2021). Despite our efforts to closely imitate the conditions of the original study, the results obtained differ significantly from those in the original study. We performed a statistical analysis of the results obtained, discussed the sources of these discrepancies in the study design, and quantified reproducibility. The reproduction followed the common approach to reproduction adopted by the ReproHum project.

Anthology ID: 2025.gem-1.55
Volume: Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)
Month: July
Year: 2025
Address: Vienna, Austria and virtual meeting
Editors: Ofir Arviv, Miruna Clinciu, Kaustubh Dhole, Rotem Dror, Sebastian Gehrmann, Eliya Habba, Itay Itzhak, Simon Mille, Yotam Perlitz, Enrico Santus, João Sedoc, Michal Shmueli Scheuer, Gabriel Stanovsky, Oyvind Tafjord
Venues: GEM | WS
Publisher: Association for Computational Linguistics
Pages: 601–608
URL: https://aclanthology.org/2025.gem-1.55/
Cite (ACL): Kristýna Onderková, Mateusz Lango, Patrícia Schmidtová, and Ondrej Dusek. 2025. ReproHum #0669-08: Reproducing Sentiment Transfer Evaluation. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), pages 601–608, Vienna, Austria and virtual meeting. Association for Computational Linguistics.
Cite (Informal): ReproHum #0669-08: Reproducing Sentiment Transfer Evaluation (Onderková et al., GEM 2025)
PDF: https://aclanthology.org/2025.gem-1.55.pdf

Export citation

BibTeX
MODS XML
Endnote
Preformatted

@inproceedings{onderkova-etal-2025-reprohum,
    title = "{R}epro{H}um {\#}0669-08: Reproducing Sentiment Transfer Evaluation",
    author = "Onderkov{\'a}, Krist{\'y}na  and
      Lango, Mateusz  and
      Schmidtov{\'a}, Patr{\'i}cia  and
      Dusek, Ondrej",
    editor = "Arviv, Ofir  and
      Clinciu, Miruna  and
      Dhole, Kaustubh  and
      Dror, Rotem  and
      Gehrmann, Sebastian  and
      Habba, Eliya  and
      Itzhak, Itay  and
      Mille, Simon  and
      Perlitz, Yotam  and
      Santus, Enrico  and
      Sedoc, Jo{\~a}o  and
      Shmueli Scheuer, Michal  and
      Stanovsky, Gabriel  and
      Tafjord, Oyvind",
    booktitle = "Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM{\texttwosuperior})",
    month = jul,
    year = "2025",
    address = "Vienna, Austria and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.gem-1.55/",
    pages = "601--608",
    ISBN = "979-8-89176-261-9",
    abstract = "We describe a reproduction of a human annotation experiment that was performed to evaluate the effectiveness of text style transfer systems (Reif et al., 2021). Despite our efforts to closely imitate the conditions of the original study, the results obtained differ significantly from those in the original study. We performed a statistical analysis of the results obtained, discussed the sources of these discrepancies in the study design, and quantified reproducibility. The reproduction followed the common approach to reproduction adopted by the ReproHum project."
}

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="onderkova-etal-2025-reprohum">
    <titleInfo>
        <title>ReproHum #0669-08: Reproducing Sentiment Transfer Evaluation</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Kristýna</namePart>
        <namePart type="family">Onderková</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Mateusz</namePart>
        <namePart type="family">Lango</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Patrícia</namePart>
        <namePart type="family">Schmidtová</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Ondrej</namePart>
        <namePart type="family">Dusek</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2025-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Ofir</namePart>
            <namePart type="family">Arviv</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Miruna</namePart>
            <namePart type="family">Clinciu</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Kaustubh</namePart>
            <namePart type="family">Dhole</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Rotem</namePart>
            <namePart type="family">Dror</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sebastian</namePart>
            <namePart type="family">Gehrmann</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Eliya</namePart>
            <namePart type="family">Habba</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Itay</namePart>
            <namePart type="family">Itzhak</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Simon</namePart>
            <namePart type="family">Mille</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Yotam</namePart>
            <namePart type="family">Perlitz</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Enrico</namePart>
            <namePart type="family">Santus</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">João</namePart>
            <namePart type="family">Sedoc</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Michal</namePart>
            <namePart type="family">Shmueli Scheuer</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Gabriel</namePart>
            <namePart type="family">Stanovsky</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Oyvind</namePart>
            <namePart type="family">Tafjord</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Vienna, Austria and virtual meeting</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">979-8-89176-261-9</identifier>
    </relatedItem>
    <abstract>We describe a reproduction of a human annotation experiment that was performed to evaluate the effectiveness of text style transfer systems (Reif et al., 2021). Despite our efforts to closely imitate the conditions of the original study, the results obtained differ significantly from those in the original study. We performed a statistical analysis of the results obtained, discussed the sources of these discrepancies in the study design, and quantified reproducibility. The reproduction followed the common approach to reproduction adopted by the ReproHum project.</abstract>
    <identifier type="citekey">onderkova-etal-2025-reprohum</identifier>
    <location>
        <url>https://aclanthology.org/2025.gem-1.55/</url>
    </location>
    <part>
        <date>2025-07</date>
        <extent unit="page">
            <start>601</start>
            <end>608</end>
        </extent>
    </part>
</mods>
</modsCollection>

%0 Conference Proceedings
%T ReproHum #0669-08: Reproducing Sentiment Transfer Evaluation
%A Onderková, Kristýna
%A Lango, Mateusz
%A Schmidtová, Patrícia
%A Dusek, Ondrej
%Y Arviv, Ofir
%Y Clinciu, Miruna
%Y Dhole, Kaustubh
%Y Dror, Rotem
%Y Gehrmann, Sebastian
%Y Habba, Eliya
%Y Itzhak, Itay
%Y Mille, Simon
%Y Perlitz, Yotam
%Y Santus, Enrico
%Y Sedoc, João
%Y Shmueli Scheuer, Michal
%Y Stanovsky, Gabriel
%Y Tafjord, Oyvind
%S Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria and virtual meeting
%@ 979-8-89176-261-9
%F onderkova-etal-2025-reprohum
%X We describe a reproduction of a human annotation experiment that was performed to evaluate the effectiveness of text style transfer systems (Reif et al., 2021). Despite our efforts to closely imitate the conditions of the original study, the results obtained differ significantly from those in the original study. We performed a statistical analysis of the results obtained, discussed the sources of these discrepancies in the study design, and quantified reproducibility. The reproduction followed the common approach to reproduction adopted by the ReproHum project.
%U https://aclanthology.org/2025.gem-1.55/
%P 601-608

Markdown (Informal)

[ReproHum #0669-08: Reproducing Sentiment Transfer Evaluation](https://aclanthology.org/2025.gem-1.55/) (Onderková et al., GEM 2025)
