The Constant in HATE: Toxicity in Reddit across Topics and Languages

Wondimagegnhue Tsegaye Tufa; Ilia Markov; Piek T.J.M. Vossen

Abstract

Toxic language remains an ongoing challenge on social media platforms, presenting significant issues for users and communities. This paper provides a cross-topic and cross-lingual analysis of toxicity in Reddit conversations. We collect 1.5 million comment threads from 481 communities in six languages. By aligning languages with topics, we thoroughly analyze how toxicity spikes within different communities. Our analysis targets six languages spanning different communities and topics such as Culture, Politics, and News. We observe consistent patterns across languages where toxicity increases within the same topics while also identifying significant differences where specific language communities exhibit notable variations in relation to certain topics.

Anthology ID: 2024.trac-1.1
Volume: Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024
Month: May
Year: 2024
Address: Torino, Italia
Editors: Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, Bharathi Raja Chakravarthi, Bornini Lahiri, Siddharth Singh, Shyam Ratan
Venues: TRAC | WS
Publisher: ELRA and ICCL
Pages: 1–11
URL: https://aclanthology.org/2024.trac-1.1/
Bibkey: tufa-etal-2024-constant
Cite (ACL): Wondimagegnhue Tsegaye Tufa, Ilia Markov, and Piek T.J.M. Vossen. 2024. The Constant in HATE: Toxicity in Reddit across Topics and Languages. In Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024, pages 1–11, Torino, Italia. ELRA and ICCL.
Cite (Informal): The Constant in HATE: Toxicity in Reddit across Topics and Languages (Tufa et al., TRAC 2024)
PDF: https://aclanthology.org/2024.trac-1.1.pdf

Export citation

BibTeX

@inproceedings{tufa-etal-2024-constant,
    title = "The Constant in {HATE}: Toxicity in {R}eddit across Topics and Languages",
    author = "Tufa, Wondimagegnhue Tsegaye  and
      Markov, Ilia  and
      Vossen, Piek T.J.M.",
    editor = "Kumar, Ritesh  and
      Ojha, Atul Kr.  and
      Malmasi, Shervin  and
      Chakravarthi, Bharathi Raja  and
      Lahiri, Bornini  and
      Singh, Siddharth  and
      Ratan, Shyam",
    booktitle = "Proceedings of the Fourth Workshop on Threat, Aggression {\&} Cyberbullying @ LREC-COLING-2024",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.trac-1.1/",
    pages = "1--11",
    abstract = "Toxic language remains an ongoing challenge on social media platforms, presenting significant issues for users and communities. This paper provides a cross-topic and cross-lingual analysis of toxicity in Reddit conversations. We collect 1.5 million comment threads from 481 communities in six languages. By aligning languages with topics, we thoroughly analyze how toxicity spikes within different communities. Our analysis targets six languages spanning different communities and topics such as Culture, Politics, and News. We observe consistent patterns across languages where toxicity increases within the same topics while also identifying significant differences where specific language communities exhibit notable variations in relation to certain topics."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="tufa-etal-2024-constant">
    <titleInfo>
        <title>The Constant in HATE: Toxicity in Reddit across Topics and Languages</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Wondimagegnhue</namePart>
        <namePart type="given">Tsegaye</namePart>
        <namePart type="family">Tufa</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Ilia</namePart>
        <namePart type="family">Markov</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Piek</namePart>
        <namePart type="given">T.J.M.</namePart>
        <namePart type="family">Vossen</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2024-05</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Fourth Workshop on Threat, Aggression &amp; Cyberbullying @ LREC-COLING-2024</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Ritesh</namePart>
            <namePart type="family">Kumar</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Atul</namePart>
            <namePart type="given">Kr.</namePart>
            <namePart type="family">Ojha</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Shervin</namePart>
            <namePart type="family">Malmasi</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Bharathi</namePart>
            <namePart type="given">Raja</namePart>
            <namePart type="family">Chakravarthi</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Bornini</namePart>
            <namePart type="family">Lahiri</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Siddharth</namePart>
            <namePart type="family">Singh</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Shyam</namePart>
            <namePart type="family">Ratan</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>ELRA and ICCL</publisher>
            <place>
                <placeTerm type="text">Torino, Italia</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Toxic language remains an ongoing challenge on social media platforms, presenting significant issues for users and communities. This paper provides a cross-topic and cross-lingual analysis of toxicity in Reddit conversations. We collect 1.5 million comment threads from 481 communities in six languages. By aligning languages with topics, we thoroughly analyze how toxicity spikes within different communities. Our analysis targets six languages spanning different communities and topics such as Culture, Politics, and News. We observe consistent patterns across languages where toxicity increases within the same topics while also identifying significant differences where specific language communities exhibit notable variations in relation to certain topics.</abstract>
    <identifier type="citekey">tufa-etal-2024-constant</identifier>
    <location>
        <url>https://aclanthology.org/2024.trac-1.1/</url>
    </location>
    <part>
        <date>2024-05</date>
        <extent unit="page">
            <start>1</start>
            <end>11</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T The Constant in HATE: Toxicity in Reddit across Topics and Languages
%A Tufa, Wondimagegnhue Tsegaye
%A Markov, Ilia
%A Vossen, Piek T.J.M.
%Y Kumar, Ritesh
%Y Ojha, Atul Kr.
%Y Malmasi, Shervin
%Y Chakravarthi, Bharathi Raja
%Y Lahiri, Bornini
%Y Singh, Siddharth
%Y Ratan, Shyam
%S Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024
%D 2024
%8 May
%I ELRA and ICCL
%C Torino, Italia
%F tufa-etal-2024-constant
%X Toxic language remains an ongoing challenge on social media platforms, presenting significant issues for users and communities. This paper provides a cross-topic and cross-lingual analysis of toxicity in Reddit conversations. We collect 1.5 million comment threads from 481 communities in six languages. By aligning languages with topics, we thoroughly analyze how toxicity spikes within different communities. Our analysis targets six languages spanning different communities and topics such as Culture, Politics, and News. We observe consistent patterns across languages where toxicity increases within the same topics while also identifying significant differences where specific language communities exhibit notable variations in relation to certain topics.
%U https://aclanthology.org/2024.trac-1.1/
%P 1-11

Markdown (Informal)

[The Constant in HATE: Toxicity in Reddit across Topics and Languages](https://aclanthology.org/2024.trac-1.1/) (Tufa et al., TRAC 2024)