FZZG at WILDRE-7: Fine-tuning Pre-trained Models for Code-mixed, Less-resourced Sentiment Analysis

Gaurish Thakkar, Marko Tadić, Nives Mikelic Preradovic


Abstract
This paper describes our system for the shared task on code-mixed, less-resourced sentiment analysis for Indo-Aryan languages. We use large language models (LLMs), since they have demonstrated excellent performance on classification tasks. Across all tracks, we use the unsloth/mistral-7b-bnb-4bit LLM for code-mixed sentiment analysis. For track 1, we applied a simple fine-tuning strategy to pre-trained language models (PLMs), combining data from multiple phases. Our trained systems secured first place in four of the five phases. In addition, we report the results achieved with several PLMs for each language.
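The abstract names a concrete checkpoint, so a brief illustration may help readers unfamiliar with the workflow it implies. The sketch below shows QLoRA-style fine-tuning of unsloth/mistral-7b-bnb-4bit on a toy code-mixed example; the prompt template, LoRA settings, and hyperparameters are illustrative assumptions rather than the authors' configuration, and the SFTTrainer call follows the older trl API that accepts dataset_text_field directly.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

# Load the 4-bit-quantized Mistral-7B checkpoint named in the abstract.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=512,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of weights is trained.
# r, lora_alpha, and target_modules are illustrative defaults, not the paper's values.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Hypothetical instruction-style prompt; the paper does not publish its template.
def to_prompt(example):
    return {"text": f"Classify the sentiment of the text.\n"
                    f"Text: {example['sentence']}\n"
                    f"Sentiment: {example['label']}"}

# Toy Hindi-English code-mixed example standing in for the pooled multi-phase data.
train_data = Dataset.from_list([
    {"sentence": "yeh movie bahut acchi thi!", "label": "positive"},
]).map(to_prompt)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_data,
    dataset_text_field="text",
    max_seq_length=512,
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=8,
        num_train_epochs=3,
        learning_rate=2e-4,
    ),
)
trainer.train()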
Anthology ID: 2024.wildre-1.9
Volume: Proceedings of the 7th Workshop on Indian Language Data: Resources and Evaluation
Month: May
Year: 2024
Address: Torino, Italia
Editors: Girish Nath Jha, Sobha L., Kalika Bali, Atul Kr. Ojha
Venues: WILDRE | WS
Publisher: ELRA and ICCL
Pages: 59–65
URL: https://aclanthology.org/2024.wildre-1.9
Cite (ACL):
Gaurish Thakkar, Marko Tadić, and Nives Mikelic Preradovic. 2024. FZZG at WILDRE-7: Fine-tuning Pre-trained Models for Code-mixed, Less-resourced Sentiment Analysis. In Proceedings of the 7th Workshop on Indian Language Data: Resources and Evaluation, pages 59–65, Torino, Italia. ELRA and ICCL.
Cite (Informal):
FZZG at WILDRE-7: Fine-tuning Pre-trained Models for Code-mixed, Less-resourced Sentiment Analysis (Thakkar et al., WILDRE-WS 2024)
PDF: https://aclanthology.org/2024.wildre-1.9.pdf