NITK-IT_NLP@TamilNLP-ACL2022: Transformer based model for Toxic Span Identification in Tamil

Hariharan LekshmiAmmal, Manikandan Ravikiran, Anand Kumar Madasamy


Abstract
Toxic span identification in Tamil is a shared task that focuses on identifying the spans of text that make content offensive. In this work, we build a model that can efficiently identify the spans of text contributing to offensive content. We experimented with several transformer-based models to develop the system; of these, the fine-tuned MuRIL model achieved the best overall character F1-score of 0.4489.
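The abstract reports a character F1-score but does not define it. The sketch below shows one common way such a score is computed for span identification tasks, assuming that predictions and gold labels are given as lists of character offsets per comment and that per-comment scores are averaged. The function names `char_f1` and `mean_char_f1` and the handling of empty spans are illustrative assumptions, not the shared task's official scorer.

```python
from typing import Iterable, List, Sequence


def char_f1(predicted: Iterable[int], gold: Iterable[int]) -> float:
    """F1 over character offsets for a single comment (assumed convention)."""
    pred_set, gold_set = set(predicted), set(gold)
    # Assumption: a comment with no gold span and no predicted span scores 1.0.
    if not pred_set and not gold_set:
        return 1.0
    # Assumption: if only one side is empty, the score is 0.0.
    if not pred_set or not gold_set:
        return 0.0
    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def mean_char_f1(
    predictions: Sequence[Iterable[int]], golds: Sequence[Iterable[int]]
) -> float:
    """Average the per-comment character F1 over a dataset."""
    scores: List[float] = [char_f1(p, g) for p, g in zip(predictions, golds)]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical offsets: the prediction covers characters 0-3, the gold span 2-5.
    print(char_f1([0, 1, 2, 3], [2, 3, 4, 5]))
```

Under these assumptions, a prediction that overlaps half of the gold span and is half correct itself scores 0.5 for that comment.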
Anthology ID:
2022.dravidianlangtech-1.12
Volume:
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Parameswari Krishnamurthy, Elizabeth Sherly, Sinnathamby Mahesan
Venue:
DravidianLangTech
Publisher:
Association for Computational Linguistics
Pages:
75–78
URL:
https://aclanthology.org/2022.dravidianlangtech-1.12
DOI:
10.18653/v1/2022.dravidianlangtech-1.12
Cite (ACL):
Hariharan LekshmiAmmal, Manikandan Ravikiran, and Anand Kumar Madasamy. 2022. NITK-IT_NLP@TamilNLP-ACL2022: Transformer based model for Toxic Span Identification in Tamil. In Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, pages 75–78, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
NITK-IT_NLP@TamilNLP-ACL2022: Transformer based model for Toxic Span Identification in Tamil (LekshmiAmmal et al., DravidianLangTech 2022)
PDF:
https://aclanthology.org/2022.dravidianlangtech-1.12.pdf
Video:
https://aclanthology.org/2022.dravidianlangtech-1.12.mp4