2024
Transformers at HSD-2Lang 2024: Hate Speech Detection in Arabic and Turkish Tweets Using BERT Based Architectures
Kriti Singhal | Jatin Bedi
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)
Over the past few years, researchers across the globe have made significant efforts to develop systems capable of identifying hate speech in different languages. This paper describes the team Transformers’ submission to two subtasks, Hate Speech Detection in Turkish across Various Contexts and Hate Speech Detection with Limited Data in Arabic, organized by HSD-2Lang in conjunction with CASE at EACL 2024. A BERT-based architecture was employed in both subtasks. We achieved an F1 score of 0.63258 using XLM-RoBERTa and 0.48101 using mBERT, securing the 6th rank and the 5th rank in the first and the second subtask, respectively.
Transformers@DravidianLangTech-EACL2024: Sentiment Analysis of Code-Mixed Tamil Using RoBERTa
Kriti Singhal | Jatin Bedi
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
In recent years, there has been a persistent focus on developing systems that can automatically identify hate speech content circulating on diverse social media platforms. This paper describes the team Transformers’ submission to the Caste/Immigration Hate Speech Detection in Tamil shared task by the LT-EDI 2024 workshop at EACL 2024. We used an ensemble approach in the shared task, combining various transformer-based pre-trained models using majority voting. The best macro average F1-score achieved was 0.82. We secured the 1st rank in the Caste/Immigration Hate Speech in Tamil shared task.
Transformers@LT-EDI-EACL2024: Caste and Migration Hate Speech Detection in Tamil Using Ensembling on Transformers
Kriti Singhal | Jatin Bedi
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
In recent years, there has been a persistent focus on developing systems that can automatically identify hate speech content circulating on diverse social media platforms. This paper describes the team “Transformers” submission to the Caste and Migration Hate Speech Detection in Tamil shared task by the LT-EDI 2024 workshop at EACL 2024. We used an ensemble approach in the shared task, combining various transformer-based pre-trained models using majority voting. The best macro average F1-score achieved was 0.82. We secured the 1st rank in the Caste and Migration Hate Speech in Tamil shared task.
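The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which models were combined or how ties were handled, so the function name `majority_vote`, the toy predictions, and the tie-breaking behaviour below are all assumptions for illustration, not the authors' implementation.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per example.

    predictions_per_model: list of equal-length lists, one per model
    (hypothetical input format, assumed for this sketch).
    """
    ensembled = []
    # zip(*...) groups the votes that all models cast for the same example.
    for votes in zip(*predictions_per_model):
        # most_common(1) yields the label with the highest count; on a tie,
        # the label encountered first among the votes wins (an assumption,
        # since the abstract does not describe tie handling).
        label, _count = Counter(votes).most_common(1)[0]
        ensembled.append(label)
    return ensembled


# Toy binary predictions from three hypothetical fine-tuned models.
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]

print(majority_vote([model_a, model_b, model_c]))  # [1, 0, 1, 1]
```

Using an odd number of models for a binary task, as in this toy example, avoids ties altogether.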
Transformers at SemEval-2024 Task 5: Legal Argument Reasoning Task in Civil Procedure using RoBERTa
Kriti Singhal | Jatin Bedi
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
The legal argument reasoning task in civil procedure is a new NLP task utilizing a dataset from the domain of U.S. civil procedure. The task aims at identifying whether the solution to a question in the legal domain is correct or not. This paper describes the team “Transformers” submission to the Legal Argument Reasoning Task in Civil Procedure shared task at SemEval-2024 Task 5. We use a BERT-based architecture for the shared task. The highest F1-score and accuracy achieved were 0.6172 and 0.6531, respectively. We secured the 13th rank in the Legal Argument Reasoning Task in Civil Procedure shared task.
Transformers at #SMM4H 2024: Identification of Tweets Reporting Children’s Medical Disorders And Effects of Outdoor Spaces on Social Anxiety Symptoms on Reddit Using RoBERTa
Kriti Singhal | Jatin Bedi
Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
With the widespread increase in the use of social media platforms such as Twitter, Instagram, and Reddit, people are sharing their views on various topics. They have become more vocal on these platforms about the medical challenges they are facing. This data is a valuable source of medical insights for healthcare research. This paper describes our adoption of transformer-based approaches for tasks 3 and 5. For both tasks, we fine-tuned the large RoBERTa model, a BERT-based architecture, and achieved highest F1 scores of 0.413 and 0.900 in tasks 3 and 5, respectively.