Ye Jiang

2025

pdf bib abs
QUST_NLP at SemEval-2025 Task 7: A Three-Stage Retrieval Framework for Monolingual and Crosslingual Fact-Checked Claim Retrieval
Youzheng Liu | Jiyan Liu | Xiaoman Xu | Taihang Wang | Yimin Wang | Ye Jiang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper describes the participation of team QUST_NLP in the SemEval-2025 Task 7. We propose a three-stage retrieval framework specifically designed for fact-checked claim retrieval. Initially, we evaluate the performance of several retrieval models and select the one that yields the best results for candidate retrieval. Next, we employ multiple re-ranking models to enhance the candidate results, with each model selecting the Top-10 outcomes. In the final stage, we utilize weighted voting to determine the final retrieval outcomes. Our approach achieved 5th place in the monolingual track and 7th place in the crosslingual track. We release our system code at: https://github.com/warmth27/SemEval2025_Task7.

pdf bib abs
Team QUST at SemEval-2025 Task 10: Evaluating Large Language Models in Multiclass Multi-label Classification of News Entity Framing
Jiyan Liu | Youzheng Liu | Taihang Wang | Xiaoman Xu | Yimin Wang | Ye Jiang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper introduces the participation of the QUST team in subtask 1 of SemEval-2025 Task 10. We evaluate various large language models (LLMs) based on instruction tuning (IT) on subtask 1. Specifically, we first analyze the data statistics, suggesting that the imbalance of label distribution made it difficult for LLMs to be fine-tuned. Subsequently, a voting mechanism is utilized on the predictions of the top-3 models to derive the final submission results. The team participated in all language tracks, achieving 1st place in Hindi (HI), 2nd in Russian (RU), 3rd in Portuguese (PT), 6th in Bulgarian (BG), and 7th in English (EN) on the official test set. We release our system code at: https://github.com/warmth27/SemEval2025_Task10

2024

pdf bib abs
Team QUST at SemEval-2024 Task 8: A Comprehensive Study of Monolingual and Multilingual Approaches for Detecting AI-generated Text
Xiaoman Xu | Xiangrun Li | Taihang Wang | Jianxiang Tian | Ye Jiang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper presents the participation of team QUST in Task 8 SemEval 2024. we first performed data augmentation and cleaning on the dataset to enhance model training efficiency and accuracy. In the monolingual task, we evaluated traditional deep-learning methods, multiscale positive-unlabeled framework (MPU), fine-tuning, adapters and ensemble methods. Then, we selected the top-performing models based on their accuracy from the monolingual models and evaluated them in subtasks A and B. The final model construction employed a stacking ensemble that combined fine-tuning with MPU. Our system achieved 6th (scored 6th in terms of accuracy, officially ranked 13th in order) place in the official test set in multilingual settings of subtask A. We release our system code at:https://github.com/warmth27/SemEval2024_QUST

2023

pdf bib abs
Categorising Fine-to-Coarse Grained Misinformation: An Empirical Study of the COVID-19 Infodemic
Ye Jiang | Xingyi Song | Carolina Scarton | Iknoor Singh | Ahmet Aker | Kalina Bontcheva
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

The spread of COVID-19 misinformation on social media became a major challenge for citizens, with negative real-life consequences. Prior research focused on detection and/or analysis of COVID-19 misinformation. However, fine-grained classification of misinformation claims has been largely overlooked. The novel contribution of this paper is in introducing a new dataset which makes fine-grained distinctions between statements that assert, comment or question on false COVID-19 claims. This new dataset not only enables social behaviour analysis but also enables us to address both evidence-based and non-evidence-based misinformation classification tasks. Lastly, through leave claim out cross-validation, we demonstrate that classifier performance on unseen COVID-19 misinformation claims is significantly different, as compared to performance on topics present in the training data.

pdf bib abs
Team QUST at SemEval-2023 Task 3: A Comprehensive Study of Monolingual and Multilingual Approaches for Detecting Online News Genre, Framing and Persuasion Techniques
Ye Jiang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes the participation of team QUST in the SemEval2023 task3. The monolingual models are first evaluated with the under-sampling of the majority classes in the early stage of the task. Then, the pre-trained multilingual model is fine-tuned with a combination of the class weights and the sample weights. Two different fine-tuning strategies, the task-agnostic and the task-dependent, are further investigated. All experiments are conducted under the 10-fold cross-validation, the multilingual approaches are superior to the monolingual ones. The submitted system achieves the second best in Italian and Spanish (zero-shot) in subtask-1.

2019

pdf bib abs
Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network
Ye Jiang | Johann Petrak | Xingyi Song | Kalina Bontcheva | Diana Maynard
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes the participation of team “bertha-von-suttner” in the SemEval2019 task 4 Hyperpartisan News Detection task. Our system uses sentence representations from averaged word embeddings generated from the pre-trained ELMo model with Convolutional Neural Networks and Batch Normalization for predicting hyperpartisan news. The final predictions were generated from the averaged predictions of an ensemble of models. With this architecture, our system ranked in first place, based on accuracy, the official scoring metric.

2017

pdf bib abs
Comparing Attitudes to Climate Change in the Media using sentiment analysis based on Latent Dirichlet Allocation
Ye Jiang | Xingyi Song | Jackie Harrison | Shaun Quegan | Diana Maynard
Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism

News media typically present biased accounts of news stories, and different publications present different angles on the same event. In this research, we investigate how different publications differ in their approach to stories about climate change, by examining the sentiment and topics presented. To understand these attitudes, we find sentiment targets by combining Latent Dirichlet Allocation (LDA) with SentiWordNet, a general sentiment lexicon. Using LDA, we generate topics containing keywords which represent the sentiment targets, and then annotate the data using SentiWordNet before regrouping the articles based on topic similarity. Preliminary analysis identifies clearly different attitudes on the same issue presented in different news sources. Ongoing work is investigating how systematic these attitudes are between different publications, and how these may change over time.

Co-authors

Venues

Fix author