Xuejie Zhang

2025

YNU-HPCC at SemEval-2025 Task 1: Enhancing Multimodal Idiomaticity Representation via LoRA and Hybrid Loss Optimization
Liu Lei | You Zhang | Jin Wang | Dan Xu | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This study reports the YNU-HPCC team’s participation in Subtask A of SemEval-2025 Task 1 on multimodal idiomatic representation. The task requires ranking candidate images based on their semantic relevance to a target idiom within a given sentence, challenging models to disambiguate idiomatic semantics and align them with abstract visual concepts across English and Portuguese. Using AltCLIP-m18 as the base model, our approach enhances its zero-shot capabilities with LoRA fine-tuning and combines ListMLE ranking optimization with Focal Loss to handle hard samples. Experimental results on the primary test set show significant improvements over the base model, with Top-1 Accuracy/DCG scores of 0.53/2.94 for English and 0.77/3.31 for Portuguese. The code is publicly available at https://github.com/1579364808/Semeval_2025_task1.
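
As a rough illustration of the ranking objective and metric named in this abstract, the sketch below implements the textbook ListMLE loss (negative log-likelihood of the ground-truth ordering under a Plackett-Luce model) and a common form of DCG. The function names and the log2 discount are assumptions made for illustration; the task's official DCG variant and the authors' implementation may differ.

```python
import math


def list_mle_loss(scores, relevance):
    """Textbook ListMLE: negative log-likelihood of the ground-truth order.

    scores    -- model scores, one per candidate (hypothetical input)
    relevance -- ground-truth relevance, higher means more relevant
    """
    # Order candidates by ground-truth relevance, most relevant first.
    order = sorted(range(len(scores)), key=lambda i: relevance[i], reverse=True)
    ranked = [scores[i] for i in order]
    loss = 0.0
    for k in range(len(ranked)):
        tail = ranked[k:]
        m = max(tail)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in tail))
        loss += log_sum_exp - ranked[k]
    return loss


def dcg(relevance_in_predicted_order):
    """DCG with a log2 position discount (one common definition)."""
    return sum(rel / math.log2(pos + 2)
               for pos, rel in enumerate(relevance_in_predicted_order))
```

Scores that agree with the ground-truth order give a lower loss than scores that reverse it, which is the property the ranking optimization relies on.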

YNU-HPCC at SemEval-2025 Task 8: Enhancing Question-Answering over Tabular Data with TableGPT2
Kaiwen Hu | Jin Wang | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper describes our systems for SemEval-2025 Task 8, Question Answering over Tabular Data. The task calls for a system that answers questions of the kind present in DataBench over day-to-day datasets, where the answer is a number, a categorical value, a boolean value, or a list of several types. We participate in all subtasks. The challenge lies in the multi-step reasoning process of converting natural language queries into executable code. This challenge is exacerbated by the limitations of current methods, such as chained reasoning, which struggle with complex multi-step reasoning paths because intermediate steps are difficult to evaluate. In the official ranking, we obtain a score of 65.64. On the final competition test set, our DataBench accuracy is 65.64% and our DataBench Lite accuracy is 66.62%, both exceeding the baseline (26%). The competitive results in two subtasks demonstrate the effectiveness of our systems.

YNU-HPCC at SemEval-2025 Task 10: A Two-Stage Approach to Solving Multi-Label and Multi-Class Role Classification Based on DeBERTa
Ning Li | You Zhang | Jin Wang | Dan Xu | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

A two-stage role classification model based on DeBERTa is proposed for the Entity Framework task in SemEval 2025 Task 10. The task is confronted with challenges such as multi-labeling, multi-category, and category imbalance, particularly in the semantic overlap and data sparsity of fine-grained roles. Existing methods primarily rely on rules, traditional machine learning, or deep learning, but the accurate classification of fine-grained roles is difficult to achieve. To address this, the proposed model integrates the deep semantic representation of the DeBERTa pre-trained language model through two sub-models: main role classification and sub-role classification, and utilizes Focal Loss to optimize the category imbalance issue. Experimental results indicate that the model achieves an accuracy of 75.32% in predicting the main role, while the exact matching rate for the sub-role is 8.94%. This is mainly limited by the strict matching standard and semantic overlap of fine-grained roles in the multi-label task. Compared to the baseline’s sub-role exact matching rate of 3.83%, the proposed model significantly improves this metric. The model ultimately ranked 23rd on the leaderboard. The code of this paper is available at: https://github.com/jiyuaner/YNU-HPCC-at-SemEval-2025-Task10.
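
To make the Focal Loss mentioned in this abstract concrete, here is a minimal sketch of the standard formulation for a single example, FL = -alpha * (1 - p_t)^gamma * log(p_t). The defaults gamma=2.0 and alpha=1.0 and the function name are illustrative assumptions, not values reported by the authors.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Standard focal loss for one example.

    probs  -- predicted class probabilities (assumed to sum to 1)
    target -- index of the true class
    gamma  -- focusing parameter; gamma=0 reduces to cross-entropy
    alpha  -- class weighting factor (a single scalar in this sketch)
    """
    p_t = probs[target]
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples,
    # so rare or hard classes contribute relatively more to the total.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma=2, an example predicted at 0.9 confidence keeps only about 1% of its cross-entropy loss, while an example predicted at 0.1 keeps about 81%.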

Topology-of-Question-Decomposition: Enhancing Large Language Models with Information Retrieval for Knowledge-Intensive Tasks
Weijie Li | Jin Wang | Liang-Chih Yu | Xuejie Zhang
Proceedings of the 31st International Conference on Computational Linguistics

Large language models (LLMs) are increasingly deployed for general problem-solving across various domains yet remain constrained to chaining immediate reasoning steps and depending solely on parametric knowledge. Integrating an information retrieval system directly into the reasoning process of LLMs can improve answer accuracy but might disrupt the natural reasoning sequence. Consequently, LLMs may underperform in complex, knowledge-intensive tasks requiring multiple reasoning steps, extensive real-world knowledge, or critical initial decisions. To overcome these challenges, we introduce a novel framework, Topology-of-Question-Decomposition (ToQD), which activates retrieval only when necessary. Globally, ToQD guides LLMs in constructing a topology graph from the input question, each node representing a sub-question. Locally, ToQD employs self-verify inference to determine whether a sub-question should retrieve relevant documents, necessitate further decomposition, or directly provide an answer. Experiments demonstrate that ToQD achieves superior performance and robustness in complex, knowledge-intensive tasks, significantly enhancing system response efficiency.

YNU-HPCC at SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Using Multiple Prediction Headers
Hao Yang | Jin Wang | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper describes our team’s participation in Subtask A of Task 11 at SemEval-2025, focusing on multilingual text-based emotion classification. The team employed the RoBERTa model, enhanced with modifications to the output head to allow independent prediction of six emotions: anger, disgust, fear, joy, sadness, and surprise. The dataset was translated into English using Google Translate to facilitate processing. The study found that a single prediction head outperformed simultaneous prediction of multiple emotions, and training on the translated dataset yielded better results than using the original dataset. The team incorporated Focal Loss and R-Drop techniques to address class imbalance and improve model stability. Future work will continue to explore improvements in this area.

YNU-HPCC at SemEval-2025 Task 6: Using BERT Model with R-drop for Promise Verification
Dehui Deng | You Zhang | Jin Wang | Dan Xu | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper presents our participation in SemEval-2025 Task 6: multinational, multilingual, multi-industry promise verification. The task aims to extract Promise Identification, Supporting Evidence, Clarity of the Promise-Evidence Pair, and Timing for Verification from the commitments made by businesses and governments. These data are then used to verify whether companies and governments have fulfilled their commitments. We participated in the English task, which included analysis of numbers in the text, reading comprehension of the text content, and multi-label classification. Our model introduces regularized dropout (R-Drop) on top of BERT-base to focus on the stability of non-target classes, improve the robustness of the model, and ultimately improve the evaluation metrics. Our approach obtained competitive results in the subtasks.
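
The regularized-dropout idea can be sketched as follows, assuming the commonly published R-Drop objective: the same input is passed through the model twice with different dropout masks, and the loss adds a symmetric KL term between the two predicted distributions to the usual negative log-likelihood. The two distributions are supplied directly here, and the weight `alpha` is a hypothetical hyperparameter; this is not the authors' code.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def r_drop_loss(p1, p2, target, alpha=1.0):
    """R-Drop style loss for one example.

    p1, p2 -- class probabilities from two forward passes with dropout
    target -- index of the true class
    alpha  -- weight of the consistency (symmetric KL) term
    """
    nll = -math.log(p1[target]) - math.log(p2[target])
    sym_kl = 0.5 * (kl_divergence(p1, p2) + kl_divergence(p2, p1))
    return nll + alpha * sym_kl
```

When both passes agree exactly, the KL term vanishes and only the two likelihood terms remain; the more the passes disagree, including on non-target classes, the larger the penalty.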

Learning to Reason via Self-Iterative Process Feedback for Small Language Models
Kaiyuan Chen | Jin Wang | Xuejie Zhang
Proceedings of the 31st International Conference on Computational Linguistics

Small language models (SLMs) are more efficient, cost-effective, and customizable than large language models (LLMs), though they often underperform in specific areas like reasoning. Past methods for enhancing SLMs’ reasoning, such as supervised fine-tuning and distillation, often depend on costly external signals, resulting in SLMs being overly confident with limited supervision signals, thus limiting their abilities. Therefore, this study enables SLMs to learn to reason from self-iterative feedback. By combining odds ratio preference optimization (ORPO), we fine-tune and align SLMs using positive and negative signals generated by themselves. Additionally, we introduce process supervision for rewards in preference alignment by sampling-based inference simulation and process reward models. Compared to Supervised Fine-Tuning (SFT), our method improves the performance of Gemma-2B by 12.43 (Acc) on GSM8K and 3.95 (Pass@1) on MBPP. Furthermore, the proposed method also demonstrated superior out-of-domain generalization capabilities on MMLU_Math and HumanEval.
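
For readers unfamiliar with ORPO, the following sketch shows the generally published form of the objective for one preference pair: a supervised term on the chosen response plus a weighted odds-ratio term that favors the chosen response over the rejected one. It assumes length-normalized sequence log-probabilities as inputs; the weight `lam` and all names are illustrative, and how the paper builds its self-generated positive and negative signals is not shown.

```python
import math


def log_odds(avg_logprob):
    """log(p / (1 - p)) where p = exp(avg_logprob) and 0 < p < 1."""
    p = math.exp(avg_logprob)
    return avg_logprob - math.log(1.0 - p)


def orpo_loss(chosen_avg_logprob, rejected_avg_logprob, lam=0.1):
    """ORPO-style loss for one (chosen, rejected) pair.

    Inputs are average per-token log-probabilities of each response.
    """
    log_odds_ratio = log_odds(chosen_avg_logprob) - log_odds(rejected_avg_logprob)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    sft_term = -chosen_avg_logprob
    return sft_term + lam * odds_ratio_term
```

The odds-ratio term shrinks as the model assigns higher odds to the chosen response than to the rejected one, so no separate reference model is needed.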

YNU-HPCC at SemEval-2025 Task 5: Contrastive Learning for GND Subject Tagging with Multilingual Sentence-BERT
Hong Jiang | Jin Wang | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper describes the YNU-HPCC (alias JH) team’s participation in Subtask 2 of SemEval-2025 Task 5, which requires fine-tuning language models to align subject tags with the TIBKAT collection. The task presents three key challenges: cross-disciplinary document coverage, bilingual (English-German) processing requirements, and extreme classification over 200,000 GND Subjects. To address these challenges, we apply a contrastive learning framework using multilingual Sentence-BERT models, implementing two innovative training strategies: mixed-negative multi-label sampling, and single-label sampling with random negative selection. Our best-performing model achieves significant improvements of 28.6% in average recall, reaching 0.2252 on the core-test set and 0.1677 on the all-test set. Notably, we reveal model architecture-dependent response patterns: MiniLM-series models benefit from multi-label training (+33.5% zero-shot recall), while mpnet variants excel with single-label approaches (+230.3% zero-shot recall). The study further demonstrates the effectiveness of contrastive learning for multilingual semantic alignment in low-resource scenarios, providing insights for extreme classification tasks.

YNU-HPCC at SemEval-2025 Task 7: Multilingual and Cross-lingual Fact-checked Claim Retrieval
Yuheng Mao | Jin Wang | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

YNU-HPCC at SemEval-2025 Task3: Leveraging Zero-Shot Learning for Hallucination Detection
Shen Chen | Jin Wang | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This study reports the YNU-HPCC team’s participation in SemEval-2025 shared task 3, which focuses on detecting hallucination spans in multilingual instruction-tuned LLM outputs. This task differs from typical hallucination detection tasks in that it does not require identifying the entire response or pinpointing which sentences contain hallucinations generated by the LLM. Instead, the task focuses on detecting hallucinations at the character level. In addition, this task differs from typical hallucination detection based on binary classification. It requires not only identifying hallucinations but also assigning a likelihood score to indicate how likely each part of the model output is hallucinatory. Our approach combines Retrieval-Augmented Generation (RAG) and zero-shot methods, guiding LLMs to detect and extract hallucination spans using external knowledge. The proposed system achieved first place in Chinese and fifteenth place in English for track3.

Reasoning with Trees: Faithful Question Answering over Knowledge Graph
Tiesunlong Shen | Jin Wang | Xuejie Zhang | Erik Cambria
Proceedings of the 31st International Conference on Computational Linguistics

Recent advancements in large language models (LLMs) have shown remarkable progress in reasoning capabilities, yet they still face challenges in complex, multi-step reasoning tasks. This study introduces Reasoning with Trees (RwT), a novel framework that synergistically integrates LLMs with knowledge graphs (KGs) to enhance reasoning performance and interpretability. RwT reformulates knowledge graph question answering (KGQA) as a discrete decision-making problem, leveraging Monte Carlo Tree Search (MCTS) to iteratively refine reasoning paths. This approach mirrors human-like reasoning by dynamically integrating the LLM’s internal knowledge with external KG information. We propose a real-data guided iteration technique to train an evaluation model that assesses action values, improving the efficiency of the MCTS process. Experimental results on two benchmark KGQA datasets demonstrate that RwT significantly outperforms existing state-of-the-art methods, with an average performance improvement of 9.81%. Notably, RwT achieves these improvements without requiring complete retraining of the LLM, offering a more efficient and adaptable approach to enhancing LLM reasoning capabilities.

YNU-HPCC at SemEval-2025 Task 2: Local Cache and Online Retrieval-Based method for Entity-Aware Machine Translation
Hao Li | Jin Wang | Xuejie Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper presents methods for SemEval-2025 Task 11 on text-based emotion detection across three tracks: Multi-label Emotion Detection, Emotion Intensity Prediction, and Cross-lingual Emotion Detection. We apply approaches such as supervised fine-tuning, preference-based reinforcement learning, and few-shot learning to enhance performance. Our combined strategies result in improved accuracy, particularly in multi-label and cross-lingual emotion detection, demonstrating the effectiveness of these methods in diverse linguistic settings.

2024

YNU-HPCC at SemEval-2024 Task10: Pre-trained Language Model for Emotion Discovery and Reasoning its Flip in Conversation
Chenyi Liang | Jin Wang | Xuejie Zhang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper describes the application of fine-tuning pre-trained models for SemEval-2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF), which requires the prediction of emotions for each utterance in a conversation and the identification of sentences where an emotional flip occurs. This model is built on the DeBERTa transformer model and enhanced for emotion detection and flip reasoning in conversations. It employs specific separators for utterance processing and utilizes specific padding to handle variable-length inputs. Methods such as R-Drop, back translation, and focal loss are also employed in training our model. The model achieved specific results on the competition’s official leaderboard. The code of this paper is available at https://github.com/jiaowoobjiuhao/SemEval-2024-task10.

YNU-HPCC at SemEval-2024 Task 9: Using Pre-trained Language Models with LoRA for Multiple-choice Answering Tasks
Jie Wang | Jin Wang | Xuejie Zhang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This study describes the model built in Task 9: brainteaser in the SemEval-2024 competition, which is a multiple-choice task. As active participants in Task 9, our system strategically employs the decoding-enhanced BERT (DeBERTa) architecture enriched with disentangled attention mechanisms. Additionally, we fine-tuned our model using low-rank adaptation (LoRA) to optimize its performance further. Moreover, we integrate focal loss into our framework to address label imbalance issues. The systematic integration of these techniques has resulted in outstanding performance metrics. Upon evaluation using the provided test dataset, our system showcases commendable results, with a remarkable accuracy score of 0.9 for subtask 1, positioning us fifth among all participants. Similarly, for subtask 2, our system exhibits a substantial accuracy rate of 0.781, securing a commendable seventh-place ranking. The code for this paper is published at: https://github.com/123yunnandaxue/Semveal-2024_task9.

Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation
Xiang Luo | Zhiwen Tang | Jin Wang | Xuejie Zhang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Zero-shot dialogue state tracking (DST) seeks to enable dialogue systems to transition to unfamiliar domains without manual annotation or extensive retraining. Prior research has approached this objective by embedding prompts into language models (LMs). Common methodologies include integrating prompts at the input layer or introducing learnable variables at each transformer layer. Nonetheless, each strategy exhibits inherent limitations. Prompts integrated at the input layer risk underutilization, with their impact potentially diminishing across successive transformer layers. Conversely, the addition of learnable variables to each layer can complicate the training process and increase inference latency. To tackle the issues mentioned above, this paper proposes Dual Low-Rank Adaptation (DualLoRA), a plug-and-play architecture designed for zero-shot DST. DualLoRA incorporates two distinct Low-Rank Adaptation (LoRA) components, targeting both dialogue context processing and prompt optimization, to ensure the comprehensive influence of prompts throughout the transformer model layers. This is achieved without incurring additional inference latency, showcasing an efficient integration into existing architectures. Through rigorous evaluation on the MultiWOZ and SGD datasets, DualLoRA demonstrates notable improvements across multiple domains, outperforming traditional baseline methods in zero-shot settings.
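
The no-added-latency property mentioned in this abstract is usually explained by the fact that a standard LoRA update can be merged into the frozen weight: W' = W + (alpha / r) * B A. The sketch below demonstrates that equivalence for a single plain LoRA adapter with list-based matrices; it is an illustration of generic LoRA only, and how DualLoRA arranges or merges its two components is not specified here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in m]


def lora_forward(w, a, b, alpha, r, x):
    """Unmerged path: W x + (alpha / r) * B (A x)."""
    scale = alpha / r
    base = matvec(w, x)
    low_rank = matvec(b, matvec(a, x))
    return [base[i] + scale * low_rank[i] for i in range(len(base))]


def lora_merge(w, a, b, alpha, r):
    """Fold the low-rank update into W so inference uses a single matrix."""
    scale = alpha / r
    delta = matmul(b, a)  # (d_out x r) times (r x d_in)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]
```

Because the merged matrix has the same shape as the original weight, the forward pass after merging costs exactly what it did before adaptation.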

Instruction Tuning with Retrieval-based Examples Ranking for Aspect-based Sentiment Analysis
Guangmin Zheng | Jin Wang | Liang-Chih Yu | Xuejie Zhang
Findings of the Association for Computational Linguistics: ACL 2024

Aspect-based sentiment analysis (ABSA) identifies sentiment information related to specific aspects and provides deeper market insights to businesses and organizations. With the emergence of large language models (LMs), recent studies have proposed using fixed examples for instruction tuning to reformulate ABSA as a generation task. However, the performance is sensitive to the selection of in-context examples; several retrieval methods are based on surface similarity and are independent of the LM generative objective. This study proposes an instruction learning method with retrieval-based example ranking for ABSA tasks. For each target sample, an LM was applied as a scorer to estimate the likelihood of the output given the input and a candidate example as the prompt, and training examples were labeled as positive or negative by ranking the scores. An alternating training schema is proposed to train both the retriever and LM. Instructional prompts can be constructed using high-quality examples. The LM is used for both scoring and inference, improving the generation efficiency without incurring additional computational costs or training difficulties. Extensive experiments on three ABSA subtasks verified the effectiveness of the proposed method, demonstrating its superiority over various strong baseline models. Code and data are released at https://github.com/zgMin/IT-RER-ABSA.

DuetSim: Building User Simulator with Dual Large Language Models for Task-Oriented Dialogues
Xiang Luo | Zhiwen Tang | Jin Wang | Xuejie Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

User Simulators play a pivotal role in training and evaluating task-oriented dialogue systems. Traditional user simulators typically rely on human-engineered agendas, resulting in generated responses that often lack diversity and spontaneity. Although large language models (LLMs) exhibit a remarkable capacity for generating coherent and contextually appropriate utterances, they may fall short when tasked with generating responses that effectively guide users towards their goals, particularly in dialogues with intricate constraints and requirements. This paper introduces DuetSim, a novel framework designed to address the intricate demands of task-oriented dialogues by leveraging LLMs. DuetSim stands apart from conventional approaches by employing two LLMs in tandem: one dedicated to response generation and the other focused on verification. This dual LLM approach empowers DuetSim to produce responses that not only exhibit diversity but also demonstrate accuracy and are preferred by human users. We validate the efficacy of our method through extensive experiments conducted on the MultiWOZ dataset, highlighting improvements in response quality and correctness, largely attributed to the incorporation of the second LLM.

Enhancing Semantics in Multimodal Chain of Thought via Soft Negative Sampling
Guangmin Zheng | Jin Wang | Xiaobing Zhou | Xuejie Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Chain of thought (CoT) has proven useful for problems requiring complex reasoning. Many of these problems are both textual and multimodal. Given the inputs in different modalities, a model generates a rationale and then uses it to answer a question. Because of the hallucination issue, the generated soft negative rationales with high textual quality but illogical semantics do not always help improve answer accuracy. This study proposes a rationale generation method using soft negative sampling (SNSE-CoT) to mitigate hallucinations in multimodal CoT. Five methods were applied to generate soft negative samples that shared highly similar text but had different semantics from the original. Bidirectional margin loss (BML) was applied to introduce them into the traditional contrastive learning framework that involves only positive and negative samples. Extensive experiments on the ScienceQA dataset demonstrated the effectiveness of the proposed method. Code and data are released at https://github.com/zgMin/SNSE-CoT.

SoftMCL: Soft Momentum Contrastive Learning for Fine-grained Sentiment-aware Pre-training
Jin Wang | Liang-Chih Yu | Xuejie Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The pre-training for language models captures general language understanding but fails to distinguish the affective impact of a particular context on a specific word. Recent works have sought to introduce contrastive learning (CL) for sentiment-aware pre-training in acquiring affective information. Nevertheless, these methods present two significant limitations. First, the compatibility of the GPU memory often limits the number of negative samples, hindering the opportunities to learn good representations. In addition, using only a few sentiment polarities as hard labels, e.g., positive, neutral, and negative, to supervise CL will force all representations to converge to a few points, leading to the issue of latent space collapse. This study proposes a soft momentum contrastive learning (SoftMCL) for fine-grained sentiment-aware pre-training. Instead of hard labels, we introduce valence ratings as soft-label supervision for CL to measure the sentiment similarities between samples at a fine-grained level. The proposed SoftMCL conducts CL on both the word- and sentence-level to enhance the model’s ability to learn affective information. A momentum queue was introduced to expand the contrastive samples, allowing more negatives to be stored and involved to overcome the limitations of hardware platforms. Extensive experiments were conducted on four different sentiment-related tasks, which demonstrate the effectiveness of the proposed SoftMCL method. The code and data of the proposed SoftMCL are available at: https://www.github.com/wangjin0818/SoftMCL/.

YNU-HPCC at SemEval-2024 Task 1: Self-Instruction Learning with Black-box Optimization for Semantic Textual Relatedness
Weijie Li | Jin Wang | Xuejie Zhang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper introduces a system designed for SemEval-2024 Task 1 that focuses on assessing Semantic Textual Relatedness (STR) between sentence pairs, including its multilingual version. STR, which evaluates the coherence of sentences, is distinct from Semantic Textual Similarity (STS). However, Large Language Models (LLMs) such as ERNIE-Bot-turbo, typically trained on STS data, often struggle to differentiate between the two concepts. To address this, we developed a self-instruction method that enhances their performance in distinguishing STR, particularly in cases with high STS but low STR. Beginning with a task description, the system generates new task instructions refined through human feedback. It then iteratively enhances these instructions by comparing them to the original and evaluating the differences. Utilizing the LLMs’ natural language comprehension abilities, the system aims to produce progressively optimized instructions based on the resulting scores. Through our optimized instructions, ERNIE-Bot-turbo exceeds the performance of conventional models, achieving a score enhancement of 4 to 7% on multilingual development datasets.

YNU-HPCC at SemEval-2024 Task 5: Regularized Legal-BERT for Legal Argument Reasoning Task in Civil Procedure
Peng Shi | Jin Wang | Xuejie Zhang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper describes the submission of team YNU-HPCC to SemEval-2024 for Task 5: The Legal Argument Reasoning Task in Civil Procedure. Given a topic, a question, and a candidate answer, the task is to classify whether the candidate answer is correct (True) or incorrect (False). To make a sound judgment, we propose a system based on fine-tuning the Legal-BERT model, which specializes in solving legal problems. In addition, Regularized Dropout (R-Drop) and focal loss were used in the model. R-Drop is used for data augmentation, and focal loss addresses data imbalances. Our system achieved relatively good results on the competition’s official leaderboard. The code of this paper is available at https://github.com/YNU-PengShi/SemEval-2024-Task5.

YNU-HPCC at SemEval-2024 Task 2: Applying DeBERTa-v3-large to Safe Biomedical Natural Language Inference for Clinical Trials
Rengui Zhang | Jin Wang | Xuejie Zhang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper describes the system of the YNU-HPCC team for SemEval-2024 Task 2, focusing on Safe Biomedical Natural Language Inference for Clinical Trials. The core challenge of this task lies in discerning the textual entailment relationship between Clinical Trial Reports (CTR) and statements annotated by expert annotators, including the necessity to accurately infer the relationships in texts subjected to semantic interventions. Our approach leverages a fine-tuned DeBERTa-v3-large model augmented with supervised contrastive learning and back-translation techniques. Supervised contrastive learning aims to bolster classification accuracy while back-translation enriches the diversity and quality of our training corpus. Our method achieves a decent F1 score. However, the results also indicate a need for further enhancements in the system’s capacity for deep semantic comprehension, highlighting areas for future refinement. The code of this paper is available at: https://github.com/RGTnuw/RG_YNU-HPCC-at-Semeval2024-Task2.

YNU-HPCC at SIGHAN-2024 dimABSA Task: Using PLMs with a Joint Learning Strategy for Dimensional Intensity Prediction
Zehui Wang | You Zhang | Jin Wang | Dan Xu | Xuejie Zhang
Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10)

The dimensional approach can represent more fine-grained emotional information than discrete affective states. In this paper, a pretrained language model (PLM) with a joint learning strategy is proposed for the SIGHAN-2024 shared task on Chinese dimensional aspect-based sentiment analysis (dimABSA), which requires submitted models to provide fine-grained multi-dimensional (Valence and Arousal) intensity predictions for given aspects of a review. The proposed model consists of three parts: an input layer that concatenates both given aspect terms and input sentences; a Chinese PLM encoder that generates aspect-specific review representation; and separate linear predictors that jointly predict Valence and Arousal sentiment intensities. Moreover, we merge simplified and traditional Chinese training data for data augmentation. Our systems ranked 2nd out of 5 participants in subtask 1 (intensity prediction). The code is publicly available at https://github.com/WZH5127/2024_subtask1_intensity_prediction.

YNU-HPCC at SemEval-2024 Task 7: Instruction Fine-tuning Models for Numerical Understanding and Generation
Kaiyuan Chen | Jin Wang | Xuejie Zhang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper presents our systems for Task 7, Numeral-Aware Language Understanding and Generation of SemEval 2024. As participants of Task 7, we engage in all subtasks and implement corresponding systems for each subtask. All subtasks cover three aspects: Quantitative understanding (English), Reading Comprehension of the Numbers in the text (Chinese), and Numeral-Aware Headline Generation (English). Our approach explores employing instruction-tuned models (Flan-T5) or text-to-text models (T5) to accomplish the respective subtasks. We implement the instruction fine-tuning with or without demonstrations and employ similarity-based retrieval or manual methods to construct demonstrations for each example in instruction fine-tuning. Moreover, we reformulate the model’s output into a chain-of-thought format with calculation expressions to enhance its reasoning performance for reasoning subtasks. The competitive results in all subtasks demonstrate the effectiveness of our systems.

Improving Personalized Sentiment Representation with Knowledge-enhanced and Parameter-efficient Layer Normalization
You Zhang | Jin Wang | Liang-Chih Yu | Dan Xu | Xuejie Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Existing studies on personalized sentiment classification consider a document review as an overall text unit and incorporate backgrounds (i.e., user and product information) to learn sentiment representation. However, these methods are difficult to combine with current pretrained language models (PLMs) owing to quadratic costs that increase with text length and to heterogeneous mixes of randomly initialized background information and textual information initialized from well-pretrained checkpoints during information incorporation. To address these problems, we propose a knowledge-enhanced and parameter-efficient layer normalization (E2LN) for efficient and effective review modeling via leveraging LN in transformer structures. Initially, a knowledge base is introduced that stores well-pretrained checkpoints, structured text information, and background information. Based on such a knowledge base, the ability of LN, a crucial component of the transformer structure, can be magnified to improve the performance of PLMs in downstream tasks. Moreover, the proposed E2LN can make PLMs capable of modeling long document reviews and incorporating background information with parameter-efficient fine-tuning and knowledge injection. Extensive experimental results were obtained for three document-level sentiment classification benchmark datasets. By comparing the results, the effectiveness and efficiency of the proposed model were demonstrated. Code and data are released at https://github.com/yoyo-yun/E2LN.

2023

pdf bib

YNU-ISE-ZXW at ROCLING 2023 MultiNER-Health Task: A Transformer-based Model with LoRA for Chinese Healthcare Named Entity Recognition
Xingwei Zhang | Jin Wang | Xuejie Zhang
Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023)

pdf bib abs

Domain Generalization via Switch Knowledge Distillation for Robust Review Representation
You Zhang | Jin Wang | Liang-Chih Yu | Dan Xu | Xuejie Zhang
Findings of the Association for Computational Linguistics: ACL 2023

Applying neural models injected with in-domain user and product information to learn review representations of unseen or anonymous users incurs an obvious obstacle in content-based recommender systems. For the generalization of the in-domain classifier, most existing models train an extra plain-text model for the unseen domain. Without incorporating historical user and product information, such a schema makes unseen and anonymous users dissociate from the recommender system. To simultaneously learn the review representation of both existing and unseen users, this study proposed a switch knowledge distillation for domain generalization. A generalization-switch (GSwitch) model was initially applied to inject user and product information by flexibly encoding both domain-invariant and domain-specific features. By turning the status ON or OFF, the model introduced a switch knowledge distillation to learn a robust review representation that performed well for either existing or anonymous unseen users. The empirical experiments were conducted on IMDB, Yelp-2013, and Yelp-2014 by masking out users in test data as unseen and anonymous users. The comparative results indicate that the proposed method enhances the generalization capability of several existing baseline models. For reproducibility, the code for this paper is available at: https://github.com/yoyo-yun/DG_RRR.

pdf bib abs

YNU-HPCC at SemEval-2023 Task7: Multi-evidence Natural Language Inference for Clinical Trial Data Based a BioBERT Model
Chao Feng | Jin Wang | Xuejie Zhang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes the system of the YNU-HPCC team for subtask 1 of SemEval-2023 Task 7: Multi-evidence Natural Language Inference for Clinical Trial Data (NLI4CT). This task requires judging the textual entailment relationship between the given CTR and the statement annotated by the expert annotator. The system is based on a fine-tuned Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT) model with supervised contrastive learning and back translation. Supervised contrastive learning is used to improve classification, and back translation is used to augment the training data. Our system achieved relatively good results on the competition’s official leaderboard. The code of this paper is available at https://github.com/facanhe/SemEval-2023-Task7.

pdf bib abs

YNU-HPCC at WASSA 2023: Using Text-Mixed Data Augmentation for Emotion Classification on Code-Mixed Text Message
Xuqiao Ran | You Zhang | Jin Wang | Dan Xu | Xuejie Zhang
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

Emotion classification on code-mixed texts has been widely used in real-world applications. In this paper, we build a system that participates in the WASSA 2023 Shared Task 2 for emotion classification on code-mixed text messages from Roman Urdu and English. The main goal of the proposed method is to adopt a text-mixed data augmentation for robust code-mixed text representation. We mix texts with both multi-label (track 1) and multi-class (track 2) annotations in a unified multilingual pre-trained model, i.e., XLM-RoBERTa, for both subtasks. Our results show that the proposed text-mixed method performs competitively, ranking first in both tracks, achieving an average Macro F1 score of 0.9782 on the multi-label track and of 0.9329 on the multi-class track.

pdf bib abs

FedID: Federated Interactive Distillation for Large-Scale Pretraining Language Models
Xinge Ma | Jiangming Liu | Jin Wang | Xuejie Zhang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

The growing concerns and regulations surrounding the protection of user data privacy have necessitated decentralized training paradigms. To this end, federated learning (FL) is widely studied in user-related natural language processing (NLP). However, it suffers from several critical limitations including extensive communication overhead, inability to handle heterogeneity, and vulnerability to white-box inference attacks. Federated distillation (FD) is proposed to alleviate these limitations, but its performance is degraded by confirmation bias. To tackle this issue, we propose Federated Interactive Distillation (FedID), which utilizes a small amount of labeled data retained by the server to further rectify the local models during knowledge transfer. Additionally, based on the GLUE benchmark, we develop a benchmarking framework across multiple tasks with diverse data distributions to contribute to the research of FD in the NLP community. Experiments show that our proposed FedID framework achieves the best results in homogeneous and heterogeneous federated scenarios. The code for this paper is available at: https://github.com/maxinge8698/FedID.

pdf bib abs

YNU-HPCC at SemEval-2023 Task 9: Pretrained Language Model for Multilingual Tweet Intimacy Analysis
Qisheng Cai | Jin Wang | Xuejie Zhang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes our fine-tuned pretrained language model for task 9 (Multilingual Tweet Intimacy Analysis, MTIA) of the SemEval 2023 competition. MTIA aims to quantitatively analyze tweets in 6 languages for intimacy, giving a score from 1 to 5. The challenge of MTIA is in semantically extracting information from code-mixed texts. To alleviate this difficulty, we suggested a solution that combines attention and memory mechanisms. The preprocessed tweets are input to the XLM-T layer to get sentence embeddings and subsequently to the bidirectional GRU layer to obtain intimacy ratings. Experimental results show an improvement in the overall performance of our model in both seen and unseen languages.

pdf bib abs

YNU-HPCC at WASSA-2023 Shared Task 1: Large-scale Language Model with LoRA Fine-Tuning for Empathy Detection and Emotion Classification
Yukun Wang | Jin Wang | Xuejie Zhang
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

This paper describes the system for the YNU-HPCC team in WASSA-2023 Shared Task 1: Empathy Detection and Emotion Classification. This task needs to predict the empathy, emotion, and personality of the empathic reactions. This system is mainly based on the Decoding-enhanced BERT with disentangled attention (DeBERTa) model with parameter-efficient fine-tuning (PEFT) and the Robustly Optimized BERT Pretraining Approach (RoBERTa). Low-Rank Adaptation (LoRA) fine-tuning in PEFT is used to reduce the training parameters of large language models. Moreover, back translation is introduced to augment the training dataset. This system achieved relatively good results on the competition’s official leaderboard. The code of this system is available here.

pdf bib abs

YNU-HPCC at SemEval-2023 Task 6: LEGAL-BERT Based Hierarchical BiLSTM with CRF for Rhetorical Roles Prediction
Yu Chen | You Zhang | Jin Wang | Xuejie Zhang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

To understand a legal document for real-world applications, SemEval-2023 Task 6 proposes a shared Subtask A, rhetorical roles (RRs) prediction, which requires a system to automatically assign an RR label to each semantic segment in a legal text. In this paper, we propose a LEGAL-BERT based hierarchical BiLSTM model with conditional random field (CRF) for RR prediction, which primarily consists of two parts: word-level and sentence-level encoders. The word-level encoder first adopts a legal-domain pre-trained language model, LEGAL-BERT, to embed the words in each sentence of a document, and a word-level BiLSTM then further encodes the sentence representation. The sentence-level encoder then uses an attentive pooling method for sentence embedding and a sentence-level BiLSTM for document modeling. Finally, a CRF is utilized to predict RRs for each sentence. The officially released results show that our method outperformed the baseline systems. Our team ranked 7th out of 27 participants in Subtask A.

2022

pdf bib abs

Dual-Encoder Transformers with Cross-modal Alignment for Multimodal Aspect-based Sentiment Analysis
Zhewen Yu | Jin Wang | Liang-Chih Yu | Xuejie Zhang
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Multimodal aspect-based sentiment analysis (MABSA) aims to extract the aspect terms from text and image pairs, and then analyze their corresponding sentiment. Recent studies typically use either a pipeline method or a unified transformer based on a cross-attention mechanism. However, these methods fail to explicitly and effectively incorporate the alignment between text and image. Supervised finetuning of the universal transformers for MABSA still requires a certain number of aligned image-text pairs. This study proposes a dual-encoder transformer with cross-modal alignment (DTCA). Two auxiliary tasks, text-only extraction and text-patch alignment, are introduced to enhance cross-attention performance. To align text and image, we propose an unsupervised approach that minimizes the Wasserstein distance between both modalities, forcing both encoders to produce more appropriate representations for the final extraction. Experimental results on two benchmarks demonstrate that DTCA consistently outperforms existing methods.

pdf bib abs

YNU-HPCC at SemEval-2022 Task 4: Finetuning Pretrained Language Models for Patronizing and Condescending Language Detection
Wenqiang Bai | Jin Wang | Xuejie Zhang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper describes a system built in the SemEval-2022 competition. As participants in Task 4: Patronizing and Condescending Language Detection, we implemented the text sentiment classification system for two subtasks in English. Both subtasks involve determining emotions; subtask 1 requires us to determine whether the text belongs to the PCL category (single-label classification), and subtask 2 requires us to determine to which PCL category the text belongs (multi-label classification). Our system is based on the bidirectional encoder representations from transformers (BERT) model. For the single-label classification, our system applies a BertForSequenceClassification model to classify the input text. For the multi-label classification, we use the fine-tuned BERT model to extract the sentiment score of the text and a fully connected layer to classify the text into the PCL categories. Our system achieved relatively good results on the competition’s official leaderboard.

pdf bib abs

YNU-HPCC at SemEval-2022 Task 5: Multi-Modal and Multi-label Emotion Classification Based on LXMERT
Chao Han | Jin Wang | Xuejie Zhang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper describes our system used in the SemEval-2022 Task 5, Multimedia Automatic Misogyny Identification (MAMI). This task is to use the provided text-image pairs to classify emotions. In this paper, we propose a multi-label emotion classification model based on pre-trained LXMERT. We use Faster-RCNN to extract visual representations and utilize LXMERT’s cross-attention for multi-modal alignment. Then we use a bilinear-interaction layer to fuse these features. Our experimental results surpass the F₁ score of the baseline. For Sub-task A, our F₁ score is 0.662, and for Sub-task B, it is 0.633. The code of this study is available on GitHub.

pdf bib abs

Knowledge Distillation with Reptile Meta-Learning for Pretrained Language Model Compression
Xinge Ma | Jin Wang | Liang-Chih Yu | Xuejie Zhang
Proceedings of the 29th International Conference on Computational Linguistics

The billions, and sometimes even trillions, of parameters involved in pre-trained language models significantly hamper their deployment in resource-constrained devices and real-time applications. Knowledge distillation (KD) can transfer knowledge from the original model (i.e., teacher) into a compact model (i.e., student) to achieve model compression. However, previous KD methods have usually frozen the teacher and applied its immutable output feature maps as soft labels to guide the student’s training. Moreover, the goal of the teacher is to achieve the best performance on downstream tasks rather than knowledge transfer. Such a fixed architecture may limit the teacher’s teaching and student’s learning abilities. Herein, a knowledge distillation method with reptile meta-learning is proposed to facilitate the transfer of knowledge from the teacher to the student. The teacher can continuously meta-learn the student’s learning objective to adjust its parameters for maximizing the student’s performance throughout the distillation process. In this way, the teacher learns to teach, produces more suitable soft labels, and transfers more appropriate knowledge to the student, resulting in improved performance. Unlike previous KD using meta-learning, the proposed method only needs to calculate the first-order derivatives to update the teacher, leading to lower computational cost but better convergence. Extensive experiments on the GLUE benchmark show the competitive performance achieved by the proposed method. For reproducibility, the code for this paper is available at: https://github.com/maxinge8698/ReptileDistil.

pdf bib abs

YNU-HPCC at SemEval-2022 Task 2: Representing Multilingual Idiomaticity based on Contrastive Learning
Kuanghong Liu | Jin Wang | Xuejie Zhang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper presents the methods we used as the YNU-HPCC team in SemEval-2022 Task 2, Multilingual Idiomaticity Detection and Sentence Embedding. We were involved in two subtasks, including four settings. In subtask B of sentence representation, we used novel approaches based on contrastive learning to optimize the model: the CoSENT method was used in the pre-train setting, and triplet loss and multiple negatives ranking loss functions were used in the fine-tune setting. We achieved very competitive results on the final released test datasets. However, for subtask A of idiomaticity detection, we only carried out a few explorations and experiments based on the xlm-RoBERTa model. Sentences concatenated with the additional MWE as inputs did well in the one-shot setting. Sentences containing context performed poorly on the final released test data in the zero-shot setting, even though we attempted to extract effective information from the CLS tokens of hidden layers.

pdf bib abs

YNU-HPCC at SemEval-2022 Task 8: Transformer-based Ensemble Model for Multilingual News Article Similarity
Zihan Nai | Jin Wang | Xuejie Zhang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper describes the system submitted by our team (YNU-HPCC) to SemEval-2022 Task 8: Multilingual news article similarity. This task requires participants to develop a system that can evaluate the similarity between multilingual news article pairs. We propose an approach that relies on Transformers to compute the similarity between pairs of news articles. We tried different models, namely BERT, ALBERT, ELECTRA, RoBERTa, and M-BERT, and compared their results. Finally, we chose M-BERT as our system, which achieved the best Pearson correlation coefficient score of 0.738.

pdf bib abs

YNU-HPCC at SemEval-2022 Task 6: Transformer-based Model for Intended Sarcasm Detection in English and Arabic
Guangmin Zheng | Jin Wang | Xuejie Zhang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

In this paper, we (a YNU-HPCC team) describe the system we built in the SemEval-2022 competition. As participants in Task 6 (titled “iSarcasmEval: Intended Sarcasm Detection In English and Arabic”), we implement the sentiment system for all three subtasks in English and Arabic. All subtasks involve the detection of sarcasm (binary and multilabel classification) and the determination of the sarcastic text location (sentence pair classification). Our system primarily applies the sequence classification model of a bidirectional encoder representation from a transformer (BERT). The BERT is used to extract sentence information from both directions for downstream classification tasks. A single basic model is used for single-sentence and sentence-pair binary classification tasks. For the multilabel task, the Label-Powerset method and binary cross-entropy loss function with weights are used. Our system exhibits competitive performance, obtaining 12/43 (21/32), 11/22, and 3/16 (8/13) rankings in the three official rankings for English (Arabic).
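The Label-Powerset method named in the abstract above turns a multilabel problem into a single multiclass one by treating each distinct combination of labels as its own class. A minimal sketch of that transformation follows; the helper names and the example label strings are hypothetical and are not taken from the paper's code.

```python
def fit_label_powerset(label_sets):
    """Assign one class id to each distinct label combination, in order of first appearance."""
    combo_to_class = {}
    for labels in label_sets:
        key = frozenset(labels)
        if key not in combo_to_class:
            combo_to_class[key] = len(combo_to_class)
    return combo_to_class


def encode(label_sets, combo_to_class):
    """Turn each label set into its single multiclass target."""
    return [combo_to_class[frozenset(labels)] for labels in label_sets]


def decode(class_ids, combo_to_class):
    """Map predicted class ids back to label sets."""
    class_to_combo = {cid: combo for combo, cid in combo_to_class.items()}
    return [set(class_to_combo[cid]) for cid in class_ids]


# Hypothetical multilabel annotations (illustrative labels only).
train = [{"sarcasm"}, {"irony", "satire"}, {"sarcasm"}, set(), {"satire", "irony"}]
mapping = fit_label_powerset(train)
encoded = encode(train, mapping)
print(encoded)
```

A known limitation of this transformation is that a combination never seen during fitting has no class id, so it cannot be predicted.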

pdf bib abs

Accelerating Inference for Pretrained Language Models by Unified Multi-Perspective Early Exiting
Jun Kong | Jin Wang | Liang-Chih Yu | Xuejie Zhang
Proceedings of the 29th International Conference on Computational Linguistics

Conditional computation algorithms, such as the early exiting (EE) algorithm, can be applied to accelerate the inference of pretrained language models (PLMs) while maintaining competitive performance on resource-constrained devices. However, this approach is only applied to the vertical architecture to decide which layers should be used for inference. In contrast, the horizontal perspective is ignored: existing methods do not determine which tokens in each layer should participate in the computation, leading to high redundancy in adaptive inference. To address this limitation, a unified horizontal and vertical multi-perspective early exiting (MPEE) framework is proposed in this study to accelerate the inference of transformer-based models. Specifically, the vertical architecture uses recycling EE classifier memory and weighted self-distillation to enhance the performance of the EE classifiers. Then, the horizontal perspective uses recycling class attention memory to emphasize the informative tokens. Conversely, the tokens with less information are truncated by weighted fusion and isolated from the following computation. Based on this, both horizontal and vertical EE are unified to obtain a better tradeoff between performance and efficiency. Extensive experimental results show that MPEE achieves greater inference acceleration than existing competitive methods while maintaining competitive performance.

pdf bib abs

YNU-HPCC at ROCLING 2022 Shared Task: A Transformer-based Model with Focal Loss and Regularization Dropout for Chinese Healthcare Named Entity Recognition
Xiang Luo | Jin Wang | Xuejie Zhang
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Named Entity Recognition (NER) is a fundamental task in information extraction that locates the mentions of named entities and classifies them in unstructured texts. Previous studies typically used hidden Markov models (HMM) and conditional random fields (CRF) for NER. To learn long-distance dependencies in text, recurrent neural networks, e.g., LSTM and GRU, can extract the semantic features for each token in a sequential manner. Based on Transformers, this paper describes our contribution to the ROCLING-2022 Shared Task. This paper adopts a transformer-based model with focal loss and regularization dropout. The focal loss is used to overcome the uneven distribution of the labels. The regularization dropout (R-Drop) is used to address the problem of vocabulary and descriptions that are too domain-specific. Ensemble learning is used to improve the performance of the model. Comparative experiments were conducted on the dev set to select the model with the best performance for submission. The BERT model with BiLSTM-CRF, focal loss, and R-Drop achieved the best F1-score of 0.7768 and ranked 4th.
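To illustrate why focal loss helps with an uneven label distribution, the sketch below implements the standard per-example focal loss, FL(p) = -alpha * (1 - p)^gamma * log(p), where p is the predicted probability of the true label. The values gamma = 2 and alpha = 1 are commonly used defaults assumed here; the abstract does not state the settings used in the paper.

```python
import math


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one example, given the predicted probability of its true label."""
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


def cross_entropy(p_true: float) -> float:
    """Plain cross-entropy is the special case gamma = 0."""
    return focal_loss(p_true, gamma=0.0)


easy, hard = 0.9, 0.1  # confidently correct vs. badly misclassified example
ratio_ce = cross_entropy(hard) / cross_entropy(easy)
ratio_focal = focal_loss(hard) / focal_loss(easy)
print(f"hard/easy loss ratio: cross-entropy {ratio_ce:.1f}, focal {ratio_focal:.1f}")
```

The modulating factor (1 - p)^gamma shrinks the loss of well-classified examples, which in NER are mostly the abundant non-entity tokens, so the rarer and harder labels contribute relatively more to the gradient.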

2021

pdf bib abs

YNU-HPCC at SemEval-2021 Task 11: Using a BERT Model to Extract Contributions from NLP Scholarly Articles
Xinge Ma | Jin Wang | Xuejie Zhang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper describes the system we built as the YNU-HPCC team in the SemEval-2021 Task 11: NLPContributionGraph. This task involves first identifying sentences in the given natural language processing (NLP) scholarly articles that reflect research contributions through binary classification; then identifying the core scientific terms and their relation phrases from these contribution sentences by sequence labeling; and finally, these scientific terms and relation phrases are categorized, identified, and organized into subject-predicate-object triples to form a knowledge graph with the help of multiclass classification and multi-label classification. We developed a system for this task using a pre-trained language representation model called BERT that stands for Bidirectional Encoder Representations from Transformers, and achieved good results. The average F1-score for Evaluation Phase 2, Part 1 was 0.4562 and ranked 7th, and the average F1-score for Evaluation Phase 2, Part 2 was 0.6541, and also ranked 7th.

pdf bib abs

YNU-HPCC at SemEval-2021 Task 10: Using a Transformer-based Source-Free Domain Adaptation Model for Semantic Processing
Zhewen Yu | Jin Wang | Xuejie Zhang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Data sharing restrictions are common in NLP datasets. The purpose of this task is to develop a model trained in a source domain to make predictions for a target domain with related domain data. To address the issue, the organizers provided participants with models that had been fine-tuned from pre-trained models on a large amount of source domain data, together with the dev data, but the source domain data itself was not distributed. This paper describes the provided model for the NER (named entity recognition) task and the ways to further develop the model. As little data is provided, pre-trained models are suitable for solving cross-domain tasks. Models fine-tuned on a large amount of data from another domain can be effective in a new domain because the task itself does not change.

pdf bib

MA-BERT: Learning Representation by Incorporating Multi-Attribute Knowledge in Transformers
You Zhang | Jin Wang | Liang-Chih Yu | Xuejie Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib abs

YNU-HPCC at SemEval-2021 Task 6: Combining ALBERT and Text-CNN for Persuasion Detection in Texts and Images
Xingyu Zhu | Jin Wang | Xuejie Zhang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

In recent years, memes combining image and text have been widely used in social media, and memes are one of the most popular types of content used in online disinformation campaigns. In this paper, our study on the detection of persuasion techniques in texts and images in SemEval-2021 Task 6 is summarized. For propaganda technique detection in text, we propose a combination model of both ALBERT and Text CNN for text classification, as well as a BERT-based multi-task sequence labeling model for propaganda technique span detection. For the meme classification task, which involves text understanding and visual feature extraction, we designed a parallel channel model divided into text and image channels. Our method achieved a good performance on subtasks 1 and 3. The micro F1-scores of 0.492, 0.091, and 0.446 achieved on the test sets of the three subtasks ranked 12th, 7th, and 11th, respectively, and all are higher than the baseline model.

pdf bib abs

YNU-HPCC at SemEval-2021 Task 5: Using a Transformer-based Model with Auxiliary Information for Toxic Span Detection
Ruijun Chen | Jin Wang | Xuejie Zhang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Toxic span detection requires the detection of spans that make a text toxic instead of simply classifying the text. In this paper, a transformer-based model with auxiliary information is proposed for SemEval-2021 Task 5. The proposed model was implemented based on the BERT-CRF architecture. It consists of three parts: a transformer-based model that can obtain the token representation, an auxiliary information module that combines features from different layers, and an output layer used for the classification. Various BERT-based models, such as BERT, ALBERT, RoBERTa, and XLNet, were used to learn contextual representations. The predictions of these models were ensembled to improve the sequence labeling tasks by using a voting strategy. Experimental results showed that the introduced auxiliary information can improve the performance of toxic span detection. The proposed model ranked 5th of 91 in the competition. The code of this study is available at https://github.com/Chenrj233/semeval2021_task5.
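One simple way to combine several sequence labelers by voting is a per-token plurality vote, sketched below. The abstract does not specify the exact voting rule, so plurality voting, the tie-breaking behaviour, and the tag names are assumptions made for illustration only.

```python
from collections import Counter


def vote_token_labels(predictions):
    """Combine per-token labels from several models by plurality vote.

    predictions: one label sequence per model, all of the same length.
    Ties go to the label that appears first in model order.
    """
    if len({len(seq) for seq in predictions}) != 1:
        raise ValueError("all models must label the same number of tokens")
    voted = []
    for token_labels in zip(*predictions):
        label, _count = Counter(token_labels).most_common(1)[0]
        voted.append(label)
    return voted


# Hypothetical outputs of three models for a four-token text ("T" = toxic, "O" = other).
model_a = ["O", "T", "T", "O"]
model_b = ["O", "T", "O", "O"]
model_c = ["O", "O", "T", "O"]
combined = vote_token_labels([model_a, model_b, model_c])
print(combined)
```

With an odd number of models and two tags, ties cannot occur, which is one practical reason to ensemble three or five systems.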

2020

pdf bib abs

YNU-HPCC at SemEval-2020 Task 7: Using an Ensemble BiGRU Model to Evaluate the Humor of Edited News Titles
Joseph Tomasulo | Jin Wang | Xuejie Zhang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper describes an ensemble model designed for SemEval-2020 Task 7. The task is based on the Humicroedit dataset, which comprises news titles and one-word substitutions designed to make them humorous. We use BERT, FastText, ELMo, and Word2Vec to encode these titles, then pass them to a bidirectional gated recurrent unit (BiGRU) with attention. Finally, we use XGBoost on the concatenation of the results of the different models to make predictions.

pdf bib abs

YNU-HPCC at SemEval-2020 Task 10: Using a Multi-granularity Ordinal Classification of the BiLSTM Model for Emphasis Selection
Dawei Liao | Jin Wang | Xuejie Zhang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

In this study, we propose a multi-granularity ordinal classification method to address the problem of emphasis selection. In detail, the word embedding is learned from Embeddings from Language Models (ELMo) to extract feature vector representations. Then, the ordinal classifications are implemented on four different multi-granularities to approximate the continuous emphasis values. Comparative experiments were conducted to compare the model with a baseline in which the problem is transformed into a label distribution problem.

pdf bib abs

YNU-HPCC at SemEval-2020 Task 11: LSTM Network for Detection of Propaganda Techniques in News Articles
Jiaxu Dao | Jin Wang | Xuejie Zhang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper summarizes our studies on propaganda detection techniques for news articles in the SemEval-2020 task 11. This task is divided into the SI and TC subtasks. We implemented the GloVe word representation, the BERT pretraining model, and the LSTM model architecture to accomplish this task. Our approach achieved good results for both the SI and TC subtasks. The macro-F1 score for the SI subtask is 0.406, and the micro-F1 score for the TC subtask is 0.505. Our method significantly outperforms the officially released baseline method, and the SI and TC subtasks rank 17th and 22nd, respectively, for the test set. This paper also compares the performances of different deep learning model architectures, such as the Bi-LSTM, LSTM, BERT, and XGBoost models, on the detection of news promotion techniques.

pdf bib abs

Graph Attention Network with Memory Fusion for Aspect-level Sentiment Analysis
Li Yuan | Jin Wang | Liang-Chih Yu | Xuejie Zhang
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Aspect-level sentiment analysis (ASC) predicts each specific aspect term’s sentiment polarity in a given text or review. Recent studies used attention-based methods that can effectively improve the performance of aspect-level sentiment analysis. These methods ignored the syntactic relationship between the aspect and its corresponding context words, leading the model to focus on syntactically unrelated words mistakenly. One proposed solution, the graph convolutional network (GCN), cannot completely avoid the problem. While it does incorporate useful information about syntax, it assigns equal weight to all the edges between connected words. It may still incorrectly associate unrelated words to the target aspect through the iterations of graph convolutional propagation. In this study, a graph attention network with memory fusion is proposed to extend GCN’s idea by assigning different weights to edges. Syntactic constraints can be imposed to block the graph convolutional propagation of unrelated words. A convolutional layer and a memory fusion were applied to learn and exploit multiword relations and draw different weights of words to improve performance further. Experimental results on five datasets show that the proposed method yields better performance than existing methods.

pdf bib abs

YNU-HPCC at SemEval-2020 Task 8: Using a Parallel-Channel Model for Memotion Analysis
Li Yuan | Jin Wang | Xuejie Zhang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper proposes a parallel-channel model to process the textual and visual information in memes and then analyze the sentiment polarity of memes. In the shared task of identifying and categorizing memes, we preprocess the dataset according to the language behaviors on social media. Then, we adapt and fine-tune the Bidirectional Encoder Representations from Transformers (BERT), and two types of convolutional neural network models (CNNs) are used to extract the features from the pictures. We applied an ensemble model that combined the BiLSTM, BiGRU, and Attention models to perform cross domain suggestion mining. The officially released results show that our system performs better than the baseline algorithm.

pdf bib abs

HPCC-YNU at SemEval-2020 Task 9: A Bilingual Vector Gating Mechanism for Sentiment Analysis of Code-Mixed Text
Jun Kong | Jin Wang | Xuejie Zhang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

It is fairly common to use code-mixing on a social media platform to express opinions and emotions in multilingual societies. The purpose of this task is to detect the sentiment of code-mixed social media text. Code-mixed text poses a great challenge for the traditional NLP system, which currently uses monolingual resources to deal with the problem of multilingual mixing. This task has been solved in the past using lexicon lookup in respective sentiment dictionaries and using a long short-term memory (LSTM) neural network for monolingual resources. In this paper, we present a system that uses a bilingual vector gating mechanism for bilingual resources to complete the task. The model consists of two main parts: the vector gating mechanism, which combines the character and word levels, and the attention mechanism, which extracts the important emotional parts of the text. The results show that the proposed system outperforms the baseline algorithm. We achieved fifth place in Spanglish and 19th place in Hinglish.

2019

pdf bib abs

YNU_DYX at SemEval-2019 Task 9: A Stacked BiLSTM for Suggestion Mining Classification
Yunxia Ding | Xiaobing Zhou | Xuejie Zhang
Proceedings of the 13th International Workshop on Semantic Evaluation

In this paper, we describe a deep-learning system that competed in SemEval 2019 Task 9, SubTask A: Suggestion Mining from Online Reviews and Forums. We use Word2Vec to learn the distributed representations from sentences. This system is composed of a Stacked Bidirectional Long Short-Term Memory Network (SBiLSTM) for enriching word representations before and after the sequence relationship with context. We use an ensemble to improve the effectiveness of our model. Our official submission achieves an F1-score of 0.5659.

YNU-HPCC at SemEval-2019 Task 6: Identifying and Categorising Offensive Language on Twitter
Chengjin Zhou | Jin Wang | Xuejie Zhang
Proceedings of the 13th International Workshop on Semantic Evaluation

This document describes the submission of team YNU-HPCC to SemEval-2019 for three Sub-tasks of Task 6: Sub-task A, Sub-task B, and Sub-task C. We have submitted four systems to identify and categorise offensive language. The first subsystem is an attention-based 2-layer bidirectional long short-term memory (BiLSTM). The second subsystem is a voting ensemble of four different deep learning architectures. The third subsystem is a stacking ensemble of four different deep learning architectures. Finally, the fourth subsystem is a bidirectional encoder representations from transformers (BERT) model. Among our models, in Sub-task A, our first subsystem performed the best, ranking 16th among 103 teams; in Sub-task B, the second subsystem performed the best, ranking 12th among 75 teams; in Sub-task C, the fourth subsystem performed best, ranking 4th among 65 teams.
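
The abstract does not say whether the voting ensemble uses hard or soft voting. The sketch below assumes hard (majority) voting over the label predictions of four models, with ties broken in favor of the earliest model; the function name `majority_vote` and the example labels are illustrative only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote.

    predictions: list of lists, one inner list of labels per model,
    all of the same length. Ties go to the label predicted by the
    earliest model in the list (an assumption of this sketch).
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        voted.append(next(label for label in labels if counts[label] == top))
    return voted


# Four hypothetical models labelling three tweets.
model_outputs = [
    ["OFF", "NOT", "OFF"],
    ["OFF", "NOT", "NOT"],
    ["NOT", "NOT", "OFF"],
    ["OFF", "OFF", "NOT"],
]
ensemble_labels = majority_vote(model_outputs)
```

A stacking ensemble, by contrast, would train a second-level classifier on the base models' outputs instead of counting votes.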

YNU-HPCC at SemEval-2019 Task 8: Using A LSTM-Attention Model for Fact-Checking in Community Forums
Peng Liu | Jin Wang | Xuejie Zhang
Proceedings of the 13th International Workshop on Semantic Evaluation

We propose a system that uses a long short-term memory with attention mechanism (LSTM-Attention) model to complete the task. The LSTM-Attention model uses two LSTMs to extract the features of the question and answer pair. Then, each of the features is sequentially composed using the attention mechanism, and the two resulting vectors are concatenated into one. Finally, the concatenated vector is used as input for the MLP, and the MLP’s output layer uses the softmax function to classify the provided answers into three categories. This model is capable of extracting the features of the question and answer pair well. The results show that the proposed system outperforms the baseline algorithm.

YNU-HPCC at SemEval-2019 Task 9: Using a BERT and CNN-BiLSTM-GRU Model for Suggestion Mining
Ping Yue | Jin Wang | Xuejie Zhang
Proceedings of the 13th International Workshop on Semantic Evaluation

Consumer opinions towards commercial entities are generally expressed through online reviews, blogs, and discussion forums. These opinions largely express positive and negative sentiments towards a given entity, but also tend to contain suggestions for improving the entity. In this task, we extract suggestions from the given unstructured text. Compared with traditional opinion mining systems, such suggestion mining is more widely applicable and extends their capabilities.

YNUWB at SemEval-2019 Task 6: K-max pooling CNN with average meta-embedding for identifying offensive language
Bin Wang | Xiaobing Zhou | Xuejie Zhang
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes the system submitted to SemEval 2019 Task 6: OffensEval 2019. The task aims to identify and categorize offensive language in social media. We participate only in Sub-task A, which aims to identify offensive language. To address this task, we propose a system based on a K-max pooling convolutional neural network model and use averaging as a meta-embedding technique to obtain a meta-embedding. Finally, we also use a cyclic learning rate policy to improve model performance. Our model achieves a Macro F1-score of 0.802 (ranked 9/103) in Sub-task A.
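
Two of the building blocks named above can be sketched in a few lines. The example assumes that K-max pooling keeps the k largest activations of a feature map in their original order, and that the source embeddings being averaged already share the same dimensionality; both function names are hypothetical and not taken from the paper's code.

```python
def k_max_pooling(values, k):
    """Keep the k largest values of a feature map, preserving their order."""
    if k >= len(values):
        return list(values)
    # Indices of the k largest activations, then restored to sequence order.
    largest = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(largest)]


def average_meta_embedding(vectors):
    """Average several same-dimension source embeddings of one word."""
    count = len(vectors)
    return [sum(dims) / count for dims in zip(*vectors)]


# A toy convolutional feature map and two toy source embeddings.
pooled = k_max_pooling([0.2, 0.9, 0.1, 0.4, 0.8], k=3)
meta = average_meta_embedding([[1.0, 2.0], [3.0, 6.0]])
```

Unlike ordinary max pooling, which keeps a single value per feature map, K-max pooling retains several strong activations together with their relative positions.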

YUN-HPCC at SemEval-2019 Task 3: Multi-Step Ensemble Neural Network for Sentiment Analysis in Textual Conversation
Dawei Li | Jin Wang | Xuejie Zhang
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes our approach to the sentiment analysis of Twitter textual conversations based on deep learning. We analyze the syntax, abbreviations, and informal writing of Twitter, and perform thorough preprocessing on the data to convert them to normative text. We apply a multi-step ensemble strategy to solve the problem of extremely unbalanced data in the training set. This is achieved by taking the GloVe and ELMo word vectors as input into a combination model with four different deep neural networks. The experimental results from the development dataset demonstrate that the proposed model exhibits a strong generalization ability. For evaluation on the test dataset, we integrated the results using the stacking ensemble learning approach and achieved competitive results. According to the final official review, the results of our model ranked 10th out of 165 teams.

YNU_DYX at SemEval-2019 Task 5: A Stacked BiGRU Model Based on Capsule Network in Detection of Hate
Yunxia Ding | Xiaobing Zhou | Xuejie Zhang
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes our system designed for SemEval 2019 Task 5 “Shared Task on Multilingual Detection of Hate”. We only participate in subtask-A in English. To address this task, we present a stacked BiGRU model based on a capsule network system. In order to convert the tweets into corresponding vector representations and input them into the neural network, we use the fastText tools to get word representations. Then, the sentence representation is enriched by stacked Bidirectional Gated Recurrent Units (BiGRUs) and used as the input of the capsule network. Our system achieves an average F1-score of 0.546 and ranks 3rd in the subtask-A in English.

Investigating Dynamic Routing in Tree-Structured LSTM for Sentiment Analysis
Jin Wang | Liang-Chih Yu | K. Robert Lai | Xuejie Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Deep neural network models such as long short-term memory (LSTM) and tree-LSTM have been proven to be effective for sentiment analysis. However, sequential LSTM is a biased model wherein the words at the tail of a sentence are more heavily emphasized than those at the head for building sentence representations. Even tree-LSTM, with useful structural information, cannot avoid the bias problem because the root node will be dominant and the nodes at the bottom of the parse tree will be less emphasized even though they may contain salient information. To overcome the bias problem, this study proposes a capsule tree-LSTM model, introducing a dynamic routing algorithm as an aggregation layer to build sentence representations by assigning different weights to nodes according to their contributions to prediction. Experiments on Stanford Sentiment Treebank (SST) for sentiment classification and EmoBank for regression show that the proposed method improved the performance of tree-LSTM and other neural network models. In addition, the deeper the tree structure, the bigger the improvement.

2018

YNU-HPCC at SemEval-2018 Task 2: Multi-ensemble Bi-GRU Model with Attention Mechanism for Multilingual Emoji Prediction
Nan Wang | Jin Wang | Xuejie Zhang
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes our approach to SemEval-2018 Task 2, which aims to predict the most likely associated emoji, given a tweet in English or Spanish. We normalized text-based tweets during pre-processing, following which we utilized a bi-directional gated recurrent unit with an attention mechanism to build our base model. Multiple models with or without class weights were trained for the ensemble methods. We boosted models without class weights, and only strong boost classifiers were identified. In our system, not only was a boosting method used, but we also took advantage of the voting ensemble method to enhance our final system result. Our method demonstrated an obvious improvement of approximately 3% in macro F1 score in English and 2% in Spanish.

YNU-HPCC at Semeval-2018 Task 11: Using an Attention-based CNN-LSTM for Machine Comprehension using Commonsense Knowledge
Hang Yuan | Jin Wang | Xuejie Zhang
Proceedings of the 12th International Workshop on Semantic Evaluation

This shared task is a typical question answering task. Compared with a normal question answering system, it needs to give the answer to the question based on the text provided. The essence of the problem is reading comprehension. Typically, there are several questions corresponding to each text, and for each question there are two candidate answers, only one of which is correct. To solve this problem, the usual approach is to use convolutional neural networks (CNNs) and recurrent neural networks (RNNs) or their improved models, such as long short-term memory (LSTM). In this paper, an attention-based CNN-LSTM model is proposed for this task. By adding an attention mechanism and combining the two models, the experimental results improved significantly.

YNU-HPCC at SemEval-2018 Task 3: Ensemble Neural Network Models for Irony Detection on Twitter
Bo Peng | Jin Wang | Xuejie Zhang
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes the system we proposed to participate in the first year of the irony detection in English tweets competition. Previous works demonstrate that LSTM models have achieved remarkable performance in natural language processing; besides, combining multiple classifications from various individual classifiers is generally more powerful than a single classification. In order to obtain more precise classification for irony detection, our system trained several individual neural network classifiers and combined their results according to the ensemble-learning algorithm.

YNU-HPCC at SemEval-2018 Task 1: BiLSTM with Attention based Sentiment Analysis for Affect in Tweets
You Zhang | Jin Wang | Xuejie Zhang
Proceedings of the 12th International Workshop on Semantic Evaluation

We implemented the sentiment system in all five subtasks for English and Spanish. All subtasks involve emotion or sentiment intensity prediction (regression and ordinal classification) and emotion determination (multi-label classification). Our system mainly applies a BiLSTM (bidirectional long short-term memory) model with an attention mechanism. We use the BiLSTM to extract word information from both directions. The attention mechanism is used to find the contribution of each word for improving the scores. Furthermore, based on BiLSTMATT (BiLSTM with attention mechanism), a few deep-learning algorithms were employed for different subtasks. For the regression and ordinal classification tasks, we used domain adaptation and ensemble learning methods to leverage the base model, while a single base model was used for the multi-label task.

YNU-HPCC at SemEval-2018 Task 12: The Argument Reasoning Comprehension Task Using a Bi-directional LSTM with Attention Model
Quanlei Liao | Xutao Yang | Jin Wang | Xuejie Zhang
Proceedings of the 12th International Workshop on Semantic Evaluation

An argument is divided into two parts, the claim and the reason. To obtain a clearer conclusion, some additional explanation is required. In this task, the explanations are called warrants. This paper introduces a bi-directional long short-term memory (Bi-LSTM) with an attention model to select the correct warrant from two candidates to explain an argument. We treat this problem as a question-answering task. For each warrant, the model produces a probability that it is correct. Finally, the system chooses the warrant with the highest probability as the answer. Ensemble learning is used to enhance the performance of the model. Among all of the participants, we ranked 15th on the test results.

2017

YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model
Quanlei Liao | Jin Wang | Jinnan Yang | Xuejie Zhang
Proceedings of the IJCNLP 2017, Shared Tasks

Building a system to detect Chinese grammatical errors is a challenge for natural-language processing researchers. As the number of Chinese learners increases, developing such a system can help them study Chinese more easily. This paper introduces a bi-directional long short-term memory (BiLSTM) - conditional random field (CRF) model to produce the sequences that indicate an error type for every position of a sentence, since we regard Chinese grammatical error diagnosis (CGED) as a sequence-labeling problem.

YNU-HPCC at SemEval 2017 Task 4: Using A Multi-Channel CNN-LSTM Model for Sentiment Classification
Haowei Zhang | Jin Wang | Jixian Zhang | Xuejie Zhang
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper, we propose a multi-channel convolutional neural network-long short-term memory (CNN-LSTM) model that consists of two parts: multi-channel CNN and LSTM to analyze the sentiments of short English messages from Twitter. Unlike a conventional CNN, the proposed model applies a multi-channel strategy that uses several filters of different lengths to extract active local n-gram features in different scales. This information is then sequentially composed using LSTM. By combining both CNN and LSTM, we can consider both local information within tweets and long-distance dependency across tweets in the classification process. Officially released results show that our system outperforms the baseline algorithm.

Refining Word Embeddings for Sentiment Analysis
Liang-Chih Yu | Jin Wang | K. Robert Lai | Xuejie Zhang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Word embeddings that can capture semantic and syntactic information from contexts have been extensively used for various natural language processing tasks. However, existing methods for learning context-based word embeddings typically fail to capture sufficient sentiment information. This may result in words with similar vector representations having an opposite sentiment polarity (e.g., good and bad), thus degrading sentiment analysis performance. Therefore, this study proposes a word vector refinement model that can be applied to any pre-trained word vectors (e.g., Word2vec and GloVe). The refinement model is based on adjusting the vector representations of words such that they can be closer to both semantically and sentimentally similar words and further away from sentimentally dissimilar words. Experimental results show that the proposed method can improve conventional word embeddings and outperform previously proposed sentiment embeddings for both binary and fine-grained classification on Stanford Sentiment Treebank (SST).
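
The abstract does not give the update rule, so the following is only a simplified sketch of the general idea: a word vector is pulled toward a weighted set of neighbors that are assumed to be both semantically and sentimentally similar, while dissimilar words are simply left out of the neighbor set. The function name `refine_vector` and the parameters `gamma` and `beta` are assumptions made for this example and should not be read as the paper's formulation.

```python
def refine_vector(target, neighbors, weights, gamma=1.0, beta=1.0):
    """Move `target` toward a weighted average of its chosen neighbors.

    gamma controls how strongly the original vector is retained,
    beta controls the pull toward the neighbors. Both are illustrative
    hyperparameters, not values taken from the paper.
    """
    denom = gamma + beta * sum(weights)
    refined = []
    for d, t in enumerate(target):
        pull = sum(w * vec[d] for w, vec in zip(weights, neighbors))
        refined.append((gamma * t + beta * pull) / denom)
    return refined


# Toy case: one neighbor with weight 1 moves the vector halfway toward it.
refined = refine_vector([0.0, 0.0], [[2.0, 0.0]], [1.0])
```

Applying such an update repeatedly over a vocabulary would gradually draw words with similar sentiment together, so that pairs such as "good" and "bad" end up less close than in the original embedding space.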

YNU-HPCC at IJCNLP-2017 Task 4: Attention-based Bi-directional GRU Model for Customer Feedback Analysis Task of English
Nan Wang | Jin Wang | Xuejie Zhang
Proceedings of the IJCNLP 2017, Shared Tasks

This paper describes our submission to IJCNLP 2017 shared task 4, for predicting the tags of unseen customer feedback sentences, such as comments, complaints, bugs, requests, and meaningless and undetermined statements. With the use of a neural network, a large number of deep learning methods have been developed, which perform very well on text classification. Our ensemble classification model is based on a bi-directional gated recurrent unit and an attention mechanism which shows a 3.8% improvement in classification accuracy. To enhance the model performance, we also compared it with several word-embedding models. The comparative results show that a combination of both word2vec and GloVe achieves the best performance.

YNU-HPCC at EmoInt-2017: Using a CNN-LSTM Model for Sentiment Intensity Prediction
You Zhang | Hang Yuan | Jin Wang | Xuejie Zhang
Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In this paper, we present a system that uses a convolutional neural network with long short-term memory (CNN-LSTM) model to complete the task. The CNN-LSTM model has two combined parts: CNN extracts local n-gram features within tweets and LSTM composes the features to capture long-distance dependency across tweets. Additionally, we used three other models (CNN, LSTM, BiLSTM) as baseline algorithms. Our introduced model showed good performance in the experimental results.

YNU-HPCC at IJCNLP-2017 Task 5: Multi-choice Question Answering in Exams Using an Attention-based LSTM Model
Hang Yuan | You Zhang | Jin Wang | Xuejie Zhang
Proceedings of the IJCNLP 2017, Shared Tasks

This shared task is a typical question answering task that aims to test how accurately the participants can answer the questions in exams. Typically, for each question, there are four candidate answers, and only one of the answers is correct. The existing methods for such a task usually implement a recurrent neural network (RNN) or long short-term memory (LSTM). However, both RNN and LSTM are biased models in which the words at the tail of a sentence are more dominant than the words at the head. In this paper, we propose the use of an attention-based LSTM (AT-LSTM) model for these tasks. By adding an attention mechanism to the standard LSTM, this model can more easily capture long contextual information.
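
To illustrate how attention can counter the tail bias mentioned above, the sketch below pools a sequence of hidden states with softmax attention weights instead of using only the last state. It assumes simple dot-product scoring against a query vector; the function `attention_pool` and the toy values are hypothetical, and the AT-LSTM in the paper may score states differently.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weighted sum of hidden states, weights from dot-product attention."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


# With a zero query every position gets equal weight, so the pooled
# vector is the mean of the hidden states.
states = [[1.0, 0.0], [3.0, 2.0]]
pooled, weights = attention_pool(states, [0.0, 0.0])
```

With a learned query, positions anywhere in the sentence can receive high weight, so informative words near the beginning are not overshadowed by the final time steps.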

2016

Dimensional Sentiment Analysis Using a Regional CNN-LSTM Model
Jin Wang | Liang-Chih Yu | K. Robert Lai | Xuejie Zhang
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Chinese Grammatical Error Diagnosis Using Single Word Embedding
Jinnan Yang | Bo Peng | Jin Wang | Jixian Zhang | Xuejie Zhang
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

Automatic grammatical error detection for Chinese has been a big challenge for NLP researchers. Due to the formal and strict grammar rules in Chinese, it is hard for foreign students to master Chinese. A computer-assisted learning tool that can automatically detect and correct Chinese grammatical errors is necessary for those foreign students. Some previous works have sought to identify Chinese grammatical errors using template- and learning-based methods. In contrast, this study introduces a convolutional neural network (CNN) and long short-term memory (LSTM) for the shared task of Chinese Grammatical Error Diagnosis (CGED). Different from traditional word-based embedding, single word embedding was used as the input of the CNN and LSTM. The proposed single word embedding can capture both semantic and syntactic information to detect the four types of grammatical errors. In the experimental evaluation, the recall and F1-score of our submitted Run1 results on the TOCFL testing data ranked fourth among all submissions at the detection level.