Yang Fang


2024

Advancing Arabic Sentiment Analysis: ArSen Benchmark and the Improved Fuzzy Deep Hybrid Network
Yang Fang | Cheng Xu | Shuhao Guan | Nan Yan | Yuke Mei
Proceedings of the 28th Conference on Computational Natural Language Learning

Sentiment analysis is pivotal in Natural Language Processing for understanding opinions and emotions in text. While advancements in sentiment analysis for English are notable, Arabic Sentiment Analysis (ASA) lags behind, despite the growing Arabic online user base. Existing ASA benchmarks are often outdated and lack comprehensive evaluation capabilities for state-of-the-art models. To bridge this gap, we introduce ArSen, a meticulously annotated COVID-19-themed Arabic dataset, and the Improved Fuzzy Deep Hybrid Network (IFDHN), a novel model incorporating fuzzy logic for enhanced sentiment classification. ArSen provides a contemporary, robust benchmark, and IFDHN achieves state-of-the-art performance on ASA tasks. Comprehensive evaluations on the ArSen dataset demonstrate the efficacy of IFDHN and highlight future research directions in ASA.

2022

Extract-Select: A Span Selection Framework for Nested Named Entity Recognition with Generative Adversarial Training
Peixin Huang | Xiang Zhao | Minghao Hu | Yang Fang | Xinyi Li | Weidong Xiao
Findings of the Association for Computational Linguistics: ACL 2022

Nested named entity recognition (NER) is a task in which named entities may overlap with each other. Span-based approaches treat nested NER as a two-stage span enumeration and classification task, and thus have the innate ability to handle it. However, they suffer from error propagation, neglect of span boundaries, difficulty in recognizing long entities, and a reliance on large-scale annotated data. In this paper, we propose Extract-Select, a span selection framework for nested NER, to tackle these problems. Firstly, we introduce a span selection framework in which nested entities with different input categories are extracted separately by the extractor, thus naturally avoiding the error propagation of two-stage span-based approaches. In the inference phase, the trained extractor selects final results specific to the given entity category. Secondly, we propose a hybrid selection strategy in the extractor, which not only makes full use of span boundaries but also improves long entity recognition. Thirdly, we design a discriminator to evaluate the extraction result, and train both extractor and discriminator with generative adversarial training (GAT). The use of GAT greatly reduces the dependence on dataset size. Experimental results on four benchmark datasets demonstrate that Extract-Select outperforms competitive nested NER models, obtaining state-of-the-art results. The proposed model also performs well when less labeled data is available, demonstrating the effectiveness of GAT.