Ming Wang


EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing
Chengyu Wang | Minghui Qiu | Taolin Zhang | Tingting Liu | Lei Li | Jianing Wang | Ming Wang | Jun Huang | Wei Lin
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Pre-Trained Models (PTMs) have reshaped the development of Natural Language Processing (NLP) and achieved significant improvements on various benchmarks. Yet it is not easy for industrial practitioners to obtain high-performing PTM-based models without a large amount of labeled training data and to deploy them online with fast inference speed. To bridge this gap, EasyNLP is designed to make it easy to build NLP applications and supports a comprehensive suite of NLP algorithms. It further features knowledge-enhanced pre-training, knowledge distillation and few-shot learning functionalities, and provides a unified framework for model training, inference and deployment in real-world applications. EasyNLP has powered over ten business units within Alibaba Group and is seamlessly integrated into the Platform of AI (PAI) products on Alibaba Cloud. The source code of EasyNLP is released on GitHub (https://github.com/alibaba/EasyNLP).


Word Sense Disambiguation: Towards Interactive Context Exploitation from Both Word and Sense Perspectives
Ming Wang | Yinglin Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Recently proposed Word Sense Disambiguation (WSD) systems have approached the estimated upper bound of the task on standard evaluation benchmarks. However, these systems typically disambiguate the words in a document almost independently, underutilizing sense and word dependencies in context. In this paper, we convert the nearly isolated decisions into interrelated ones by exposing senses in context when learning sense embeddings in a similarity-based Sense Aware Context Exploitation (SACE) architecture. Meanwhile, we enhance context embedding learning with selected sentences from the same document, rather than using only the sentence in which each ambiguous word appears. Experiments on both English and multilingual WSD datasets show the effectiveness of our approach, which surpasses the previous state of the art by large margins (3.7% and 1.2%, respectively), especially in few-shot (14.3%) and zero-shot (35.9%) scenarios.

Enhancing the Context Representation in Similarity-based Word Sense Disambiguation
Ming Wang | Jianzhang Zhang | Yinglin Wang
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Previous studies of similarity-based WSD systems have devoted much effort to learning comprehensive sense embeddings using contextual representations and knowledge sources. However, the context embedding of an ambiguous word is learned using only the sentence in which the word appears, neglecting its global context. In this paper, we investigate the contribution of both the word-level and sense-level global context of an ambiguous word to disambiguation. Experiments show that the Context-Oriented Embedding (COE) can enhance a similarity-based system’s performance on WSD by relatively large margins, achieving state-of-the-art results on all-words WSD benchmarks in the knowledge-based category.
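
The general idea of similarity-based disambiguation with a wider context can be illustrated with a toy sketch. This is not the authors' COE implementation: the two-dimensional vectors, the sense labels, the helper names and the choice to average sentence vectors are all assumptions made purely for illustration. It only shows how picking the sense whose embedding is most similar to the context embedding can change once other sentences from the same document are taken into account.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def disambiguate(context_vec, sense_vecs):
    """Return the label of the sense embedding most similar to the context embedding."""
    return max(sense_vecs, key=lambda label: cosine(context_vec, sense_vecs[label]))


# Hypothetical sense embeddings for the ambiguous word "bank".
senses = {"bank%finance": [1.0, 0.0], "bank%river": [0.0, 1.0]}

# Hypothetical embedding of the sentence containing the word (local context only).
local_context = [0.6, 0.5]

# Hypothetical embeddings of other sentences from the same document.
document_sentences = [[0.1, 0.9], [0.2, 0.8]]

# Assumption: the global context is a plain average of local and document sentences.
global_context = mean_vector([local_context] + document_sentences)

local_choice = disambiguate(local_context, senses)
global_choice = disambiguate(global_context, senses)
print(local_choice, global_choice)
```

In this toy setting the sentence alone leans slightly towards the financial sense, while the averaged document context leans towards the river sense. A real system would use learned contextual embeddings and a more careful way of selecting and weighting document sentences.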


A Synset Relation-enhanced Framework with a Try-again Mechanism for Word Sense Disambiguation
Ming Wang | Yinglin Wang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Contextual embeddings have proved overwhelmingly effective for the task of Word Sense Disambiguation (WSD) compared with other sense representation techniques. However, these embeddings fail to capture the sense knowledge encoded in semantic networks. In this paper, we propose a Synset Relation-Enhanced Framework (SREF) that leverages sense relations both to enhance sense embeddings and to drive a try-again mechanism, which performs WSD a second time after basic sense embeddings have been obtained from augmented WordNet glosses. Experiments on all-words and lexical sample datasets show that the proposed system achieves new state-of-the-art results, outperforming previous knowledge-based systems by at least 5.5 F1 points. When the system uses sense embeddings learned from SemCor, it outperforms all previous supervised systems with only 20% of the SemCor data.


YNUDLG at SemEval-2017 Task 4: A GRU-SVM Model for Sentiment Classification and Quantification in Twitter
Ming Wang | Biao Chu | Qingxun Liu | Xiaobing Zhou
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Sentiment analysis is one of the central issues in Natural Language Processing and has become increasingly important in many fields. Typical sentiment analysis classifies the sentiment of sentences into several discrete classes (e.g., positive or negative). In this paper we describe our deep learning system (combining GRU and SVM) for two-, three- and five-point tweet polarity classification. We first trained a gated recurrent neural network using pre-trained word embeddings, then extracted features from the GRU layer and fed these features into a support vector machine to address both the classification and quantification subtasks. The proposed approach achieved 37th, 19th, and 14th places in subtasks A, B and C, respectively.