Shihan Wang


2022

A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning
Yang Zhao | Hua Qin | Wang Zhenyu | Changxi Zhu | Shihan Wang
Findings of the Association for Computational Linguistics: NAACL 2022

Training a deep reinforcement learning-based dialogue policy with brute-force random sampling is costly. A new training paradigm was proposed to improve learning performance and efficiency by incorporating curriculum learning. However, attempts in the field of dialogue policy are very limited due to the lack of reliable evaluation of difficulty scores of dialogue tasks and the high sensitivity to the mode of progression through dialogue tasks. In this paper, we present a novel versatile adaptive curriculum learning (VACL) framework, which represents a substantial step toward applying automatic curriculum learning to dialogue policy tasks. It supports evaluating the difficulty of dialogue tasks using only the learning experiences of the dialogue policy, as well as skip-level selection according to the policy's learning needs, to maximize learning efficiency. Moreover, an attractive feature of VACL is the construction of a generic, elastic global curriculum while training a good dialogue policy; this curriculum can guide the learning of different dialogue policies without extra re-training effort. The superiority and versatility of VACL are validated on three public dialogue datasets.

2021

Efficient Dialogue Complementary Policy Learning via Deep Q-network Policy and Episodic Memory Policy
Yangyang Zhao | Zhenyu Wang | Changxi Zhu | Shihan Wang
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Deep reinforcement learning has shown great potential in training dialogue policies. However, its favorable performance comes at the cost of many rounds of interaction. Most existing dialogue policy methods rely on a single learning system, while the human brain has two specialized learning and memory systems, which help it find good solutions without requiring copious examples. Inspired by the human brain, this paper proposes a novel complementary policy learning (CPL) framework, which exploits the complementary advantages of the episodic memory (EM) policy and the deep Q-network (DQN) policy to achieve fast and effective dialogue policy learning. To coordinate between the two policies, we propose a confidence controller to control the complementary time according to their relative efficacy at different stages. Furthermore, memory connectivity and time pruning are proposed to guarantee the flexible and adaptive generalization of the EM policy in dialogue tasks. Experimental results on three dialogue datasets show that our method significantly outperforms existing methods relying on a single learning system.

2020

Public Sentiment on Governmental COVID-19 Measures in Dutch Social Media
Shihan Wang | Marijn Schraagen | Erik Tjong Kim Sang | Mehdi Dastani
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

Public sentiment (the opinion, attitude or feeling that the public expresses) is a factor of interest for government, as it directly influences the implementation of policies. Given the unprecedented nature of the COVID-19 crisis, having an up-to-date representation of public sentiment on governmental measures and announcements is crucial. In this paper, we analyse Dutch public sentiment on governmental COVID-19 measures from text data collected across three online media sources (Twitter, Reddit and Nu.nl) from February to September 2020. We apply sentiment analysis methods to analyse polarity over time, as well as to identify stance towards two specific pandemic policies regarding social distancing and wearing face masks. The presented preliminary results provide valuable insights into the narratives shown in vast social media text data, which help understand the influence of COVID-19 measures on the general public.

METNet: A Mutual Enhanced Transformation Network for Aspect-based Sentiment Analysis
Bin Jiang | Jing Hou | Wanyue Zhou | Chao Yang | Shihan Wang | Liang Pang
Proceedings of the 28th International Conference on Computational Linguistics

Aspect-based sentiment analysis (ABSA) aims to determine the sentiment polarity of each specific aspect in a given sentence. Existing research has recognized the importance of the aspect for the ABSA task and has derived many interactive learning methods that model context based on a specific aspect. However, current interaction mechanisms are ill-equipped to learn complex sentences with multiple aspects, and these methods underestimate the representation learning of the aspect. To solve these two problems, we propose a mutual enhanced transformation network (METNet) for the ABSA task. First, the aspect enhancement module in METNet improves the representation learning of the aspect with contextual semantic features, which gives the aspect more abundant information. Second, METNet designs and implements a hierarchical structure, which enhances the representations of aspect and context iteratively. Experimental results on the SemEval 2014 datasets demonstrate the effectiveness of METNet, and we further show that METNet performs especially well in multi-aspect scenarios.

PEDNet: A Persona Enhanced Dual Alternating Learning Network for Conversational Response Generation
Bin Jiang | Wanyue Zhou | Jingxu Yang | Chao Yang | Shihan Wang | Liang Pang
Proceedings of the 28th International Conference on Computational Linguistics

Endowing a chatbot with a personality is essential to deliver more realistic conversations. Various persona-based dialogue models have been proposed to generate personalized and diverse responses by utilizing predefined persona information. However, generating personalized responses is still a challenging task, since predefined persona information is often insufficiently leveraged. To alleviate this problem, we propose a novel Persona Enhanced Dual Alternating Learning Network (PEDNet) that aims to produce more personalized responses in various open-domain conversation scenarios. PEDNet consists of a Context-Dominate Network (CDNet) and a Persona-Dominate Network (PDNet), which are built upon a common encoder-decoder backbone. CDNet learns to select a proper persona and to ensure the contextual relevance of the predicted response, while PDNet learns to enhance the utilization of persona information when generating the response by weakening the disturbance of specific content in the conversation context. CDNet and PDNet are trained alternately using a multi-task training approach to equip PEDNet with both of the capabilities they have learned. Both automatic and human evaluations on the newly released dialogue dataset Persona-chat demonstrate that our method can deliver more personalized responses than baseline methods.