Yan Wan


2016

Real-Time Speech Emotion and Sentiment Recognition for Interactive Dialogue Systems
Dario Bertero | Farhad Bin Siddique | Chien-Sheng Wu | Yan Wan | Ricky Ho Yin Chan | Pascale Fung
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

A Machine Learning based Music Retrieval and Recommendation System
Naziba Mostafa | Yan Wan | Unnayan Amitabh | Pascale Fung
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper, we present a music retrieval and recommendation system using machine learning techniques. We propose a query-by-humming system for music retrieval that uses deep neural networks for note transcription and a note-based retrieval system for retrieving the correct song from the database. We evaluate our query-by-humming system using the standard MIREX QBSH dataset. We also propose a similar-artist recommendation system, which recommends similar artists based on acoustic features of the artists’ music, online text descriptions of the artists, and social media data. We use supervised machine learning techniques over all our features and compare our recommendation results to those produced by a popular similar-artist recommendation website.

Zara The Supergirl: An Empathetic Personality Recognition System
Pascale Fung | Anik Dey | Farhad Bin Siddique | Ruixi Lin | Yang Yang | Yan Wan | Ho Yin Ricky Chan
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

Zara: A Virtual Interactive Dialogue System Incorporating Emotion, Sentiment and Personality Recognition
Pascale Fung | Anik Dey | Farhad Bin Siddique | Ruixi Lin | Yang Yang | Dario Bertero | Yan Wan | Ricky Ho Yin Chan | Chien-Sheng Wu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

Zara, or ‘Zara the Supergirl’, is a virtual robot that can exhibit empathy while interacting with a user, with the aid of its built-in facial and emotion recognition, sentiment analysis, and speech modules. At the end of the 5-10 minute conversation, Zara can give a personality analysis of the user based on all the user utterances. We have also implemented real-time emotion recognition using a CNN model that detects emotion from raw audio without feature extraction, and have achieved an average of 65.7% accuracy on six different emotion classes, which is an impressive 4.5% improvement over conventional feature-based SVM classification. We also describe a CNN-based sentiment analysis module, trained using out-of-domain data, that recognizes sentiment from the speech recognition transcript and has a 74.8 F-measure when tested on human-machine dialogues.