Hua Xu


2024

Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances
Hanlei Zhang | Hua Xu | Fei Long | Xin Wang | Kai Gao
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Discovering the semantics of multimodal utterances is essential for understanding human language and enhancing human-machine interactions. Existing methods manifest limitations in leveraging nonverbal information for discerning complex semantics in unsupervised scenarios. This paper introduces a novel unsupervised multimodal clustering method (UMC), making a pioneering contribution to this field. UMC introduces a unique approach to constructing augmentation views for multimodal data, which are then used to perform pre-training to establish well-initialized representations for subsequent clustering. An innovative strategy is proposed to dynamically select high-quality samples as guidance for representation learning, gauged by the density of each sample’s nearest neighbors. In addition, it automatically determines the optimal value of the top-K parameter in each cluster to refine sample selection. Finally, both high- and low-quality samples are used to learn representations conducive to effective clustering. We build baselines on benchmark multimodal intent and dialogue act datasets. UMC improves clustering metrics by 2-6% over state-of-the-art methods, marking the first successful endeavor in this domain. The complete code and data are available at https://github.com/thuiar/UMC.

OpenVNA: A Framework for Analyzing the Behavior of Multimodal Language Understanding System under Noisy Scenarios
Ziqi Yuan | Baozheng Zhang | Hua Xu | Zhiyun Liang | Kai Gao
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

We present OpenVNA, an open-source framework designed for analyzing the behavior of multimodal language understanding systems under noisy conditions. OpenVNA serves as an intuitive toolkit tailored for researchers, facilitating convenient batch-level robustness evaluation and on-the-fly instance-level demonstration. It primarily features a benchmark Python library for assessing global model robustness, offering high flexibility and extensibility, thereby enabling customization with user-defined noise types and models. Additionally, a GUI-based interface has been developed to intuitively analyze local model behavior. In this paper, we delineate the design principles and usage of the library and the GUI-based web platform. Currently, OpenVNA is publicly accessible at https://github.com/thuiar/OpenVNA, with a demonstration video available at https://youtu.be/0Z9cW7RGct4.

2022

Conversational Bots for Psychotherapy: A Study of Generative Transformer Models Using Domain-specific Dialogues
Avisha Das | Salih Selek | Alia R. Warner | Xu Zuo | Yan Hu | Vipina Kuttichi Keloth | Jianfu Li | W. Jim Zheng | Hua Xu
Proceedings of the 21st Workshop on Biomedical Language Processing

Conversational bots have become non-traditional methods for therapy among individuals suffering from psychological illnesses. Leveraging deep neural generative language models, we propose a deep trainable neural conversational model for therapy-oriented response generation. We leverage transfer learning methods during training on therapy and counseling based data from Reddit and AlexanderStreet. This was done to adapt existing generative models, GPT2 and DialoGPT, to the task of automated dialog generation. Through quantitative evaluation of the linguistic quality, we observe that the dialog generation model, DialoGPT (345M) with transfer learning on video data, attains scores similar to a human response baseline. However, human evaluation of responses by conversational bots shows mostly signs of generic advice or information sharing instead of therapeutic interaction.

M-SENA: An Integrated Platform for Multimodal Sentiment Analysis
Huisheng Mao | Ziqi Yuan | Hua Xu | Wenmeng Yu | Yihe Liu | Kai Gao
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

M-SENA is an open-sourced platform for Multimodal Sentiment Analysis. It aims to facilitate advanced research by providing flexible toolkits, reliable benchmarks, and intuitive demonstrations. The platform features a fully modular video sentiment analysis framework consisting of data management, feature extraction, model training, and result analysis modules. In this paper, we first illustrate the overall architecture of the M-SENA platform and then introduce features of the core modules. Reliable baseline results of different modality features and MSA benchmarks are also reported. Moreover, we use model evaluation and analysis tools provided by M-SENA to present intermediate representation visualization, on-the-fly instance test, and generalization ability test results. The source code of the platform is publicly available at https://github.com/thuiar/M-SENA.

Consistent Representation Learning for Continual Relation Extraction
Kang Zhao | Hua Xu | Jiangong Yang | Kai Gao
Findings of the Association for Computational Linguistics: ACL 2022

Continual relation extraction (CRE) aims to continuously train a model on data with new relations while avoiding forgetting old ones. Some previous work has shown that storing a few typical samples of old relations and replaying them when learning new relations can effectively avoid forgetting. However, these memory-based methods tend to overfit the memory samples and perform poorly on imbalanced datasets. To solve these challenges, a consistent representation learning method is proposed, which maintains the stability of the relation embedding by adopting contrastive learning and knowledge distillation when replaying memory. Specifically, supervised contrastive learning based on a memory bank is first used to train each new task so that the model can effectively learn the relation representation. Then, contrastive replay is conducted on the samples in memory, and memory knowledge distillation makes the model retain the knowledge of historical relations to prevent catastrophic forgetting of old tasks. The proposed method can better learn consistent representations to alleviate forgetting effectively. Extensive experiments on the FewRel and TACRED datasets show that our method significantly outperforms state-of-the-art baselines and yields strong robustness on imbalanced datasets.

Continual Machine Reading Comprehension via Uncertainty-aware Fixed Memory and Adversarial Domain Adaptation
Zhijing Wu | Hua Xu | Jingliang Fang | Kai Gao
Findings of the Association for Computational Linguistics: NAACL 2022

Continual Machine Reading Comprehension aims to incrementally learn from a continuous data stream across time without access to previously seen data, which is crucial for the development of real-world MRC systems. However, it is a great challenge to learn a new domain incrementally without catastrophically forgetting previous knowledge. In this paper, MA-MRC, a continual MRC model with uncertainty-aware fixed Memory and Adversarial domain adaptation, is proposed. In MA-MRC, a fixed-size memory stores a small number of samples from previous domain data, along with an uncertainty-aware updating strategy applied when new domain data arrives. For incremental learning, MA-MRC not only keeps a stable understanding by learning from both memory and new domain data, but also makes full use of the domain adaptation relationship between them through an adversarial learning strategy. The experimental results show that MA-MRC is superior to strong baselines and has a substantial incremental learning ability without catastrophic forgetting under two different continual MRC settings.

2021

TEXTOIR: An Integrated and Visualized Platform for Text Open Intent Recognition
Hanlei Zhang | Xiaoteng Li | Hua Xu | Panpan Zhang | Kang Zhao | Kai Gao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

TEXTOIR is the first integrated and visualized platform for text open intent recognition. It is composed of two main modules: open intent detection and open intent discovery. Each module integrates most of the state-of-the-art algorithms and benchmark intent datasets. It also contains an overall framework connecting the two modules in a pipeline scheme. In addition, this platform has visualized tools for data and model management, training, evaluation and analysis of the performance from different aspects. TEXTOIR provides useful toolkits and convenient visualized interfaces for each sub-module, and designs a framework to implement a complete process to both identify known intents and discover open intents.

2020

CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality
Wenmeng Yu | Hua Xu | Fanyang Meng | Yilin Zhu | Yixiao Ma | Jiele Wu | Jiyun Zou | Kaicheng Yang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Previous studies in multimodal sentiment analysis have used limited datasets, which only contain unified multimodal annotations. However, the unified annotations do not always reflect the independent sentiment of single modalities and limit the model's ability to capture the differences between modalities. In this paper, we introduce a Chinese single- and multi-modal sentiment analysis dataset, CH-SIMS, which contains 2,281 refined video segments in the wild with both multimodal and independent unimodal annotations. It allows researchers to study the interaction between modalities or use independent unimodal annotations for unimodal sentiment analysis. Furthermore, we propose a multi-task learning framework based on late fusion as the baseline. Extensive experiments on CH-SIMS show that our methods achieve state-of-the-art performance and learn more distinctive unimodal representations. The full dataset and code are available at https://github.com/thuiar/MMSA.

2019

Deep Unknown Intent Detection with Margin Loss
Ting-En Lin | Hua Xu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Identifying unknown (novel) user intents that have never appeared in the training set is a challenging task in dialogue systems. In this paper, we present a two-stage method for detecting unknown intents. We use a bidirectional long short-term memory (BiLSTM) network with the margin loss as the feature extractor. With margin loss, we can learn discriminative deep features by forcing the network to maximize inter-class variance and to minimize intra-class variance. Then, we feed the feature vectors to the density-based novelty detection algorithm, local outlier factor (LOF), to detect unknown intents. Experiments on two benchmark datasets show that our method can yield consistent improvements compared with the baseline methods.
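
The second stage described in this abstract (scoring feature vectors with local outlier factor) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D vectors stand in for BiLSTM features, the class name `SimpleLOF`, the choice `k=3`, and the 1.5 cut-off are all assumptions made for illustration, and the neighbourhood is simplified to exactly k neighbours with ties broken by index.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(point, data, k, skip=None):
    """Return the k nearest rows of `data` as (distance, index) pairs."""
    dists = [(euclidean(point, row), i) for i, row in enumerate(data) if i != skip]
    dists.sort()
    return dists[:k]


class SimpleLOF:
    """Simplified local outlier factor used for novelty detection (hypothetical helper)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, data):
        self.data = [tuple(row) for row in data]
        # Neighbours of each training point, excluding the point itself.
        neigh = [k_nearest(p, self.data, self.k, skip=i) for i, p in enumerate(self.data)]
        # k-distance: distance to the k-th nearest neighbour.
        self.kdist = [n[-1][0] for n in neigh]
        # Local reachability density of every training point.
        self.lrd = [self._lrd(n) for n in neigh]
        return self

    def _lrd(self, neighbours):
        # Reachability distance to o is max(k-distance(o), d(p, o)).
        reach = [max(self.kdist[j], d) for d, j in neighbours]
        mean_reach = sum(reach) / len(reach)
        return 1.0 / mean_reach if mean_reach > 0 else float("inf")

    def score(self, query):
        """LOF of an unseen vector: about 1 for inliers, much larger for outliers."""
        neighbours = k_nearest(query, self.data, self.k)
        lrd_query = self._lrd(neighbours)
        mean_neighbour_lrd = sum(self.lrd[j] for _, j in neighbours) / len(neighbours)
        return mean_neighbour_lrd / lrd_query


# Toy stand-ins for feature vectors of utterances with known intents.
known_features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
detector = SimpleLOF(k=3).fit(known_features)

near_score = detector.score((0.5, 0.6))    # lies inside the known cluster
far_score = detector.score((10.0, 10.0))   # lies far from every known point

THRESHOLD = 1.5  # assumed cut-off, not taken from the paper
print("near:", near_score, "unknown" if near_score > THRESHOLD else "known")
print("far:", far_score, "unknown" if far_score > THRESHOLD else "known")
```

A vector whose local density is much lower than that of its neighbours receives a score well above 1 and is flagged as an unknown intent under the assumed threshold.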

The Strength of the Weakest Supervision: Topic Classification Using Class Labels
Jiatong Li | Kai Zheng | Hua Xu | Qiaozhu Mei | Yue Wang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

When developing topic classifiers for real-world applications, we begin by defining a set of meaningful topic labels. Ideally, an intelligent classifier can understand these labels right away and start classifying documents. Indeed, a human can confidently tell if an article is about science, politics, sports, or none of the above, after knowing just the class labels. We study the problem of training an initial topic classifier using only class labels. We investigate existing techniques for solving this problem and propose a simple but effective approach. Experiments on a variety of topic classification data sets show that learning from class labels can save significant initial labeling effort, essentially providing a “free” warm start to the topic classifier.

2016

UTHealth at SemEval-2016 Task 12: an End-to-End System for Temporal Information Extraction from Clinical Notes
Hee-Jin Lee | Hua Xu | Jingqi Wang | Yaoyun Zhang | Sungrim Moon | Jun Xu | Yonghui Wu
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

UTH-CCB: The Participation of the SemEval 2015 Challenge – Task 14
Jun Xu | Yaoyun Zhang | Jingqi Wang | Yonghui Wu | Min Jiang | Ergin Soysal | Hua Xu
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

Clinical Abbreviation Disambiguation Using Neural Word Embeddings
Yonghui Wu | Jun Xu | Yaoyun Zhang | Hua Xu
Proceedings of BioNLP 15

2014

UTH_CCB: A report for SemEval 2014 – Task 7 Analysis of Clinical Text
Yaoyun Zhang | Jingqi Wang | Buzhou Tang | Yonghui Wu | Min Jiang | Yukun Chen | Hua Xu
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2013

Implicit Feature Detection via a Constrained Topic Model and SVM
Wei Wang | Hua Xu | Xiaoqiu Huang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2010

Soochow University: Description and Analysis of the Chinese Word Sense Induction System for CLP2010
Hua Xu | Bing Liu | Longhua Qian | Guodong Zhou
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Grouping Product Features Using Semi-Supervised Learning with Soft-Constraints
Zhongwu Zhai | Bing Liu | Hua Xu | Peifa Jia
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Recognizing Medication related Entities in Hospital Discharge Summaries using Support Vector Machine
Son Doan | Hua Xu
Coling 2010: Posters

2007

Combining multiple evidence for gene symbol disambiguation
Hua Xu | Jung-Wei Fan | Carol Friedman
Biological, translational, and clinical language processing