Fangxiang Feng


pdf bib
Co-VQA : Answering by Interactive Sub Question Sequence
Ruonan Wang | Yuxi Qian | Fangxiang Feng | Xiaojie Wang | Huixing Jiang
Findings of the Association for Computational Linguistics: ACL 2022

Most existing approaches to Visual Question Answering (VQA) answer questions directly, however, people usually decompose a complex question into a sequence of simple sub questions and finally obtain the answer to the original question after answering the sub question sequence(SQS). By simulating the process, this paper proposes a conversation-based VQA (Co-VQA) framework, which consists of three components: Questioner, Oracle, and Answerer. Questioner raises the sub questions using an extending HRED model, and Oracle answers them one-by-one. An Adaptive Chain Visual Reasoning Model (ACVRM) for Answerer is also proposed, where the question-answer pair is used to update the visual representation sequentially. To perform supervised learning for each model, we introduce a well-designed method to build a SQS for each question on VQA 2.0 and VQA-CP v2 datasets. Experimental results show that our method achieves state-of-the-art on VQA-CP v2. Further analyses show that SQSs help build direct semantic connections between questions and images, provide question-adaptive variable-length reasoning chains, and with explicit interpretability as well as error traceability.

pdf bib
A Simple Model for Distantly Supervised Relation Extraction
Ziqin Rao | Fangxiang Feng | Ruifan Li | Xiaojie Wang
Proceedings of the 29th International Conference on Computational Linguistics

Distantly supervised relation extraction is challenging due to the noise within data. Recent methods focus on exploiting bag representations based on deep neural networks with complex de-noising scheme to achieve remarkable performance. In this paper, we propose a simple but effective BERT-based Graph convolutional network Model (i.e., BGM). Our BGM comprises of an instance embedding module and a bag representation module. The instance embedding module uses a BERT-based pretrained language model to extract key information from each instance. The bag representaion module constructs the corresponding bag graph then apply a convolutional operation to obtain the bag representation. Our BGM model achieves a considerable improvement on two benchmark datasets, i.e., NYT10 and GDS.

pdf bib
Enhanced Multi-Channel Graph Convolutional Network for Aspect Sentiment Triplet Extraction
Hao Chen | Zepeng Zhai | Fangxiang Feng | Ruifan Li | Xiaojie Wang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Aspect Sentiment Triplet Extraction (ASTE) is an emerging sentiment analysis task. Most of the existing studies focus on devising a new tagging scheme that enables the model to extract the sentiment triplets in an end-to-end fashion. However, these methods ignore the relations between words for ASTE task. In this paper, we propose an Enhanced Multi-Channel Graph Convolutional Network model (EMC-GCN) to fully utilize the relations between words. Specifically, we first define ten types of relations for ASTE task, and then adopt a biaffine attention module to embed these relations as an adjacent tensor between words in a sentence. After that, our EMC-GCN transforms the sentence into a multi-channel graph by treating words and the relation adjacent tensor as nodes and edges, respectively. Thus, relation-aware node representations can be learnt. Furthermore, we consider diverse linguistic features to enhance our EMC-GCN model. Finally, we design an effective refining strategy on EMC-GCN for word-pair representation refinement, which considers the implicit results of aspect and opinion extraction when determining whether word pairs match or not. Extensive experimental results on the benchmark datasets demonstrate that the effectiveness and robustness of our proposed model, which outperforms state-of-the-art methods significantly.


pdf bib
Multi-stage Pre-training over Simplified Multimodal Pre-training Models
Tongtong Liu | Fangxiang Feng | Xiaojie Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Multimodal pre-training models, such as LXMERT, have achieved excellent results in downstream tasks. However, current pre-trained models require large amounts of training data and have huge model sizes, which make them impossible to apply in low-resource situations. How to obtain similar or even better performance than a larger model under the premise of less pre-training data and smaller model size has become an important problem. In this paper, we propose a new Multi-stage Pre-training (MSP) method, which uses information at different granularities from word, phrase to sentence in both texts and images to pre-train a model in stages. We also design several different pre-training tasks suitable for the information granularity in different stage in order to efficiently capture the diverse knowledge from a limited corpus. We take a Simplified LXMERT (LXMERT-S) which is with 45.9% parameters of the original LXMERT model and only 11.44% of the original pre-training data as the testbed of our MSP method. Experimental results show that our method achieves comparable performance to the original LXMERT model in all downstream tasks, and even outperforms the original model in Image-Text Retrieval task.

pdf bib
Dual Graph Convolutional Networks for Aspect-based Sentiment Analysis
Ruifan Li | Hao Chen | Fangxiang Feng | Zhanyu Ma | Xiaojie Wang | Eduard Hovy
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Aspect-based sentiment analysis is a fine-grained sentiment classification task. Recently, graph neural networks over dependency trees have been explored to explicitly model connections between aspects and opinion words. However, the improvement is limited due to the inaccuracy of the dependency parsing results and the informal expressions and complexity of online reviews. To overcome these challenges, in this paper, we propose a dual graph convolutional networks (DualGCN) model that considers the complementarity of syntax structures and semantic correlations simultaneously. Particularly, to alleviate dependency parsing errors, we design a SynGCN module with rich syntactic knowledge. To capture semantic correlations, we design a SemGCN module with self-attention mechanism. Furthermore, we propose orthogonal and differential regularizers to capture semantic correlations between words precisely by constraining attention scores in the SemGCN module. The orthogonal regularizer encourages the SemGCN to learn semantically correlated words with less overlap for each word. The differential regularizer encourages the SemGCN to learn semantic features that the SynGCN fails to capture. Experimental results on three public datasets show that our DualGCN model outperforms state-of-the-art methods and verify the effectiveness of our model.