Xiangji Huang

Also published as: Jimmy Xiangji Huang


2022

Domain Adaptation with Pre-trained Transformers for Query-Focused Abstractive Text Summarization
Md Tahmid Rahman Laskar | Enamul Hoque | Jimmy Xiangji Huang
Computational Linguistics, Volume 48, Issue 2 - June 2022

The Query-Focused Text Summarization (QFTS) task aims at building systems that generate a summary of the text document(s) based on a given query. A key challenge in addressing this task is the lack of large labeled datasets for training the summarization model. In this article, we address this challenge by exploring a series of domain adaptation techniques. Given the recent success of pre-trained transformer models in a wide range of natural language processing tasks, we utilize such models to generate abstractive summaries for the QFTS task in both single-document and multi-document scenarios. For domain adaptation, we apply a variety of techniques using pre-trained transformer-based summarization models, including transfer learning, weakly supervised learning, and distant supervision. Extensive experiments on six datasets show that our proposed approach is very effective in generating abstractive summaries for the QFTS task while setting a new state-of-the-art result on several datasets across a set of automatic and human evaluation metrics.

2020

Contextualized Embeddings based Transformer Encoder for Sentence Similarity Modeling in Answer Selection Task
Md Tahmid Rahman Laskar | Jimmy Xiangji Huang | Enamul Hoque
Proceedings of the Twelfth Language Resources and Evaluation Conference

Word embeddings that consider context have attracted great attention for various natural language processing tasks in recent years. In this paper, we utilize contextualized word embeddings with the transformer encoder for sentence similarity modeling in the answer selection task. We present two different approaches (feature-based and fine-tuning-based) for answer selection. In the feature-based approach, we utilize two types of contextualized embeddings, namely the Embeddings from Language Models (ELMo) and the Bidirectional Encoder Representations from Transformers (BERT), and integrate each of them with the transformer encoder. We find that integrating these contextual embeddings with the transformer encoder effectively improves the performance of sentence similarity modeling. In the second approach, we fine-tune two pre-trained transformer encoder models for the answer selection task. Based on our experiments on six datasets, we find that the fine-tuning approach outperforms the feature-based approach on all of them. Among our fine-tuning-based models, the Robustly Optimized BERT Pretraining Approach (RoBERTa) model achieves new state-of-the-art performance on five datasets.

ReInceptionE: Relation-Aware Inception Network with Joint Local-Global Structural Information for Knowledge Graph Embedding
Zhiwen Xie | Guangyou Zhou | Jin Liu | Jimmy Xiangji Huang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The goal of knowledge graph embedding (KGE) is to learn low-dimensional vector representations of entities and relations based on the observed triples. Conventional shallow models are limited in their expressiveness. ConvE (Dettmers et al., 2018) takes advantage of CNNs and improves expressive power with parameter-efficient operators by increasing the interactions between head and relation embeddings. However, there is no structural information in the embedding space of ConvE, and the performance is still limited by the number of interactions. The recent KBGAT (Nathani et al., 2019) provides another way to learn embeddings by adaptively utilizing structural information. In this paper, we combine the benefits of ConvE and KBGAT and propose a Relation-aware Inception network with joint local-global structural information for knowledge graph Embedding (ReInceptionE). Specifically, we first explore the Inception network to learn the query embedding, which aims to further increase the interactions between head and relation embeddings. Then, we propose to use a relation-aware attention mechanism to enrich the query embedding with the local neighborhood and global entity information. Experimental results on both the WN18RR and FB15k-237 datasets demonstrate that ReInceptionE achieves competitive performance compared with state-of-the-art methods.

WSL-DS: Weakly Supervised Learning with Distant Supervision for Query Focused Multi-Document Abstractive Summarization
Md Tahmid Rahman Laskar | Enamul Hoque | Jimmy Xiangji Huang
Proceedings of the 28th International Conference on Computational Linguistics

In the Query Focused Multi-Document Summarization (QF-MDS) task, a set of documents and a query are given, and the goal is to generate a summary from these documents based on the given query. However, one major challenge for this task is the lack of labeled training datasets. To overcome this issue, in this paper, we propose a novel weakly supervised learning approach that utilizes distant supervision. In particular, we use datasets similar to the target dataset as the training data, where we leverage pre-trained sentence similarity models to generate the weak reference summary of each individual document in a document set from the multi-document gold reference summaries. Then, we iteratively train our summarization model on each single document to alleviate the computational complexity issue that occurs while training neural summarization models on multiple documents (i.e., long sequences) at once. Experimental results on the Document Understanding Conferences (DUC) datasets show that our proposed approach sets a new state-of-the-art result in terms of various evaluation metrics.
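
The weak-reference step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the selection rule, so the sketch assumes that each document keeps the `k` sentences most similar to any sentence of the gold summary, and it swaps in a simple Jaccard token overlap where the paper uses a pre-trained sentence similarity model. The names `weak_reference_summary`, `jaccard`, and `k` are hypothetical, as is the toy data.

```python
import re


def tokens(sentence):
    """Lowercased alphanumeric tokens of a sentence, as a set."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Token-overlap similarity; a stand-in (assumption) for a pre-trained
    sentence similarity model."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def weak_reference_summary(doc_sentences, gold_sentences, k=2, similarity=jaccard):
    """Pick the k document sentences closest to the gold summary.

    Assumed rule: a sentence's score is its maximum similarity to any
    gold summary sentence. Selected sentences keep document order.
    """
    scored = []
    for idx, sent in enumerate(doc_sentences):
        best = max((similarity(sent, g) for g in gold_sentences), default=0.0)
        scored.append((best, idx))
    # Highest score first; ties broken by earlier position in the document.
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    return [doc_sentences[i] for _, i in sorted(top, key=lambda p: p[1])]


# Toy data (hypothetical): one document from a set and the set's gold summary.
document = [
    "The city council approved a new budget on Monday.",
    "Local weather was sunny all week.",
    "The budget increases funding for public schools.",
]
gold_summary = [
    "The council approved a budget that increases school funding.",
]

weak_summary = weak_reference_summary(document, gold_summary, k=2)
print(weak_summary)
```

Under these assumptions, each document in a set receives its own weak reference, so a summarization model can be trained on single documents rather than on the concatenated set.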

A Contextual Alignment Enhanced Cross Graph Attention Network for Cross-lingual Entity Alignment
Zhiwen Xie | Runjie Zhu | Kunsong Zhao | Jin Liu | Guangyou Zhou | Jimmy Xiangji Huang
Proceedings of the 28th International Conference on Computational Linguistics

Cross-lingual entity alignment, which aims to match equivalent entities in knowledge graphs (KGs) in different languages, has attracted considerable attention in recent years. Recently, many graph neural network (GNN) based methods have been proposed for entity alignment and have obtained promising results. However, existing GNN-based methods consider the two KGs independently and learn embeddings for different KGs separately, which ignores the useful pre-aligned links between the two KGs. In this paper, we propose a novel Contextual Alignment Enhanced Cross Graph Attention Network (CAECGAT) for the task of cross-lingual entity alignment, which is able to jointly learn the embeddings in different KGs by propagating cross-KG information through pre-aligned seed alignments. We conduct extensive experiments on three benchmark cross-lingual entity alignment datasets. The experimental results demonstrate that our proposed method obtains remarkable performance gains compared to state-of-the-art methods.

2016

Bi-Transferring Deep Neural Networks for Domain Adaptation
Guangyou Zhou | Zhiwen Xie | Jimmy Xiangji Huang | Tingting He
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2003

Text Classification in Asian Languages without Word Segmentation
Fuchun Peng | Xiangji Huang | Dale Schuurmans | Shaojun Wang
Proceedings of the Sixth International Workshop on Information Retrieval with Asian Languages

2002

Investigating the Relationship between Word Segmentation Performance and Retrieval Performance in Chinese IR
Fuchun Peng | Xiangji Huang | Dale Schuurmans | Nick Cercone
COLING 2002: The 19th International Conference on Computational Linguistics