Zhen Hai


2024

pdf bib
StructAM: Enhancing Address Matching through Semantic Understanding of Structure-aware Information
Zhaoqi Zhang | Pasquale Balsebre | Siqiang Luo | Zhen Hai | Jiangping Huang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The task of address matching involves linking unstructured addresses to standard ones in a database. The challenges presented by this task are manifold: misspellings, incomplete information, and variations in address content are some examples. While there have been previous studies on entity matching in natural language processing, for the address matching solution, existing approaches still rely on string-based similarity matching or manually-designed rules. In this paper, we propose StructAM, a novel method based on pre-trained language models (LMs) and graph neural networks to extract the textual and structured information of the addresses. The proposed method leverages the knowledge acquired by large language models during the pre-training phase, and refines it during the fine-tuning process on the address domain, to obtain address-specific semantic features. Meanwhile, it also applies an attribute attention mechanism based on Graph Sampling and Aggregation (GraphSAGE) module to capture internal hierarchy information of the address text. To further enhance the accuracy of our algorithm in dirty settings, we incorporate spatial coordinates and contextual information from the surrounding area as auxiliary guidance. We conduct extensive experiments on real-world datasets from four different countries and the results show that StructAM outperforms state-of-the-art baseline approaches for address matching.

2023

pdf bib
Gradient-Boosted Decision Tree for Listwise Context Model in Multimodal Review Helpfulness Prediction
Thong Nguyen | Xiaobao Wu | Xinshuai Dong | Cong-Duy Nguyen | Zhen Hai | Lidong Bing | Anh Tuan Luu
Findings of the Association for Computational Linguistics: ACL 2023

Multimodal Review Helpfulness Prediction (MRHP) aims to rank product reviews based on predicted helpfulness scores and has been widely applied in e-commerce via presenting customers with useful reviews. Previous studies commonly employ fully-connected neural networks (FCNNs) as the final score predictor and pairwise loss as the training objective. However, FCNNs have been shown to perform inefficient splitting for review features, making the model difficult to clearly differentiate helpful from unhelpful reviews. Furthermore, pairwise objective, which works on review pairs, may not completely capture the MRHP goal to produce the ranking for the entire review list, and possibly induces low generalization during testing. To address these issues, we propose a listwise attention network that clearly captures the MRHP ranking context and a listwise optimization objective that enhances model generalization. We further propose gradient-boosted decision tree as the score predictor to efficaciously partition product reviews’ representations. Extensive experiments demonstrate that our method achieves state-of-the-art results and polished generalization performance on two large-scale MRHP benchmark datasets.

2022

pdf bib
SANCL: Multimodal Review Helpfulness Prediction with Selective Attention and Natural Contrastive Learning
Wei Han | Hui Chen | Zhen Hai | Soujanya Poria | Lidong Bing
Proceedings of the 29th International Conference on Computational Linguistics

With the boom of e-commerce, Multimodal Review Helpfulness Prediction (MRHP) that identifies the helpfulness score of multimodal product reviews has become a research hotspot. Previous work on this task focuses on attention-based modality fusion, information integration, and relation modeling, which primarily exposes the following drawbacks: 1) the model may fail to capture the really essential information due to its indiscriminate attention formulation; 2) lack appropriate modeling methods that takes full advantage of correlation among provided data. In this paper, we propose SANCL: Selective Attention and Natural Contrastive Learning for MRHP. SANCL adopts a probe-based strategy to enforce high attention weights on the regions of greater significance. It also constructs a contrastive learning framework based on natural matching properties in the dataset. Experimental results on two benchmark datasets with three categories show that SANCL achieves state-of-the-art baseline performance with lower memory consumption.

pdf bib
Adaptive Contrastive Learning on Multimodal Transformer for Review Helpfulness Prediction
Thong Nguyen | Xiaobao Wu | Anh Tuan Luu | Zhen Hai | Lidong Bing
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Modern Review Helpfulness Prediction systems are dependent upon multiple modalities, typically texts and images. Unfortunately, those contemporary approaches pay scarce attention to polish representations of cross-modal relations and tend to suffer from inferior optimization. This might cause harm to model’s predictions in numerous cases. To overcome the aforementioned issues, we propose Multi-modal Contrastive Learning for Multimodal Review Helpfulness Prediction (MRHP) problem, concentrating on mutual information between input modalities to explicitly elaborate cross-modal relations. In addition, we introduce Adaptive Weighting scheme for our contrastive learning approach in order to increase flexibility in optimization. Lastly, we propose Multimodal Interaction module to address the unalignment nature of multimodal data, thereby assisting the model in producing more reasonable multimodal representations. Experimental results show that our method outperforms prior baselines and achieves state-of-the-art results on two publicly available benchmark datasets for MRHP problem.

2021

pdf bib
Multi-perspective Coherent Reasoning for Helpfulness Prediction of Multimodal Reviews
Junhao Liu | Zhen Hai | Min Yang | Lidong Bing
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

As more and more product reviews are posted in both text and images, Multimodal Review Analysis (MRA) becomes an attractive research topic. Among the existing review analysis tasks, helpfulness prediction on review text has become predominant due to its importance for e-commerce platforms and online shops, i.e. helping customers quickly acquire useful product information. This paper proposes a new task Multimodal Review Helpfulness Prediction (MRHP) aiming to analyze the review helpfulness from text and visual modalities. Meanwhile, a novel Multi-perspective Coherent Reasoning method (MCR) is proposed to solve the MRHP task, which conducts joint reasoning over texts and images from both the product and the review, and aggregates the signals to predict the review helpfulness. Concretely, we first propose a product-review coherent reasoning module to measure the intra- and inter-modal coherence between the target product and the review. In addition, we also devise an intra-review coherent reasoning module to identify the coherence between the text content and images of the review, which is a piece of strong evidence for review helpfulness prediction. To evaluate the effectiveness of MCR, we present two newly collected multimodal review datasets as benchmark evaluation resources for the MRHP task. Experimental results show that our MCR method can lead to a performance increase of up to 8.5% as compared to the best performing text-only model. The source code and datasets can be obtained from https://github.com/jhliu17/MCR.

pdf bib
Jointly Identifying Rhetoric and Implicit Emotions via Multi-Task Learning
Xin Chen | Zhen Hai | Deyu Li | Suge Wang | Dian Wang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2016

pdf bib
Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data
Zhen Hai | Peilin Zhao | Peng Cheng | Peng Yang | Xiao-Li Li | Guangxia Li
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2010

pdf bib
A Statistical NLP Approach for Feature and Sentiment Identification from Chinese Reviews
Zhen Hai | Kuiyu Chang | Qinbao Song | Jung-jae Kim
CIPS-SIGHAN Joint Conference on Chinese Language Processing