Jianbin Qin


2025

pdf bib
MQA-KEAL: Multi-hop Question Answering under Knowledge Editing for Arabic Language
Muhammad Asif Ali | Nawal Daftardar | Mutayyba Waheed | Jianbin Qin | Di Wang
Proceedings of the 31st International Conference on Computational Linguistics

Large Language Models (LLMs) have demonstrated significant capabilities across numerous application domains. A key challenge is to keep these models updated with latest available information, which limits the true potential of these models for the end-applications. Although, there have been numerous attempts for LLMs’ Knowledge Editing (KE), i.e., to update and/or edit the LLMs’ prior knowledge and in turn test it via Multi-hop Question Answering (MQA), yet so far these studies are primarily focused and/or developed for English language. To bridge this gap, in this paper we propose: Multi-hop Questioning Answering under Knowledge Editing for Arabic Language (MQA-KEAL). MQA-KEAL stores knowledge edits as structured knowledge units in the external memory. In order to solve multi-hop question, it first uses task-decomposition to decompose the question into smaller sub-problems. Later for each sub-problem, it iteratively queries the external memory and/or target LLM in order to generate the final response. In addition, we also contribute MQUAKE-AR (Arabic translation of English benchmark MQUAKE), as well as a new benchmark MQA-AEVAL for rigorous performance evaluation of MQA under KE for Arabic language. Experimentation evaluation reveals MQA-KEAL outperforms the baseline models by a significant margin. We release the codes for MQA-KEAL at https: //github.com/asif6827/MQA-Keal.

2024

pdf bib
Antonym vs Synonym Distinction using InterlaCed Encoder NETworks (ICE-NET)
Muhammad Ali | Yan Hu | Jianbin Qin | Di Wang
Findings of the Association for Computational Linguistics: EACL 2024

Antonyms vs synonyms distinction is a core challenge in lexico-semantic analysis and automated lexical resource construction. These pairs share a similar distributional context which makes it harder to distinguish them. Leading research in this regard attempts to capture the properties of the relation pairs, i.e., symmetry, transitivity, and trans-transitivity. However, the inability of existing research to appropriately model the relation-specific properties limits their end performance. In this paper, we propose InterlaCed Encoder NETworks (i.e., ICE-NET) for antonym vs synonym distinction, that aim to capture and model the relation-specific properties of the antonyms and synonyms pairs in order to perform the classification task in a performance-enhanced manner. Experimental evaluation using the benchmark datasets shows that ICE-NET outperforms the existing research by a relative score of upto 1.8% in F1-measure.

2023

pdf bib
GARI: Graph Attention for Relative Isomorphism of Arabic Word Embeddings
Muhammad Ali | Maha Alshmrani | Jianbin Qin | Yan Hu | Di Wang
Proceedings of ArabicNLP 2023

Bilingual Lexical Induction (BLI) is a core challenge in NLP, it relies on the relative isomorphism of individual embedding spaces. Existing attempts aimed at controlling the relative isomorphism of different embedding spaces fail to incorporate the impact of semantically related words in the model training objective. To address this, we propose GARI that combines the distributional training objectives with multiple isomorphism losses guided by the graph attention network. GARI considers the impact of semantical variations of words in order to define the relative isomorphism of the embedding spaces. Experimental evaluation using the Arabic language data set shows that GARI outperforms the existing research by improving the average P@1 by a relative score of up to 40.95% and 76.80% for in-domain and domain mismatch settings respectively.

pdf bib
GRI: Graph-based Relative Isomorphism of Word Embedding Spaces
Muhammad Ali | Yan Hu | Jianbin Qin | Di Wang
Findings of the Association for Computational Linguistics: EMNLP 2023

Automated construction of bi-lingual dictionaries using monolingual embedding spaces is a core challenge in machine translation. The end performance of these dictionaries relies on the geometric similarity of individual spaces, i.e., their degree of isomorphism. Existing attempts aimed at controlling the relative isomorphism of different spaces fail to incorporate the impact of lexically different but semantically related words in the training objective. To address this, we propose GRI that combines the distributional training objectives with attentive graph convolutions to unanimously consider the impact of lexical variations of semantically similar words required to define/compute the relative isomorphism of multiple spaces. Exper imental evaluation shows that GRI outperforms the existing research by improving the average P@1 by a relative score of upto 63.6%.