Xiangliang Zhang


2022

pdf bib
MWP-BERT: Numeracy-Augmented Pre-training for Math Word Problem Solving
Zhenwen Liang | Jipeng Zhang | Lei Wang | Wei Qin | Yunshi Lan | Jie Shao | Xiangliang Zhang
Findings of the Association for Computational Linguistics: NAACL 2022

Math word problem (MWP) solving faces a dilemma in number representation learning. In order to avoid the number representation issue and reduce the search space of feasible solutions, existing works striving for MWP solving usually replace real numbers with symbolic placeholders to focus on logic reasoning. However, different from common symbolic reasoning tasks like program synthesis and knowledge graph reasoning, MWP solving has extra requirements in numerical reasoning. In other words, instead of the number value itself, it is the reusable numerical property that matters more in numerical reasoning. Therefore, we argue that injecting numerical properties into symbolic placeholders with contextualized representation learning schema can provide a way out of the dilemma in the number representation issue here. In this work, we introduce this idea to the popular pre-training language model (PLM) techniques and build MWP-BERT, an effective contextual number representation PLM. We demonstrate the effectiveness of our MWP-BERT on MWP solving and several MWP-specific understanding tasks on both English and Chinese benchmarks.

pdf bib
ArMATH: a Dataset for Solving Arabic Math Word Problems
Reem Alghamdi | Zhenwen Liang | Xiangliang Zhang
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper studies solving Arabic Math Word Problems by deep learning. A Math Word Problem (MWP) is a text description of a mathematical problem that can be solved by deriving a math equation to reach the answer. Effective models have been developed for solving MWPs in English and Chinese. However, Arabic MWPs are rarely studied. This paper contributes the first large-scale dataset for Arabic MWPs, which contains 6,000 samples of primary-school math problems, written in Modern Standard Arabic (MSA). Arabic MWP solvers are then built with deep learning models and evaluated on this dataset. In addition, a transfer learning model is built to let the high-resource Chinese MWP solver promote the performance of the low-resource Arabic MWP solver. This work is the first to use deep learning methods to solve Arabic MWP and the first to use transfer learning to solve MWP across different languages. The transfer learning enhanced solver has an accuracy of 74.15%, which is 3% higher than the solver without using transfer learning. We make the dataset and solvers available in public for encouraging more research of Arabic MWPs: https://github.com/reem-codes/ArMATH

pdf bib
Unsupervised Mitigating Gender Bias by Character Components: A Case Study of Chinese Word Embedding
Xiuying Chen | Mingzhe Li | Rui Yan | Xin Gao | Xiangliang Zhang
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Word embeddings learned from massive text collections have demonstrated significant levels of discriminative biases.However, debias on the Chinese language, one of the most spoken languages, has been less explored.Meanwhile, existing literature relies on manually created supplementary data, which is time- and energy-consuming.In this work, we propose the first Chinese Gender-neutral word Embedding model (CGE) based on Word2vec, which learns gender-neutral word embeddings without any labeled data.Concretely, CGE utilizes and emphasizes the rich feminine and masculine information contained in radicals, i.e., a kind of component in Chinese characters, during the training procedure.This consequently alleviates discriminative gender biases.Experimental results on public benchmark datasets show that our unsupervised method outperforms the state-of-the-art supervised debiased word embedding models without sacrificing the functionality of the embedding model.

2021

pdf bib
Data-Efficient Language Shaped Few-shot Image Classification
Zhenwen Liang | Xiangliang Zhang
Findings of the Association for Computational Linguistics: EMNLP 2021

Many existing works have demonstrated that language is a helpful guider for image understanding by neural networks. We focus on a language-shaped learning problem in a few-shot setting, i.e., using language to improve few-shot image classification when language descriptions are only available during training. We propose a data-efficient method that can make the best usage of the few-shot images and the language available only in training. Experimental results on dataset ShapeWorld and Birds show that our method outperforms other state-of-the-art baselines in language-shaped few-shot learning area, especially when training data is more severely limited. Therefore, we call our approach data-efficient language-shaped learning (DF-LSL).

pdf bib
Capturing Relations between Scientific Papers: An Abstractive Model for Related Work Section Generation
Xiuying Chen | Hind Alamro | Mingzhe Li | Shen Gao | Xiangliang Zhang | Dongyan Zhao | Rui Yan
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Given a set of related publications, related work section generation aims to provide researchers with an overview of the specific research area by summarizing these works and introducing them in a logical order. Most of existing related work generation models follow the inflexible extractive style, which directly extract sentences from multiple original papers to form a related work discussion. Hence, in this paper, we propose a Relation-aware Related work Generator (RRG), which generates an abstractive related work from the given multiple scientific papers in the same research area. Concretely, we propose a relation-aware multi-document encoder that relates one document to another according to their content dependency in a relation graph. The relation graph and the document representation are interacted and polished iteratively, complementing each other in the training process. We also contribute two public datasets composed of related work sections and their corresponding papers. Extensive experiments on the two datasets show that the proposed model brings substantial improvements over several strong baselines. We hope that this work will promote advances in related work generation task.