Yun Xu


2021

GraphMR: Graph Neural Network for Mathematical Reasoning
Weijie Feng | Binbin Liu | Dongpeng Xu | Qilong Zheng | Yun Xu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Mathematical reasoning aims to infer satisfiable solutions from given mathematics questions. Previous natural language processing research has demonstrated the effectiveness of sequence-to-sequence (Seq2Seq) models and related variants for solving mathematics problems. However, few works have explored the structural or syntactic information hidden in expressions (e.g., precedence and associativity). This dissertation set out to investigate the usefulness of such untapped information for neural architectures. First, mathematical questions are represented as graphs by means of syntax analysis. The structured nature of graphs allows them to represent relations among variables and operators while preserving the semantics of the expressions. Building on these representations, we propose GraphMR, a graph-to-sequence neural network that can effectively learn the hierarchical information of graph inputs to solve mathematics problems and predict answers. A complete experimental scenario with four classes of mathematical tasks and three Seq2Seq baselines is built for a comprehensive analysis, and the results show that GraphMR outperforms the baselines in learning hidden information and in solving mathematics problems.
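
To make the graph representation concrete, the following is a minimal sketch of how an expression might be turned into nodes and edges through syntax analysis. It is not the GraphMR construction, which the abstract does not specify; it assumes Python's standard `ast` parser as the syntax analyzer, and the function name `expression_to_graph` is hypothetical.

```python
import ast


def expression_to_graph(expr: str):
    """Parse an arithmetic expression and return (nodes, edges).

    nodes is a list of labels (operator names, variable names, constants);
    edges is a list of (parent_index, child_index) pairs.
    Illustrative only: this is an assumption, not the paper's method.
    """
    tree = ast.parse(expr, mode="eval").body
    nodes, edges = [], []

    def visit(node):
        idx = len(nodes)  # index this node will occupy in `nodes`
        if isinstance(node, ast.BinOp):
            nodes.append(type(node.op).__name__)  # e.g. "Add", "Mult"
            for child in (node.left, node.right):
                edges.append((idx, visit(child)))
        elif isinstance(node, ast.UnaryOp):
            nodes.append(type(node.op).__name__)  # e.g. "USub"
            edges.append((idx, visit(node.operand)))
        elif isinstance(node, ast.Name):
            nodes.append(node.id)
        elif isinstance(node, ast.Constant):
            nodes.append(repr(node.value))
        else:
            raise ValueError(f"unsupported syntax: {type(node).__name__}")
        return idx

    visit(tree)
    return nodes, edges


# Precedence is encoded by the graph shape: "Mult" sits below "Add".
nodes, edges = expression_to_graph("a + b * c")
print(nodes)
print(sorted(edges))
```

In this sketch, precedence and associativity no longer need to be inferred from token order, because the parser has already resolved them into parent-child relations.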

2020

NeuReduce: Reducing Mixed Boolean-Arithmetic Expressions by Recurrent Neural Network
Weijie Feng | Binbin Liu | Dongpeng Xu | Qilong Zheng | Yun Xu
Findings of the Association for Computational Linguistics: EMNLP 2020

Mixed Boolean-Arithmetic (MBA) expressions involve both arithmetic calculation (e.g., plus, minus, multiply) and bitwise computation (e.g., and, or, negate, xor). MBA expressions have been widely applied in software obfuscation, transforming programs from a simple form to a complex form. MBA expressions are challenging to simplify because the interleaving of bitwise and arithmetic operations renders mathematical reduction laws ineffective. Our goal is to recover the original, simple form from an obfuscated MBA expression. In this paper, we first propose NeuReduce, a string-to-string method based on neural networks that automatically learns to reduce complex MBA expressions. We develop a comprehensive MBA dataset, including one million diversified MBA expression samples and their corresponding simplified forms. After training on the dataset, NeuReduce can reduce MBA rules to simpler but mathematically equivalent forms. In a comparison with three state-of-the-art MBA reduction methods, our evaluation shows that NeuReduce outperforms all other tools in terms of accuracy, solving time, and performance overhead.
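
As an illustration of what "mathematically equivalent" means for an MBA expression, the sketch below checks the well-known identity x + y = (x ^ y) + 2 * (x & y) exhaustively over 8-bit inputs. It is not part of NeuReduce; the helper name `equivalent` and the 8-bit width are assumptions chosen to keep the check small.

```python
from itertools import product


def equivalent(f, g, bits=8):
    """Return True if f and g agree on all pairs of `bits`-wide inputs.

    Results are compared modulo 2**bits, mimicking fixed-width machine
    arithmetic. Exhaustive checking is only feasible for small widths.
    """
    mask = (1 << bits) - 1
    return all(
        (f(x, y) & mask) == (g(x, y) & mask)
        for x, y in product(range(1 << bits), repeat=2)
    )


def obfuscated(x, y):
    # MBA form: mixes bitwise (xor, and) with arithmetic (+, *).
    return (x ^ y) + 2 * (x & y)


def simple(x, y):
    # The plain arithmetic form that the MBA expression hides.
    return x + y


print(equivalent(obfuscated, simple))              # the two forms agree
print(equivalent(lambda x, y: x ^ y, simple))      # xor alone is not addition
```

A check like this confirms that a proposed simplification is valid, but it does not find the simpler form; that search is the task the abstract assigns to the learned model.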