NeuReduce: Reducing Mixed Boolean-Arithmetic Expressions by Recurrent Neural Network

Weijie Feng, Binbin Liu, Dongpeng Xu, Qilong Zheng, Yun Xu


Abstract
Mixed Boolean-Arithmetic (MBA) expressions involve both arithmetic calculation (e.g., plus, minus, multiply) and bitwise computation (e.g., and, or, negate, xor). MBA expressions have been widely applied in software obfuscation, transforming programs from a simple form to a complex form. MBA expressions are challenging to simplify because the interleaving of bitwise and arithmetic operations renders mathematical reduction laws ineffective. Our goal is to recover the original, simple form from an obfuscated MBA expression. In this paper, we propose NeuReduce, a string-to-string method based on neural networks that automatically learns to reduce complex MBA expressions. We develop a comprehensive MBA dataset, including one million diversified MBA expression samples and their corresponding simplified forms. After training on the dataset, NeuReduce can reduce MBA rules to simpler but mathematically equivalent forms. Compared against three state-of-the-art MBA reduction methods, our evaluation shows that NeuReduce outperforms all of them in terms of accuracy, solving time, and performance overhead.
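To make the notion of an MBA expression concrete, the sketch below checks two well-known MBA identities for x + y by brute force over 8-bit values. It is an illustration only: the identities are textbook examples and are not taken from the NeuReduce dataset, and the names `simple`, `obfuscated_xor`, `obfuscated_or`, and `equivalent` are hypothetical helpers, not part of any NeuReduce code.

```python
from itertools import product

BITS = 8                 # assumed word size for the exhaustive check
MASK = (1 << BITS) - 1   # keep results inside an 8-bit word


def simple(x, y):
    """The original, simple form: x + y."""
    return (x + y) & MASK


def obfuscated_xor(x, y):
    """MBA form mixing xor, and, multiply, plus: (x ^ y) + 2 * (x & y)."""
    return ((x ^ y) + 2 * (x & y)) & MASK


def obfuscated_or(x, y):
    """Another MBA form: (x | y) + (x & y)."""
    return ((x | y) + (x & y)) & MASK


def equivalent(f, g):
    """Exhaustively compare f and g on every pair of BITS-wide inputs."""
    values = range(1 << BITS)
    return all(f(x, y) == g(x, y) for x, y in product(values, repeat=2))


if __name__ == "__main__":
    print(equivalent(simple, obfuscated_xor))
    print(equivalent(simple, obfuscated_or))
```

Exhaustive checking like this is only practical for small word sizes and few variables, which is one reason recovering the simple form from an obfuscated expression is treated as a reduction problem rather than a search over inputs.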
Anthology ID:
2020.findings-emnlp.56
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
635–644
URL:
https://aclanthology.org/2020.findings-emnlp.56
DOI:
10.18653/v1/2020.findings-emnlp.56
Cite (ACL):
Weijie Feng, Binbin Liu, Dongpeng Xu, Qilong Zheng, and Yun Xu. 2020. NeuReduce: Reducing Mixed Boolean-Arithmetic Expressions by Recurrent Neural Network. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 635–644, Online. Association for Computational Linguistics.
Cite (Informal):
NeuReduce: Reducing Mixed Boolean-Arithmetic Expressions by Recurrent Neural Network (Feng et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.56.pdf