How May I Help You? Using Neural Text Simplification to Improve Downstream NLP Tasks

Hoang Van, Zheng Tang, Mihai Surdeanu


Abstract
The general goal of text simplification (TS) is to reduce text complexity for human consumption. In this paper, we investigate another potential use of neural TS: assisting machines that perform natural language processing (NLP) tasks. We evaluate the use of neural TS in two ways: simplifying input texts at prediction time and augmenting data to provide machines with additional information during training. We demonstrate that the latter scenario improves machine performance on two separate datasets. In particular, this use of TS improves the performance of LSTM (1.82–1.98%) and SpanBERT (0.7–1.3%) extractors on TACRED, a complex, large-scale, real-world relation extraction task. Further, the same setting yields improvements of up to 0.65% matched and 0.62% mismatched accuracy for a BERT text classifier on MNLI, a practical natural language inference dataset.
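As a rough illustration of the data-augmentation setting described in the abstract (this is a sketch, not the authors' released code; see vanh17/textsim for that), the snippet below pairs each training example with a simplified copy produced by a neural TS model and keeps the original label. The `simplify` function is a hypothetical stand-in for any neural TS system.

```python
# Sketch of TS-based training-data augmentation: each training example is
# paired with a simplified copy carrying the same label, and both are used
# for training. `simplify` is a hypothetical placeholder for a neural TS
# model; the paper's actual implementation is at vanh17/textsim.

from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (sentence, label)

def augment_with_simplification(
    train: List[Example],
    simplify: Callable[[str], str],
) -> List[Example]:
    """Return the original examples plus simplified copies with the same labels."""
    augmented: List[Example] = list(train)
    for sentence, label in train:
        simple = simplify(sentence)
        # Skip degenerate simplifications that add no new information.
        if simple and simple != sentence:
            augmented.append((simple, label))
    return augmented

if __name__ == "__main__":
    # Toy stand-in for a neural TS model, for demonstration only.
    toy_simplify = lambda s: s.replace("purchased", "bought")
    data = [("The firm purchased its rival in 2019.", "org:acquired")]
    print(augment_with_simplification(data, toy_simplify))
```

The augmented set is then fed to the downstream extractor or classifier (e.g., the LSTM, SpanBERT, or BERT models evaluated in the paper) in place of the original training set.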
Anthology ID:
2021.findings-emnlp.343
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4074–4080
URL:
https://aclanthology.org/2021.findings-emnlp.343
DOI:
10.18653/v1/2021.findings-emnlp.343
Cite (ACL):
Hoang Van, Zheng Tang, and Mihai Surdeanu. 2021. How May I Help You? Using Neural Text Simplification to Improve Downstream NLP Tasks. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4074–4080, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
How May I Help You? Using Neural Text Simplification to Improve Downstream NLP Tasks (Van et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.343.pdf
Code:
vanh17/textsim
Data:
MultiNLI