Alexey Tarasov


Towards Reversal-Based Textual Data Augmentation for NLI Problems with Opposable Classes
Alexey Tarasov
Proceedings of the First Workshop on Natural Language Interfaces

Data augmentation methods are commonly used in computer vision and speech. However, in domains dealing with textual data, such techniques are less common. Most existing methods rely on rephrasing, i.e. new sentences are generated by changing a source sentence while preserving its meaning. We argue that in tasks with opposable classes (such as Positive and Negative in sentiment analysis), it might be beneficial to also invert the source sentence, reversing its meaning, to generate examples of the opposing class. Methods that use a somewhat similar intuition exist in the space of adversarial learning, but are not always applicable to text classification (in our experiments, some of them were even detrimental to the resulting classifier accuracy). We propose and evaluate two reversal-based methods on an NLI task of recognising the type of a simple logical expression from its plain-text description. After gathering a dataset on MTurk, we show that a simple heuristic based on negating the main verb has potential not only on its own, but can also boost existing state-of-the-art rephrasing-based approaches.
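
The reversal idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the auxiliary-verb list, the `negate` and `augment` helpers, and the label names are all assumptions made for illustration, and toggling "not" after the first auxiliary is only a crude stand-in for negating the main verb.

```python
# Hypothetical sketch of reversal-based augmentation; not the paper's method.
# Assumption: two opposable classes, labelled "Positive" and "Negative".
OPPOSITE = {"Positive": "Negative", "Negative": "Positive"}

# Assumption: a small hand-picked set of auxiliaries that accept "not".
AUXILIARIES = {"is", "are", "was", "were", "can", "could",
               "will", "would", "should", "must"}


def negate(sentence):
    """Toggle 'not' after the first auxiliary verb.

    Returns the reversed sentence, or None when no auxiliary is found
    (in which case the sentence is simply not augmented).
    """
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok.lower() in AUXILIARIES:
            if i + 1 < len(tokens) and tokens[i + 1].lower() == "not":
                # Already negated: drop the "not" to reverse the meaning.
                return " ".join(tokens[:i + 1] + tokens[i + 2:])
            # Not negated yet: insert "not" right after the auxiliary.
            return " ".join(tokens[:i + 1] + ["not"] + tokens[i + 1:])
    return None


def augment(dataset):
    """Return the original (text, label) pairs plus reversed examples
    carrying the opposing label."""
    augmented = list(dataset)
    for text, label in dataset:
        reversed_text = negate(text)
        if reversed_text is not None and label in OPPOSITE:
            augmented.append((reversed_text, OPPOSITE[label]))
    return augmented


if __name__ == "__main__":
    data = [("The movie is good", "Positive"),
            ("The plot was not convincing", "Negative")]
    for text, label in augment(data):
        print(label, "|", text)
```

A realistic version would more plausibly locate the main verb with a syntactic parser rather than a word list, and would need to handle sentences where inserting "not" does not cleanly reverse the meaning.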