Speeding Up Transformer Training By Using Dataset Subsampling - An Exploratory Analysis

Lovre Torbarina, Velimir Mihelčić, Bruno Šarlija, Lukasz Roguski, Željko Kraljević


Abstract
Transformer-based models have greatly advanced progress in the field of natural language processing, and while they achieve state-of-the-art results on a wide range of tasks, their large parameter counts make them cumbersome. Consequently, even when a pre-trained transformer model is only fine-tuned on a given task, a large dataset may still make it infeasible to complete fine-tuning within a reasonable time. For this reason, we empirically test 8 subsampling methods for reducing dataset size on a text classification task and report the trade-off between metric score and training time. Seven of the 8 methods are simple, while the last one is CRAIG, a method for coreset construction for data-efficient model training. We obtain the best result with the CRAIG method, which yields an average decrease of 0.03 points in F-score on the test set while speeding up training time by 63.93% on average, relative to the score and time obtained using the full dataset. Lastly, we show the trade-off between speed and performance for all sampling methods on three different datasets.
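To make the setting concrete, the sketch below shows one simple way to subsample a text classification dataset before fine-tuning: a class-stratified random subset. This is only an illustrative assumption, not the paper's specific methods (and not the CRAIG coreset construction); the function name, fraction, and data layout are hypothetical.

```python
# Illustrative sketch only: stratified random subsampling of a text
# classification training set prior to fine-tuning. The paper evaluates
# 8 subsampling methods, including CRAIG; this is not one of them verbatim.
import numpy as np

def stratified_subsample(texts, labels, fraction=0.25, seed=0):
    """Return a class-balanced random subset containing roughly `fraction` of the data."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    keep = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)          # indices of this class
        n_keep = max(1, int(round(fraction * idx.size)))
        keep.extend(rng.choice(idx, size=n_keep, replace=False))
    keep = np.sort(np.asarray(keep))
    return [texts[i] for i in keep], labels[keep]

# Example usage: fine-tune the transformer on the subset instead of the full set.
texts = ["good movie", "bad plot", "great acting", "terrible pacing"]
labels = [1, 0, 1, 0]
sub_texts, sub_labels = stratified_subsample(texts, labels, fraction=0.5)
```

The fine-tuning loop itself is unchanged; only the training set passed to it shrinks, which is where the reported speed/metric trade-off comes from.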
Anthology ID:
2021.sustainlp-1.11
Volume:
Proceedings of the Second Workshop on Simple and Efficient Natural Language Processing
Month:
November
Year:
2021
Address:
Virtual
Editors:
Nafise Sadat Moosavi, Iryna Gurevych, Angela Fan, Thomas Wolf, Yufang Hou, Ana Marasović, Sujith Ravi
Venue:
sustainlp
Publisher:
Association for Computational Linguistics
Pages:
86–95
URL:
https://aclanthology.org/2021.sustainlp-1.11
DOI:
10.18653/v1/2021.sustainlp-1.11
Cite (ACL):
Lovre Torbarina, Velimir Mihelčić, Bruno Šarlija, Lukasz Roguski, and Željko Kraljević. 2021. Speeding Up Transformer Training By Using Dataset Subsampling - An Exploratory Analysis. In Proceedings of the Second Workshop on Simple and Efficient Natural Language Processing, pages 86–95, Virtual. Association for Computational Linguistics.
Cite (Informal):
Speeding Up Transformer Training By Using Dataset Subsampling - An Exploratory Analysis (Torbarina et al., sustainlp 2021)
PDF:
https://aclanthology.org/2021.sustainlp-1.11.pdf
Software:
2021.sustainlp-1.11.Software.zip
Video:
https://aclanthology.org/2021.sustainlp-1.11.mp4