Transfer Learning in Natural Language Processing

Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, Thomas Wolf


Abstract
The classic supervised machine learning paradigm is based on learning, in isolation, a single predictive model for a task using a single dataset. This approach requires a large number of training examples and performs best for well-defined and narrow tasks. Transfer learning refers to a set of methods that extend this approach by leveraging data from additional domains or tasks to train a model with better generalization properties. Over the last two years, the field of Natural Language Processing (NLP) has witnessed the emergence of several transfer learning methods and architectures which significantly improved upon the state of the art on a wide range of NLP tasks. These improvements, together with the wide availability and ease of integration of these methods, are reminiscent of the factors that led to the success of pretrained word embeddings and ImageNet pretraining in computer vision, and indicate that these methods will likely become a common tool in the NLP landscape as well as an important research direction. We will present an overview of modern transfer learning methods in NLP: how models are pretrained, what information the representations they learn capture, and how these models can be integrated and adapted in downstream NLP tasks, with examples and case studies.
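
To make the pretrain-then-adapt workflow described in the abstract concrete, below is a minimal sketch of fine-tuning a pretrained model on a small downstream classification task. It is not taken from the tutorial itself; it assumes the Hugging Face transformers library and PyTorch, and the model name, toy data, and hyperparameters are illustrative choices.

# Minimal sketch of the pretrain-then-adapt workflow; model name,
# toy data, and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# 1. Load a model pretrained on a large unlabeled corpus.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# 2. Adapt (fine-tune) it on a small labeled downstream dataset.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
texts, labels = ["a great movie", "a dull movie"], torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

model.train()
for _ in range(3):  # a few passes with a small learning rate usually suffice
    optimizer.zero_grad()
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()

Because the model starts from pretrained weights rather than a random initialization, the downstream task can typically reach good performance with far fewer labeled examples than training from scratch would require.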
Anthology ID:
N19-5004
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Anoop Sarkar, Michael Strube
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
15–18
URL:
https://aclanthology.org/N19-5004
DOI:
10.18653/v1/N19-5004
Cite (ACL):
Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, and Thomas Wolf. 2019. Transfer Learning in Natural Language Processing. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials, pages 15–18, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Transfer Learning in Natural Language Processing (Ruder et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-5004.pdf
Presentation:
 N19-5004.Presentation.pdf
Video:
 https://vimeo.com/359399507