Pretrained Transformers for Text Ranking: BERT and Beyond

Andrew Yates, Rodrigo Nogueira, Jimmy Lin


Abstract
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query for a particular task. Although the most common formulation of text ranking is search, instances of the task can also be found in many text processing applications. This tutorial provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example. These models produce high-quality results across many domains, tasks, and settings. This tutorial, which is based on the preprint of a forthcoming book to be published by Morgan & Claypool under the Synthesis Lectures on Human Language Technologies series, provides an overview of existing work as a single point of entry for practitioners who wish to deploy transformers for text ranking in real-world applications and for researchers who wish to pursue work in this area. We cover a wide range of techniques, grouped into two categories: transformer models that perform reranking in multi-stage ranking architectures and learned dense representations that perform ranking directly.
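The tutorial's first category, reranking within a multi-stage architecture, is typically implemented as a cross-encoder that scores each (query, candidate) pair jointly. Below is a minimal sketch of that idea, assuming the Hugging Face transformers library and the publicly available cross-encoder/ms-marco-MiniLM-L-6-v2 checkpoint; both are assumptions chosen for illustration, not code from the tutorial itself. In practice such a reranker rescores a candidate list produced by a first-stage retriever such as BM25.

# Minimal cross-encoder reranking sketch (illustrative; model choice is an assumption).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "cross-encoder/ms-marco-MiniLM-L-6-v2"  # assumed pretrained reranker checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

query = "how do transformers rank text?"
candidates = [
    "BERT-based rerankers score each query-document pair jointly.",
    "The recipe calls for two cups of flour and a pinch of salt.",
]

# Tokenize each (query, candidate) pair; the cross-encoder reads query and text together.
inputs = tokenizer([query] * len(candidates), candidates,
                   padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    scores = model(**inputs).logits.squeeze(-1)  # one relevance score per pair

# Sort candidates by descending score to produce the reranked list.
reranked = [doc for _, doc in sorted(zip(scores.tolist(), candidates), reverse=True)]
print(reranked)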
Anthology ID:
2021.naacl-tutorials.1
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials
Month:
June
Year:
2021
Address:
Online
Editors:
Greg Kondrak, Kalina Bontcheva, Dan Gillick
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1–4
URL:
https://aclanthology.org/2021.naacl-tutorials.1
DOI:
10.18653/v1/2021.naacl-tutorials.1
Cite (ACL):
Andrew Yates, Rodrigo Nogueira, and Jimmy Lin. 2021. Pretrained Transformers for Text Ranking: BERT and Beyond. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials, pages 1–4, Online. Association for Computational Linguistics.
Cite (Informal):
Pretrained Transformers for Text Ranking: BERT and Beyond (Yates et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-tutorials.1.pdf
Video:
https://aclanthology.org/2021.naacl-tutorials.1.mp4
Data
ASNQ, BEIR, C4, CORD-19, DL-HARD, MS MARCO, Natural Questions, SNLI, SQuAD, SST, SST-2, TREC-COVID, TriviaQA, WebQuestions