BERT Has More to Offer: BERT Layers Combination Yields Better Sentence Embeddings

MohammadSaleh Hosseini, Munawara Munia, Latifur Khan


Abstract
Obtaining sentence representations from BERT-based models as feature extractors is valuable: pre-computing a one-time representation of the data and then reusing it for downstream tasks takes much less time than fine-tuning the whole BERT. Most previous works acquire a sentence’s representation by passing it to BERT and averaging its last layer. In this paper, we propose that combining certain layers of a BERT-based model, chosen depending on the data set and model, can achieve substantially better results. We empirically show the effectiveness of our method for different BERT-based models on different tasks and data sets. Specifically, on seven standard semantic textual similarity data sets, we outperform the baseline BERT, improving Spearman’s correlation by up to 25.75% and on average 16.32%, without any further training. We also achieve state-of-the-art results on eight transfer data sets, reducing the relative error by up to 37.41% and on average 17.92%.
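The layer-combination idea described in the abstract can be prototyped with Hugging Face Transformers. The sketch below is not the authors' released code: the particular layers in `LAYERS_TO_COMBINE`, the `bert-base-uncased` checkpoint, and the mean-pooling step are illustrative assumptions, since the paper selects which layers to combine per model and data set.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Hypothetical layer choice; the paper picks layers per data set and model.
LAYERS_TO_COMBINE = [1, 2, 12]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def sentence_embedding(sentence: str) -> torch.Tensor:
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # hidden_states is a tuple of (num_layers + 1) tensors of shape
    # (batch, seq_len, hidden); index 0 is the embedding-layer output.
    hidden_states = outputs.hidden_states
    # Average the selected layers, then mean-pool over tokens, masking
    # padding positions with the attention mask.
    selected = torch.stack([hidden_states[i] for i in LAYERS_TO_COMBINE]).mean(dim=0)
    mask = inputs["attention_mask"].unsqueeze(-1)  # (batch, seq_len, 1)
    pooled = (selected * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.squeeze(0)

emb = sentence_embedding("BERT layers can be combined into a sentence embedding.")
print(emb.shape)  # torch.Size([768]) for bert-base-uncased
```

Averaging the last layer alone, the baseline the paper compares against, corresponds to `LAYERS_TO_COMBINE = [12]` in this sketch.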
Anthology ID:
2023.findings-emnlp.1030
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
15419–15431
URL:
https://aclanthology.org/2023.findings-emnlp.1030
DOI:
10.18653/v1/2023.findings-emnlp.1030
Cite (ACL):
MohammadSaleh Hosseini, Munawara Munia, and Latifur Khan. 2023. BERT Has More to Offer: BERT Layers Combination Yields Better Sentence Embeddings. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 15419–15431, Singapore. Association for Computational Linguistics.
Cite (Informal):
BERT Has More to Offer: BERT Layers Combination Yields Better Sentence Embeddings (Hosseini et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.1030.pdf