Probing Biomedical Embeddings from Language Models

Qiao Jin, Bhuwan Dhingra, William Cohen, Xinghua Lu


Abstract
Contextualized word embeddings derived from pre-trained language models (LMs) show significant improvements on downstream NLP tasks. Pre-training on domain-specific corpora, such as biomedical articles, further improves their performance. In this paper, we conduct probing experiments to determine what additional information is carried intrinsically by the in-domain trained contextualized embeddings. For this we use the pre-trained LMs as fixed feature extractors and restrict the downstream task models to not have additional sequence modeling layers. We compare BERT (Devlin et al., 2018), ELMo (Peters et al., 2018), BioBERT (Lee et al., 2019) and BioELMo, a biomedical version of ELMo trained on 10M PubMed abstracts. Surprisingly, while fine-tuned BioBERT is better than BioELMo in biomedical NER and NLI tasks, as a fixed feature extractor BioELMo outperforms BioBERT in our probing tasks. We use visualization and nearest neighbor analysis to show that better encoding of entity-type and relational information leads to this superiority.
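The probing setup described in the abstract (a frozen feature extractor followed by a task model with no extra sequence modeling layers) can be illustrated with a minimal sketch. This is not the authors' code: the random vectors below are a hypothetical stand-in for embeddings that would come from a frozen LM such as BioELMo or BioBERT, and the probe is assumed to be a plain softmax classifier trained by gradient descent. The names `train_linear_probe` and `predict` are illustrative only.

```python
import numpy as np

def train_linear_probe(features, labels, num_classes, lr=0.1, epochs=300):
    """Fit a softmax classifier on fixed features; only W and b are learned."""
    n, d = features.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # d(cross-entropy)/d(logits)
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(features, W, b):
    """Label each token independently; no sequence layer sits on top."""
    return np.argmax(features @ W + b, axis=1)

# Hypothetical stand-in for frozen contextual embeddings: one vector per
# token, drawn around a class-specific center (assumption, not real data).
rng = np.random.default_rng(0)
num_tokens, dim, num_classes = 300, 16, 3
labels = rng.integers(0, num_classes, size=num_tokens)
centers = rng.normal(size=(num_classes, dim)) * 3.0
embeddings = centers[labels] + rng.normal(size=(num_tokens, dim))

# Standardize the fixed features; the "LM" itself is never updated.
embeddings = (embeddings - embeddings.mean(axis=0)) / embeddings.std(axis=0)

W, b = train_linear_probe(embeddings, labels, num_classes)
accuracy = float((predict(embeddings, W, b) == labels).mean())
print(f"probe training accuracy: {accuracy:.2f}")
```

Because the probe has so little capacity of its own, its accuracy mostly reflects what the fixed embeddings already encode, which is the rationale for comparing extractors this way.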
Anthology ID:
W19-2011
Volume:
Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP
Month:
June
Year:
2019
Address:
Minneapolis, USA
Venues:
NAACL | RepEval | WS
Publisher:
Association for Computational Linguistics
Pages:
82–89
URL:
https://aclanthology.org/W19-2011
DOI:
10.18653/v1/W19-2011
Cite (ACL):
Qiao Jin, Bhuwan Dhingra, William Cohen, and Xinghua Lu. 2019. Probing Biomedical Embeddings from Language Models. In Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP, pages 82–89, Minneapolis, USA. Association for Computational Linguistics.
Cite (Informal):
Probing Biomedical Embeddings from Language Models (Jin et al., 2019)
PDF:
https://aclanthology.org/W19-2011.pdf
Code:
Andy-jqa/bioelmo
Data:
CoNLL-2003 | SNLI