Pooled Contextualized Embeddings for Named Entity Recognition

Alan Akbik; Tanja Bergmann; Roland Vollgraf

doi:10.18653/v1/N19-1078

Abstract

Contextual string embeddings are a recent type of contextualized word embedding that were shown to yield state-of-the-art results when utilized in a range of sequence labeling tasks. They are based on character-level language models which treat text as distributions over characters and are capable of generating embeddings for any string of characters within any textual context. However, such purely character-based approaches struggle to produce meaningful embeddings if a rare string is used in an underspecified context. To address this drawback, we propose a method in which we dynamically aggregate contextualized embeddings of each unique string that we encounter. We then use a pooling operation to distill a "global" word representation from all contextualized instances. We evaluate these "pooled contextualized embeddings" on common named entity recognition (NER) tasks such as CoNLL-03 and WNUT and show that our approach significantly improves the state-of-the-art for NER. We make all code and pre-trained models available to the research community for use and reproduction.
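
The aggregation and pooling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the names `PooledEmbeddingMemory`, `mean_pool`, and `max_pool` are hypothetical, the two-dimensional vectors are toy values standing in for real contextual string embeddings, and mean pooling is assumed as the default only because the abstract does not name a specific pooling operation.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean over all stored embeddings of one string."""
    n = len(vectors)
    return [sum(dim) / n for dim in zip(*vectors)]


def max_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise maximum over all stored embeddings of one string."""
    return [max(dim) for dim in zip(*vectors)]


class PooledEmbeddingMemory:
    """Hypothetical helper: keeps every contextual embedding seen so far
    for each unique string and pools them into one "global" vector."""

    def __init__(self, pool: Callable[[Sequence[Vector]], Vector] = mean_pool):
        self.pool = pool
        self.memory: Dict[str, List[Vector]] = defaultdict(list)

    def add(self, word: str, contextual_embedding: Vector) -> Vector:
        # Store this occurrence, then pool over all occurrences so far.
        self.memory[word].append(list(contextual_embedding))
        return self.pool(self.memory[word])


# Toy usage: the same rare string is seen in two different contexts.
memory = PooledEmbeddingMemory()
memory.add("Indra", [1.0, 0.0])           # first occurrence
pooled = memory.add("Indra", [3.0, 2.0])  # second occurrence
print(pooled)
```

Because the memory grows as more text is processed, the pooled vector for a string can change over time, which is what lets an occurrence in an underspecified context borrow information from earlier, more informative contexts.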

Anthology ID: N19-1078
Volume: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month: June
Year: 2019
Address: Minneapolis, Minnesota
Editors: Jill Burstein, Christy Doran, Thamar Solorio
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 724–728
URL: https://aclanthology.org/N19-1078/
DOI: 10.18653/v1/N19-1078
Cite (ACL): Alan Akbik, Tanja Bergmann, and Roland Vollgraf. 2019. Pooled Contextualized Embeddings for Named Entity Recognition. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 724–728, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal): Pooled Contextualized Embeddings for Named Entity Recognition (Akbik et al., NAACL 2019)
PDF: https://aclanthology.org/N19-1078.pdf
Data: WNUT 2017
