Chinese NER Using Lattice LSTM

Yue Zhang, Jie Yang


Abstract
We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
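The core idea — matching every lexicon word against the character sequence and routing those word spans into the LSTM — can be illustrated with a short sketch. The snippet below is a toy illustration in Python, not the authors' released LatticeLSTM code; build_lattice, the toy lexicon, and the example sentence are our own illustrative choices. It shows only the lattice-construction step: for each character position, collect all lexicon words ending there; these word spans are the extra inputs that the lattice LSTM cell gates into the character-level state.

    from typing import Dict, List, Tuple

    def build_lattice(chars: List[str],
                      lexicon: Dict[str, int]) -> List[List[Tuple[int, str]]]:
        """For each end position j, collect (start i, word) pairs such that
        chars[i:j+1] is a word in the lexicon."""
        max_len = max((len(w) for w in lexicon), default=0)
        ends: List[List[Tuple[int, str]]] = [[] for _ in chars]
        for j in range(len(chars)):
            # Only multi-character words; single characters are already
            # covered by the character-level LSTM inputs.
            for i in range(max(0, j - max_len + 1), j):
                word = "".join(chars[i:j + 1])
                if word in lexicon:
                    ends[j].append((i, word))
        return ends

    if __name__ == "__main__":
        # Example sentence from the paper: 南京市长江大桥
        # (Nanjing Yangtze River Bridge).
        sentence = list("南京市长江大桥")
        toy_lexicon = {w: idx for idx, w in enumerate(
            ["南京", "南京市", "市长", "长江", "长江大桥", "大桥"])}
        for j, matches in enumerate(build_lattice(sentence, toy_lexicon)):
            for i, word in matches:
                print(f"word '{word}' spans chars {i}..{j}")

In the full model, each matched word span feeds a word-level cell whose content is merged into the character cell at the word's end position through additional gates, letting the network weight character and word evidence dynamically rather than committing to a single segmentation.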
Anthology ID:
P18-1144
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1554–1564
URL:
https://aclanthology.org/P18-1144
DOI:
10.18653/v1/P18-1144
Cite (ACL):
Yue Zhang and Jie Yang. 2018. Chinese NER Using Lattice LSTM. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1554–1564, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Chinese NER Using Lattice LSTM (Zhang & Yang, ACL 2018)
PDF:
https://aclanthology.org/P18-1144.pdf
Poster:
P18-1144.Poster.pdf
Code
jiesutd/LatticeLSTM (+ additional community code)
Data
Resume NER, MSRA CN NER, OntoNotes 4.0, Weibo NER