Boosting Information Extraction Systems with Character-level Neural Networks and Free Noisy Supervision

Philipp Meerkamp, Zhengyi Zhou


Abstract
We present an architecture to boost the precision of existing information extraction systems. This is achieved by augmenting the existing parser, which may be constraint-based or hybrid statistical, with a character-level neural network. Our architecture combines the ability of constraint-based or hybrid extraction systems to easily incorporate domain knowledge with the ability of deep neural networks to leverage large amounts of data to learn complex features. The network is trained using a measure of consistency between extracted data and existing databases as a form of cheap, noisy supervision. Our architecture requires neither large-scale manual annotation nor a system rewrite. It has led to large precision improvements over an existing, highly tuned production information extraction system used at Bloomberg LP for financial language text.
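The noisy supervision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record schema (`Extraction`), the helper names (`noisy_label`, `build_training_set`), the dictionary-based database, and the 1% relative tolerance are all assumptions made for illustration. The idea shown is only that an extraction which agrees with an existing database entry gets a positive label, one that disagrees gets a negative label, and the resulting (text, label) pairs could then train a character-level classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """A candidate record from the existing parser (hypothetical schema)."""
    text: str     # raw input text the parser saw
    key: str      # e.g. an instrument identifier (assumed field)
    value: float  # e.g. an extracted price (assumed field)


def noisy_label(ex, database, rel_tol=0.01):
    """Return 1 if the extraction is consistent with the database,
    0 if it is inconsistent, and None if there is nothing to compare to.

    The relative tolerance is an assumed consistency measure.
    """
    known = database.get(ex.key)
    if known is None:
        return None
    return 1 if abs(ex.value - known) <= rel_tol * abs(known) else 0


def build_training_set(extractions, database):
    """Pair each raw text with its noisy label, skipping uncheckable records."""
    pairs = []
    for ex in extractions:
        label = noisy_label(ex, database)
        if label is not None:
            pairs.append((ex.text, label))
    return pairs


if __name__ == "__main__":
    db = {"ABC": 100.0}
    candidates = [
        Extraction("ABC 100.5", "ABC", 100.5),  # close to the database value
        Extraction("ABC 10.05", "ABC", 10.05),  # far from the database value
        Extraction("XYZ 5", "XYZ", 5.0),        # no database entry
    ]
    print(build_training_set(candidates, db))
```

Labels obtained this way are noisy by construction: the database may be stale or incomplete, so a correct extraction can still be marked inconsistent, which is why the abstract calls the supervision cheap and noisy.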
Anthology ID:
W17-4307
Volume:
Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Kai-Wei Chang, Ming-Wei Chang, Vivek Srikumar, Alexander M. Rush
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
44–51
URL:
https://aclanthology.org/W17-4307
DOI:
10.18653/v1/W17-4307
Cite (ACL):
Philipp Meerkamp and Zhengyi Zhou. 2017. Boosting Information Extraction Systems with Character-level Neural Networks and Free Noisy Supervision. In Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing, pages 44–51, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Boosting Information Extraction Systems with Character-level Neural Networks and Free Noisy Supervision (Meerkamp & Zhou, 2017)
PDF:
https://aclanthology.org/W17-4307.pdf