Victor Bellon


2017

One model per entity: using hundreds of machine learning models to recognize and normalize biomedical names in text
Victor Bellon | Raul Rodriguez-Esteban
Proceedings of the Biomedical NLP Workshop associated with RANLP 2017

We explored a new approach to named entity recognition based on hundreds of machine learning models, each trained to distinguish a single entity, and showed its application to gene name identification (GNI). The rationale for our approach, which we named “one model per entity” (OMPE), was that increasing the number of models would make the learning task easier for each individual model. Our training strategy leveraged freely available database annotations instead of manually annotated corpora. While its performance in our proof-of-concept was disappointing, we believe that there is enough room for improvement that such approaches could reach competitive performance while eliminating the expense of creating manually annotated training corpora.
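
The "one model per entity" idea can be illustrated with a minimal sketch. This is not the authors' implementation: the perceptron classifier, the bag-of-words features, the identifiers "GENE:1" and "GENE:2", and the toy sentences are all assumptions made for illustration. Each entity identifier gets its own binary model, and the labels are assumed to come from database annotations that link a text to an identifier. Because a model is tied to one identifier, a positive prediction both recognizes the mention and normalizes it.

```python
class BinaryPerceptron:
    """Tiny bag-of-words perceptron answering one question:
    does this text mention the single entity this model is tied to?"""

    def __init__(self):
        self.weights = {}
        self.bias = 0.0

    def score(self, tokens):
        return self.bias + sum(self.weights.get(t, 0.0) for t in set(tokens))

    def predict(self, tokens):
        return self.score(tokens) > 0

    def fit(self, examples, epochs=10):
        # Standard perceptron rule: update only on misclassified examples.
        for _ in range(epochs):
            for tokens, label in examples:
                if self.predict(tokens) != label:
                    step = 1.0 if label else -1.0
                    self.bias += step
                    for t in set(tokens):
                        self.weights[t] = self.weights.get(t, 0.0) + step
        return self


def train_ompe(examples_by_entity):
    """Train one independent binary model per entity identifier."""
    return {
        entity_id: BinaryPerceptron().fit(examples)
        for entity_id, examples in examples_by_entity.items()
    }


def recognize(models, tokens):
    """Return the identifiers of every entity whose model fires on the text."""
    return sorted(eid for eid, model in models.items() if model.predict(tokens))


# Toy data (hypothetical): labels stand in for database annotations
# linking each text to an entity identifier.
foo_text = "the foo1 protein binds dna".split()
bar_text = "the bar2 kinase is active".split()
other_text = "cells were cultured overnight".split()

training = {
    "GENE:1": [(foo_text, True), (bar_text, False), (other_text, False)],
    "GENE:2": [(bar_text, True), (foo_text, False), (other_text, False)],
}

models = train_ompe(training)
print(recognize(models, "foo1 binds dna".split()))
print(recognize(models, "bar2 kinase".split()))
```

In this sketch every model sees the same texts but a different labeling, so each one solves a narrow yes/no task. A realistic system would need far richer features and some way to resolve texts on which several models fire.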