Transparent Machine Learning for Information Extraction: State-of-the-Art and the Future

Laura Chiticariu, Yunyao Li, Frederick Reiss


Abstract
The rise of Big Data analytics over unstructured text has led to renewed interest in information extraction (IE). These applications need effective IE as a first step towards solving end-to-end real-world problems in domains such as biology, medicine, finance, and media and entertainment. Much recent NLP research has focused on addressing specific IE problems using a pipeline of multiple machine learning (ML) techniques. This approach requires an analyst with the expertise to answer questions such as: “What ML techniques should I combine to solve this problem?”; “What features will be useful for the composite pipeline?”; and “Why is my model giving the wrong answer on this document?”. The need for this expertise creates problems in real-world applications: in practice, it is very difficult to find an analyst who both understands the real-world problem and has deep knowledge of applied machine learning. As a result, the real impact of current IE research does not match the abundant opportunities available.

In this tutorial, we introduce the concept of transparent machine learning. A transparent ML technique is one that:
- produces models that a typical real-world user can read and understand;
- uses algorithms that a typical real-world user can understand; and
- allows a real-world user to adapt models to new domains.

The tutorial is aimed at IE researchers in both the academic and industry communities who are interested in developing and applying transparent ML.
Anthology ID:
D15-2003
Volume:
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts
Month:
September
Year:
2015
Address:
Lisbon, Portugal
Editors:
Wenjie Li, Khalil Sima'an
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/D15-2003
Cite (ACL):
Laura Chiticariu, Yunyao Li, and Frederick Reiss. 2015. Transparent Machine Learning for Information Extraction: State-of-the-Art and the Future. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts, Lisbon, Portugal. Association for Computational Linguistics.
Cite (Informal):
Transparent Machine Learning for Information Extraction: State-of-the-Art and the Future (Chiticariu et al., EMNLP 2015)