Stefan Geißler


2020

The Kairntech Sherpa – An ML Platform and API for the Enrichment of (not only) Scientific Content
Stefan Geißler
Proceedings of the 1st International Workshop on Language Technology Platforms

We present a software platform and API that combines various ML and NLP approaches for the analysis and enrichment of textual content. The platform’s design and implementation are guided by the goal of allowing non-technical users to conduct their own experiments and training runs on their respective data, so that they can test, tune and deploy analysis models for production. Dedicated packages for subtasks such as document structure processing, document categorization, annotation with existing thesauri, disambiguation and linking, annotation with newly created entity recognizers and summarization – available as open source components in isolation – are combined into an end-user-facing, collaborative, scalable platform to support large-scale industrial document analysis. We see the Sherpa’s setup as an answer to the observation that ML has reached a level of maturity that allows useful results to be attained in many analysis scenarios today, but that in-depth technical competencies in the required fields of NLP and AI are often scarce; a setup that focusses on non-technical domain-expert end-users can help to bring required analysis functionalities closer to the day-to-day reality in business contexts.