Kathaa : NLP Systems as Edge-Labeled Directed Acyclic MultiGraphs

Sharada Mohanty, Nehal J Wani, Manish Srivastava, Dipti Sharma


Abstract
We present Kathaa, an Open Source web-based Visual Programming Framework for Natural Language Processing (NLP) Systems. Kathaa supports the design, execution and analysis of complex NLP systems by visually connecting NLP components from an easily extensible Module Library. It models NLP systems as an edge-labeled Directed Acyclic MultiGraph, and lets users apply publicly co-created modules in their own NLP applications irrespective of their technical proficiency in Natural Language Processing. Kathaa exposes an intuitive web-based interface for users to interact with and modify complex NLP systems, and a precise Module definition API to allow easy integration of new state-of-the-art NLP components. Kathaa enables researchers to publish their services in a standardized format so that anyone can use them out of the box. The vision of this work is to pave the way for a system like Kathaa to be the Lego blocks of NLP Research and Applications. As a practical use case, we use Kathaa to visually implement the Sampark Hindi-Panjabi Machine Translation Pipeline and the Sampark Hindi-Urdu Machine Translation Pipeline, demonstrating that Kathaa can handle complex NLP systems while remaining intuitive for the end user.
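To make the graph model in the abstract concrete, the following is a minimal sketch of an edge-labeled directed acyclic multigraph used as a processing pipeline. It is not Kathaa's Module definition API. The `Pipeline` class, its method names, and the convention that each edge label is an (output port, input port) pair are all assumptions made for illustration; the abstract does not specify what the edge labels carry.

```python
from collections import defaultdict, deque


class Pipeline:
    """Hypothetical edge-labeled directed acyclic multigraph of modules."""

    def __init__(self):
        self.modules = {}  # module name -> function(dict of inputs) -> dict of outputs
        self.edges = []    # (src, out_port, dst, in_port); parallel edges are allowed

    def add_module(self, name, fn):
        self.modules[name] = fn

    def connect(self, src, out_port, dst, in_port):
        # The edge label is assumed to be the (out_port, in_port) pair.
        self.edges.append((src, out_port, dst, in_port))

    def topological_order(self):
        # Kahn's algorithm; every parallel edge counts toward the in-degree.
        indegree = {name: 0 for name in self.modules}
        successors = defaultdict(list)
        for src, _, dst, _ in self.edges:
            indegree[dst] += 1
            successors[src].append(dst)
        ready = deque(name for name, deg in indegree.items() if deg == 0)
        order = []
        while ready:
            node = ready.popleft()
            order.append(node)
            for nxt in successors[node]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
        if len(order) != len(self.modules):
            raise ValueError("graph contains a cycle")
        return order

    def run(self, initial_inputs):
        # Execute modules in dependency order, routing values along labeled edges.
        inputs = defaultdict(dict)
        for name, values in initial_inputs.items():
            inputs[name].update(values)
        outputs = {}
        for name in self.topological_order():
            outputs[name] = self.modules[name](inputs[name])
            for src, out_port, dst, in_port in self.edges:
                if src == name:
                    inputs[dst][in_port] = outputs[name][out_port]
        return outputs


# Two toy modules joined by two parallel edges, which is what makes this a multigraph.
pipeline = Pipeline()
pipeline.add_module(
    "tokenizer",
    lambda inp: {"tokens": inp["text"].split(), "count": len(inp["text"].split())},
)
pipeline.add_module(
    "reporter",
    lambda inp: {"summary": f"{inp['n']} tokens: {' '.join(inp['words'])}"},
)
pipeline.connect("tokenizer", "tokens", "reporter", "words")
pipeline.connect("tokenizer", "count", "reporter", "n")
result = pipeline.run({"tokenizer": {"text": "kathaa models pipelines"}})
```

The acyclicity requirement is what allows a single pass in topological order: each module runs only after every module feeding it has produced its outputs.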
Anthology ID:
W16-5201
Volume:
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yohei Murakami, Donghui Lin, Nancy Ide, James Pustejovsky
Venue:
OIAF4HLT
Publisher:
The COLING 2016 Organizing Committee
Pages:
1–10
URL:
https://aclanthology.org/W16-5201
Cite (ACL):
Sharada Mohanty, Nehal J Wani, Manish Srivastava, and Dipti Sharma. 2016. Kathaa : NLP Systems as Edge-Labeled Directed Acyclic MultiGraphs. In Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016), pages 1–10, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Kathaa : NLP Systems as Edge-Labeled Directed Acyclic MultiGraphs (Mohanty et al., OIAF4HLT 2016)
PDF:
https://aclanthology.org/W16-5201.pdf