Embedding Oriented Adaptable Semantic Annotation Framework for Amharic Web Documents

Kidane Woldemariyam, Dr. Fekade Getahun


Abstract
The Web has become a major source of information that is provided by humans for humans, and its growth has increased the need for solutions that intelligently extract valuable knowledge from existing and newly added web documents with no or minimal supervision. However, because existing data on the Web is unstructured, both human beings and software agents are limited in how effectively they can extract this knowledge. This research therefore designs a generic, embedding-oriented framework that automatically and semantically annotates Amharic web documents using an ontology. Because the framework can be adapted with minimal modification, it significantly reduces the manual annotation and learning cost of semantic annotation for Amharic web documents. The results also suggest that neural network techniques are promising for semantic annotation, especially for less-resourced languages like Amharic, compared with language-dependent techniques, which incur a speed cost and are difficult to adapt to new domains and languages. We evaluate the feasibility of the proposed approach using Amharic news collected from the WALTA news agency and Amharic Wikipedia. Our results show that the proposed solution achieves 70.68% precision, 66.89% recall, and a 68.53% F-measure in semantic annotation for Amharic, a morphologically complex language, with a limited-size dataset.
Anthology ID:
2020.winlp-1.3
Volume:
Proceedings of the Fourth Widening Natural Language Processing Workshop
Month:
July
Year:
2020
Address:
Seattle, USA
Editors:
Rossana Cunha, Samira Shaikh, Erika Varis, Ryan Georgi, Alicia Tsai, Antonios Anastasopoulos, Khyathi Raghavi Chandu
Venue:
WiNLP
Publisher:
Association for Computational Linguistics
Pages:
7
URL:
https://aclanthology.org/2020.winlp-1.3
DOI:
10.18653/v1/2020.winlp-1.3
Cite (ACL):
Kidane Woldemariyam and Dr. Fekade Getahun. 2020. Embedding Oriented Adaptable Semantic Annotation Framework for Amharic Web Documents. In Proceedings of the Fourth Widening Natural Language Processing Workshop, page 7, Seattle, USA. Association for Computational Linguistics.
Cite (Informal):
Embedding Oriented Adaptable Semantic Annotation Framework for Amharic Web Documents (Woldemariyam & Getahun, WiNLP 2020)
Video:
http://slideslive.com/38929539