Question Answering Classification for Amharic Social Media Community Based Questions

Tadesse Destaw, Seid Muhie Yimam, Abinew Ayele, Chris Biemann


Abstract
In this work, we build a Question Answering (QA) classification dataset from a social media platform, namely the public Telegram channel @AskAnythingEthiopia. The channel has more than 78k subscribers and has existed since May 31, 2019. It allows users to ask questions from various domains, such as politics, economics, health, and education. Questions are posted in Amharic, in English, or in Amharic written in Latin script. Because the questions are code-mixed in this way, we apply different strategies to pre-process the dataset. As part of the pre-processing, we build a Latin-to-Ethiopic script transliteration tool. We collect 8k Amharic and 24k transliterated questions and develop deep learning-based question answering classifiers that attain an F-score of up to 57.29 across 20 question classes. The datasets and pre-processing scripts are open-sourced to facilitate further research on Amharic community-based question answering.
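To illustrate the kind of Latin-to-Ethiopic transliteration the abstract mentions, here is a minimal sketch. It is not the authors' tool. The romanization scheme, the tiny consonant table, and the names `CONSONANT_BASE`, `VOWEL_OFFSET`, and `transliterate` are all assumptions made for illustration. The only fact it relies on is the Unicode layout of the Ethiopic block, where each consonant's vowel orders occupy consecutive code points.

```python
# Hypothetical, heavily simplified romanization table (NOT the authors' tool).
# Each consonant maps to the code point of its first-order Ethiopic syllable.
CONSONANT_BASE = {
    "l": 0x1208,  # ለ
    "m": 0x1218,  # መ
    "r": 0x1228,  # ረ
    "s": 0x1230,  # ሰ
}

# Assumed Latin vowel spellings and their offsets within a consonant's row.
# The fifth order is left out to keep the sketch small.
VOWEL_OFFSET = {"e": 0, "u": 1, "i": 2, "a": 3, "o": 6}

# Sixth order: used when a consonant is not followed by a listed vowel.
BARE_CONSONANT_OFFSET = 5


def transliterate(text: str) -> str:
    """Convert Latin-script text to Ethiopic script, one syllable at a time."""
    out = []
    i = 0
    while i < len(text):
        base = CONSONANT_BASE.get(text[i].lower())
        if base is None:
            # Characters outside the table are copied through unchanged.
            out.append(text[i])
            i += 1
            continue
        nxt = text[i + 1].lower() if i + 1 < len(text) else ""
        if nxt in VOWEL_OFFSET:
            # Consonant + vowel becomes a single Ethiopic syllable.
            out.append(chr(base + VOWEL_OFFSET[nxt]))
            i += 2
        else:
            out.append(chr(base + BARE_CONSONANT_OFFSET))
            i += 1
    return "".join(out)


print(transliterate("selam"))
```

A real tool would need a full consonant inventory, digraphs, and handling of ambiguous spellings, none of which this sketch attempts.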
Anthology ID:
2022.sigul-1.18
Volume:
Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
SIGUL
SIG:
SIGUL
Publisher:
European Language Resources Association
Pages:
137–145
URL:
https://aclanthology.org/2022.sigul-1.18
Cite (ACL):
Tadesse Destaw, Seid Muhie Yimam, Abinew Ayele, and Chris Biemann. 2022. Question Answering Classification for Amharic Social Media Community Based Questions. In Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages, pages 137–145, Marseille, France. European Language Resources Association.
Cite (Informal):
Question Answering Classification for Amharic Social Media Community Based Questions (Destaw et al., SIGUL 2022)
PDF:
https://aclanthology.org/2022.sigul-1.18.pdf
Code:
uhh-lt/amharicmodels