BERT-Based Sequence Labelling Approach for Dependency Parsing in Tamil

C S Ayush Kumar, Advaith Maharana, Srinath Murali, Premjith B, Soman Kp


Abstract
Dependency parsing is a method for surface-level syntactic analysis of natural language text. The scarcity of viable tools for this task in Dravidian languages has opened a new line of research. This paper focuses on a novel approach that performs word-to-word dependency tagging using BERT models to improve MaltParser performance. We work with Tamil, a morphologically rich, free-word-order language. The individual words are tokenized using BERT models, and the dependency relations are recognized using machine learning algorithms. Oversampling algorithms such as SMOTE (Chawla et al., 2002) and ADASYN (He et al., 2008) are used to tackle data imbalance and consequently improve parsing results. The resulting labels are fed to MaltParser, which further highlights that feature-based approaches can be used for such tasks.
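A minimal sketch of the pipeline described in the abstract, not the authors' released code: the model name, classifier choice, and feature construction are assumptions, since the abstract only states that BERT-derived word representations feed a machine learning classifier for dependency relations and that SMOTE/ADASYN handle class imbalance.

```python
# Sketch (assumptions: multilingual BERT, RandomForest, head+dependent concatenation).
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from imblearn.over_sampling import SMOTE  # ADASYN is used analogously
from sklearn.ensemble import RandomForestClassifier

# Assumption: multilingual BERT as a stand-in for the Tamil-capable model used.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased")
bert.eval()

def word_embeddings(words):
    """Mean-pool subword vectors so each word gets one fixed-size embedding."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]  # (num_subwords, 768)
    vecs = []
    for i in range(len(words)):
        idx = [j for j, w in enumerate(enc.word_ids()) if w == i]
        vecs.append(hidden[idx].mean(dim=0).numpy())
    return np.stack(vecs)

# Toy example: (head, dependent) pairs labelled with UD relations.
# Real training data would come from the Tamil Universal Dependencies treebank.
sentence = ["அவன்", "பள்ளிக்கு", "சென்றான்"]  # "He went to school"
emb = word_embeddings(sentence)
X = np.array([np.concatenate([emb[2], emb[0]]),   # verb <- subject
              np.concatenate([emb[2], emb[1]])])  # verb <- oblique
y = np.array(["nsubj", "obl"])

# Oversample rare relation labels before training; with real data,
# k_neighbors must stay below the size of the rarest class.
X_res, y_res = SMOTE(k_neighbors=1).fit_resample(np.vstack([X] * 3), np.tile(y, 3))

clf = RandomForestClassifier(n_estimators=100).fit(X_res, y_res)
print(clf.predict(X[:1]))
```

The predicted relation labels would then be combined with the treebank's head indices in CoNLL format for use with MaltParser.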
Anthology ID:
2022.dravidianlangtech-1.1
Volume:
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Parameswari Krishnamurthy, Elizabeth Sherly, Sinnathamby Mahesan
Venue:
DravidianLangTech
Publisher:
Association for Computational Linguistics
Pages:
1–8
URL:
https://aclanthology.org/2022.dravidianlangtech-1.1
DOI:
10.18653/v1/2022.dravidianlangtech-1.1
Cite (ACL):
C S Ayush Kumar, Advaith Maharana, Srinath Murali, Premjith B, and Soman Kp. 2022. BERT-Based Sequence Labelling Approach for Dependency Parsing in Tamil. In Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages, pages 1–8, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
BERT-Based Sequence Labelling Approach for Dependency Parsing in Tamil (Kumar et al., DravidianLangTech 2022)
PDF:
https://aclanthology.org/2022.dravidianlangtech-1.1.pdf
Video:
https://aclanthology.org/2022.dravidianlangtech-1.1.mp4
Data
Universal Dependencies