A Simple and Effective Dependency Parser for Telugu

Sneha Nallani, Manish Shrivastava, Dipti Sharma


Abstract
We present a simple and effective dependency parser for Telugu, a morphologically rich, free-word-order language. We propose replacing the rich linguistic feature templates used in past approaches with a minimal feature function based on contextual vector representations. We train a BERT model on Telugu Wikipedia data and use vector representations from this model to train the parser. Each sentence token is associated with a vector representing the token in the context of that sentence, and the feature vectors are constructed by concatenating two token representations from the stack and one from the buffer. We pass the feature representations through a feedforward network and train with a greedy transition-based approach. The resulting parser has a very simple architecture with minimal feature engineering and achieves state-of-the-art results for Telugu.
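
The feature function and greedy decoding loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes an unlabeled arc-standard transition system, zero-vector padding for missing stack or buffer positions, and a randomly initialised one-hidden-layer network standing in for the trained classifier. The names `extract_features`, `make_feedforward`, `legal_actions`, and `greedy_parse` are hypothetical, and random vectors stand in for the contextual BERT representations.

```python
import random

SHIFT, LEFT_ARC, RIGHT_ARC = 0, 1, 2


def extract_features(stack, buffer, vectors):
    """Concatenate the vectors of the top two stack items and the first buffer item."""
    dim = len(vectors[0])
    pad = [0.0] * dim  # assumption: zero vector when a position is empty
    s1 = vectors[stack[-1]] if len(stack) >= 1 else pad
    s2 = vectors[stack[-2]] if len(stack) >= 2 else pad
    b1 = vectors[buffer[0]] if buffer else pad
    return s2 + s1 + b1


def make_feedforward(in_dim, hidden_dim, n_actions=3, seed=0):
    """Return a scoring function: one ReLU hidden layer, untrained random weights."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)] for _ in range(n_actions)]

    def score(x):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return score


def legal_actions(stack, buffer):
    """Transitions allowed in the current configuration (arc-standard, assumed)."""
    actions = []
    if buffer:
        actions.append(SHIFT)
    if len(stack) >= 2:
        actions.append(RIGHT_ARC)
        if stack[-2] != 0:  # the artificial ROOT (index 0) never gets a head
            actions.append(LEFT_ARC)
    return actions


def greedy_parse(vectors, score):
    """Greedy decoding. vectors[0] represents ROOT; returns heads with heads[0] == -1."""
    n = len(vectors)
    stack, buffer = [0], list(range(1, n))
    heads = [-1] * n
    while buffer or len(stack) > 1:
        scores = score(extract_features(stack, buffer, vectors))
        # Take the highest-scoring legal transition, with no search or backtracking.
        action = max(legal_actions(stack, buffer), key=lambda a: scores[a])
        if action == SHIFT:
            stack.append(buffer.pop(0))
        elif action == LEFT_ARC:
            dep = stack.pop(-2)      # second item becomes a dependent of the top
            heads[dep] = stack[-1]
        else:  # RIGHT_ARC
            dep = stack.pop()        # top becomes a dependent of the second item
            heads[dep] = stack[-1]
    return heads


# Usage: random 4-dimensional vectors for ROOT plus five tokens.
rng = random.Random(1)
vectors = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
score = make_feedforward(in_dim=3 * 4, hidden_dim=8)
heads = greedy_parse(vectors, score)
```

Because the weights here are untrained, the resulting tree is arbitrary; the point is only that each decision depends on three concatenated token vectors rather than on hand-written feature templates.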
Anthology ID:
2020.acl-srw.19
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Month:
July
Year:
2020
Address:
Online
Editors:
Shruti Rijhwani, Jiangming Liu, Yizhong Wang, Rotem Dror
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
143–149
URL:
https://aclanthology.org/2020.acl-srw.19
DOI:
10.18653/v1/2020.acl-srw.19
Cite (ACL):
Sneha Nallani, Manish Shrivastava, and Dipti Sharma. 2020. A Simple and Effective Dependency Parser for Telugu. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 143–149, Online. Association for Computational Linguistics.
Cite (Informal):
A Simple and Effective Dependency Parser for Telugu (Nallani et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-srw.19.pdf
Dataset:
 2020.acl-srw.19.Dataset.zip
Video:
 http://slideslive.com/38928657