An Open Source Persian Computational Grammar

Shafqat Mumtaz Virk, Elnaz Abolahrar


Abstract
In this paper, we describe a multilingual open-source computational grammar of Persian, developed in Grammatical Framework (GF), a type-theoretical grammar formalism. We discuss in detail the structure of the different syntactic categories of Persian (noun phrases, verb phrases, adjectival phrases, etc.). First, we show how to structure and construct these categories individually. Then we describe how they are combined to form well-formed Persian sentences while maintaining grammatical features such as agreement and word order. We also show how some of the distinctive features of Persian, such as the ezafe construction, are implemented in GF. To evaluate the grammar's correctness and to demonstrate its usefulness, we have used the reported resource grammar to add Persian support to a multilingual application grammar (the Tourist Phrasebook).
Anthology ID:
L12-1614
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1686–1693
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/1028_Paper.pdf
Cite (ACL):
Shafqat Mumtaz Virk and Elnaz Abolahrar. 2012. An Open Source Persian Computational Grammar. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1686–1693, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
An Open Source Persian Computational Grammar (Virk & Abolahrar, LREC 2012)