Software Tools for Morphological Tagging of Zulu Corpora and Lexicon Development

Sonja E. Bosch, Laurette Pretorius


Abstract
The aim of this paper is to discuss aspects of an on-going project on the development of grammatical and lexical resources for Zulu with sufficient coverage for unrestricted text. We explain how the basic software tools of computational morphology are used in linguistic processing, more specifically for automatic word form recognition and morphological tagging of the growing stock of electronic text corpora of a Bantu language such as Zulu. It is also shown how a machine-readable lexicon is in turn enhanced with the information acquired and extracted by means of such corpus analysis.
Anthology ID:
L04-1251
Volume:
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Month:
May
Year:
2004
Address:
Lisbon, Portugal
Editors:
Maria Teresa Lino, Maria Francisca Xavier, Fátima Ferreira, Rute Costa, Raquel Silva
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2004/pdf/427.pdf
DOI:
Bibkey:
Cite (ACL):
Sonja E. Bosch and Laurette Pretorius. 2004. Software Tools for Morphological Tagging of Zulu Corpora and Lexicon Development. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04), Lisbon, Portugal. European Language Resources Association (ELRA).
Cite (Informal):
Software Tools for Morphological Tagging of Zulu Corpora and Lexicon Development (Bosch & Pretorius, LREC 2004)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2004/pdf/427.pdf