A Conditional Random Field Framework for Thai Morphological Analysis

Canasai Kruengkrai, Virach Sornlertlamvanich, Hitoshi Isahara


Abstract
This paper presents a framework for Thai morphological analysis grounded in the theory of conditional random fields. We formulate morphological analysis of an unsegmented language as a sequential supervised learning problem. Given a sequence of characters, all possible word/tag segmentations are generated, and the optimal path is then selected according to a scoring criterion. We examine two selection techniques: the Viterbi score and confidence estimation. Preliminary results demonstrate the feasibility of the proposed framework.
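The generate-then-select idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon, the tag set, the numeric scores, and the function names `build_lattice` and `viterbi` are all hypothetical, and the hand-set scores stand in for the weighted feature sums that a trained conditional random field would provide. Only the Viterbi-score selection is shown; confidence estimation is not covered here.

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> list of (tag, emission score).
LEXICON = {
    "ตา": [("NOUN", 1.0)],
    "ตาก": [("VERB", 0.8)],
    "กลม": [("ADJ", 0.9)],
    "ลม": [("NOUN", 1.0)],
}

# Hypothetical tag-transition scores; unlisted tag pairs score 0.0.
TRANSITION = {
    ("<s>", "NOUN"): 0.5,
    ("<s>", "VERB"): 0.2,
    ("NOUN", "ADJ"): 0.7,
    ("VERB", "NOUN"): 0.6,
}


def build_lattice(text, lexicon, max_len=4):
    """Map each start offset to the (end, word, tag, score) edges leaving it."""
    lattice = defaultdict(list)
    for start in range(len(text)):
        for end in range(start + 1, min(len(text), start + max_len) + 1):
            word = text[start:end]
            for tag, score in lexicon.get(word, []):
                lattice[start].append((end, word, tag, score))
    return lattice


def viterbi(text, lexicon, transition):
    """Return (best word/tag path, score), or None if no path covers the text."""
    lattice = build_lattice(text, lexicon)
    # best[(offset, tag)] = (score, previous state, word ending at offset)
    best = {(0, "<s>"): (0.0, None, None)}
    for pos in range(len(text)):
        # Snapshot the states ending at this offset before extending them.
        states = [state for state in best if state[0] == pos]
        for state in states:
            prev_score = best[state][0]
            for end, word, tag, emit in lattice[pos]:
                score = prev_score + emit + transition.get((state[1], tag), 0.0)
                key = (end, tag)
                if key not in best or score > best[key][0]:
                    best[key] = (score, state, word)
    finals = [(best[key][0], key) for key in best if key[0] == len(text)]
    if not finals:
        return None
    score, state = max(finals)
    # Follow back-pointers from the final state to the start of the text.
    path = []
    while best[state][1] is not None:
        _, prev, word = best[state]
        path.append((word, state[1]))
        state = prev
    path.reverse()
    return path, score


result = viterbi("ตากลม", LEXICON, TRANSITION)
```

With these made-up scores, the ambiguous string "ตากลม" has two candidate paths in the lattice, ตา/NOUN + กลม/ADJ and ตาก/VERB + ลม/NOUN, and the dynamic program keeps the higher-scoring one. Changing the scores changes which segmentation wins, which is the role the learned model parameters would play in the actual framework.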
Anthology ID:
L06-1069
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/137_pdf.pdf
Cite (ACL):
Canasai Kruengkrai, Virach Sornlertlamvanich, and Hitoshi Isahara. 2006. A Conditional Random Field Framework for Thai Morphological Analysis. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
A Conditional Random Field Framework for Thai Morphological Analysis (Kruengkrai et al., LREC 2006)