Thatsanee Charoenporn

This paper presents a language resource management system for the development and dissemination of the Asian WordNet (AWN) and its web service application. We develop the platform to establish a network for cross-language WordNet development. Each node of the network maintains the WordNet for one language. The Asian WordNet is realized through tables that map each language's WordNet to the Princeton WordNet (PWN), which makes the cross-language links between the Asian languages visible. We propose a language resource management system, called the WordNet Management System (WNMS), as a distributed management system that allows the server to perform cross-language WordNet retrieval and that includes the fundamental web service applications for editing, visualizing and language processing. The WNMS is implemented on a web service protocol, so each node can be maintained independently, and the service of each language WordNet can be called directly through the web service API. For cross-language implementation, the synset ID (or synset offset) defined by PWN is used to determine the linkage between the languages.
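The linkage described in this abstract, where each language's WordNet is mapped to a PWN synset ID and that ID acts as the pivot between languages, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `TABLES`, `build_index` and `translate`, the language codes, the synset IDs and the word entries are placeholders, not the WNMS API or real AWN data.

```python
# Hypothetical per-language mapping tables: PWN synset ID -> words in that language.
# IDs and words are placeholders, not real PWN offsets or AWN entries.
TABLES = {
    "th": {"synset-0001": ["th_word_1"], "synset-0002": ["th_word_2"]},
    "id": {"synset-0001": ["id_word_1"]},
    "mn": {"synset-0001": ["mn_word_1"], "synset-0002": ["mn_word_2"]},
}


def build_index(tables):
    """Merge the per-language tables into one index keyed by PWN synset ID."""
    index = {}
    for lang, table in tables.items():
        for synset_id, words in table.items():
            index.setdefault(synset_id, {})[lang] = list(words)
    return index


def translate(tables, index, word, source, target):
    """Find target-language words that share a PWN synset ID with `word`."""
    matches = []
    for synset_id, words in tables.get(source, {}).items():
        if word in words:
            # The shared synset ID is the only link between the two languages.
            matches.extend(index.get(synset_id, {}).get(target, []))
    return matches


index = build_index(TABLES)
print(translate(TABLES, index, "th_word_1", "th", "id"))  # ['id_word_1']
```

In this sketch a language that has no entry for a synset simply returns an empty list, which matches the idea that each node can be maintained independently and at its own pace.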

2009

2008

Corpus-based and statistical approaches have been the mainstream of natural language processing research for the past two decades. Language resources play a key role in such approaches, but the language resources available for many Asian languages are insufficient. In this situation, standardisation of language resources would be of great help in developing resources for new languages. This paper presents the latest development efforts of our project, which aims to create a common standard for Asian language resources that is compatible with an international standard. In particular, the paper focuses on i) the lexical specification and data categories relevant for building multilingual lexical resources for Asian languages; ii) a core upper-layer ontology needed to ensure multilingual interoperability; and iii) the evaluation platform used to test the entire architectural framework.

Enhanced Tools for Online Collaborative Language Resource Development
Virach Sornlertlamvanich | Thatsanee Charoenporn | Suphanut Thayaboon | Chumpol Mokarat | Hitoshi Isahara
Proceedings of the 6th Workshop on Asian Language Resources

KUI: an ubiquitous tool for collective intelligence development
Thatsanee Charoenporn | Virach Sornlertlamvanich | Hitoshi Isahara | Kergrit Robkop
Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages

Synset Assignment for Bi-lingual Dictionary with Limited Resource
Virach Sornlertlamvanich | Thatsanee Charoenporn | Chumpol Mokarat | Hitoshi Isahara | Hammam Riza | Purev Jaimai
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

2006

Word Knowledge Acquisition for Computational Lexicon Construction
Thatsanee Charoenporn | Canasai Kruengkrai | Thanaruk Theeramunkong | Virach Sornlertlamvanich | Hitoshi Isahara
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The growth of multilingual information processing technology has created a need for linguistic resources, especially lexical databases. Many attempts have been made to convert traditional dictionaries into computational dictionaries, widely known as computational lexicons. TCL’s Computational Lexicon (TCLLEX) is a recently developed large-scale Thai lexicon, which aims to serve as a fundamental linguistic resource for natural language processing research. We design both the terminology and the ontology for structuring the lexicon based on the ideas of computability and reusability.