Retrieving Terminological Data from the TxtCeram Tagged Domain Corpus: A First Step towards a Terminological Ontology

Anna Estellés, Amparo Alcina, Victoria Soler


Abstract
This paper focuses on corpora as a resource for research on language processing for terminological purposes. Following the TEI Guidelines, we present the templates used to tag our TxtCeram corpus and the features of the corpus when working with WordSmith, a text analysis tool. We also present an experiment that studies the frequency of hyperonyms in the introduction sections of texts, while testing WordSmith’s suitability for working with our tagged corpus.
Anthology ID:
L06-1239
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/408_pdf.pdf
Cite (ACL):
Anna Estellés, Amparo Alcina, and Victoria Soler. 2006. Retrieving Terminological Data from the TxtCeram Tagged Domain Corpus: A First Step towards a Terminological Ontology. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Retrieving Terminological Data from the TxtCeram Tagged Domain Corpus: A First Step towards a Terminological Ontology (Estellés et al., LREC 2006)