Ulla Gerdin


2010

A Swedish Scientific Medical Corpus for Terminology Management and Linguistic Exploration
Dimitrios Kokkinakis | Ulla Gerdin
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper describes the development of a new Swedish scientific medical corpus. We provide a detailed description of the characteristics of this new collection as well as results of applying the corpus to term management tasks, including terminology validation and terminology extraction. Although the corpus is representative of the scientific medical domain, it still covers in detail many specialised sub-disciplines, such as diabetes and osteoporosis, which makes it suitable for facilitating the production of smaller but more focused sub-corpora. We address this issue by making explicit some features of the corpus in order to demonstrate its usability, particularly for the quality assessment of subsets of official terminologies such as the Systematized NOmenclature of MEDicine - Clinical Terms (SNOMED CT). Domain-dependent language resources, labelled or not, are a crucial component for progressing R&D in the human language technology field, since such resources are an indispensable, integrated part of terminology management, evaluation, software prototyping and design validation, and a prerequisite for the development and evaluation of a number of sublanguage-dependent applications, including information extraction, text mining and information retrieval.

2009

Issues on Quality Assessment of SNOMED CT® Subsets – Term Validation and Term Extraction
Dimitrios Kokkinakis | Ulla Gerdin
Proceedings of the Workshop on Biomedical Information Extraction