Two Decades of Terminology: European Framework Programmes Titles
Gabriella Pardelli | Sara Goggi | Silvia Giannini | Stefania Biagioni
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This work analyses a corpus made up of the titles of research projects belonging to the last four European Commission Framework Programmes (FP4, FP5, FP6, FP7), covering a time span of nearly two decades (1994-2012). The starting point is the idea of creating a corpus of titles that would constitute a terminological niche, a sort of “cluster map” offering an overall view of the terms used and the links between them. Moreover, a terminological comparison over time makes it possible to trace obsolete words in outdated research areas as well as neologisms in the most recent fields. Within this scenario, the minimal purpose is to build a corpus of titles of European projects belonging to the several Framework Programmes in order to obtain a terminological mapping of relevant words in the various research areas; particularly significant are those terms spread across different domains and those closely tied to a specific domain. A term may be found in many fields, and being able to recognise and retrieve this cross-presence means being able to link those different domains through a process of terminological mapping.
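The cross-presence idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the sample titles, the research-area labels, the stopword list, and the function names are all hypothetical and chosen only to show how terms can be mapped to the domains in which they occur.

```python
import re
from collections import defaultdict

# Hypothetical sample of (research area, project title) pairs, invented for
# illustration; these are not titles from the Framework Programmes corpus.
TITLES = [
    ("health", "Network models for cancer diagnosis"),
    ("ict", "Network security for mobile devices"),
    ("energy", "Photovoltaic cells for rural energy supply"),
    ("ict", "Mobile sensor network protocols"),
]

# Assumed, deliberately tiny stopword list.
STOPWORDS = {"for", "of", "the", "and", "in", "a"}


def tokenize(title):
    """Lowercase a title and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", title.lower()) if t not in STOPWORDS]


def map_terms_to_domains(titles):
    """Return a dict mapping each term to the set of domains it appears in."""
    mapping = defaultdict(set)
    for domain, title in titles:
        for term in tokenize(title):
            mapping[term].add(domain)
    return dict(mapping)


def split_by_spread(mapping):
    """Separate terms found in several domains from terms tied to one domain."""
    cross = {t: d for t, d in mapping.items() if len(d) > 1}
    specific = {t: next(iter(d)) for t, d in mapping.items() if len(d) == 1}
    return cross, specific


term_domains = map_terms_to_domains(TITLES)
cross_domain, domain_specific = split_by_spread(term_domains)
```

In this toy sample, a term occurring in titles from more than one area ends up in `cross_domain` and so acts as a link between those areas, while the remaining terms are recorded in `domain_specific` with their single area.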
From medical language processing to BioNLP domain
Gabriella Pardelli | Manuela Sassi | Sara Goggi | Stefania Biagioni
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This paper presents the results of a terminological work on a reference corpus in the domain of Biomedicine. In particular, the research analyses the use of certain terms in Biomedicine in order to verify how they change over time, with the aim of retrieving from the net the very essence of documentation. The terminological sample contains words used in BioNLP and Biomedicine and identifies which terms are passing from scientific publications to the daily press and which remain confined to scientific production. The final aim of this work is to determine how scientific dissemination to an ever larger part of society enables a public of ordinary citizens to approach communication on biomedical research and development. Its main source is a reference corpus made up of three main repositories from which information related to BioNLP and Biomedicine is extracted. The paper is divided into three sections: 1) an introduction dedicated to data extracted from scientific documentation; 2) a second section devoted to methodology and data description; 3) a third part containing a statistical representation of terms extracted from the archive. Indexes and concordances make it possible to reflect on the use of certain terms in this field and offer possible keys for accessing the extraction of knowledge in the digital era.
A Digital Archive of Research Papers in Computer Science
Manuela Sassi | Gabriella Pardelli | Stefania Biagioni | Carlo Carlesi | Sara Goggi
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper presents the results of a terminological work conducted by the authors on a Digital Archives Net of the Italian National Research Council (CNR) in the field of Computer Science. In particular, the research analyses the use of certain terms in Computer Science in order to verify how they change over time, with the aim of retrieving from the net the very essence of documentation. Its main source is a reference corpus made up of 13,500 documents, which collects the scientific output of CNR. This study is divided into three sections: 1) an introductory section dedicated to the data extracted from the scientific documentation; 2) a second section devoted to the description of the contents managed by the PUMA system; 3) a third part containing a statistical representation of terms extracted from the archive, with comparison tables of the occurrences of the most used terms in the scientific documentation and diagrams showing the percentages of the most frequently used terms. Indexes and concordances make it possible to reflect on the use of certain terms in this field and offer possible keys for accessing the extraction of knowledge.
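The indexes, percentages, and concordances mentioned in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the two sample documents and all function names are hypothetical, and nothing here reflects how the PUMA system or the authors' tools actually work.

```python
import re
from collections import Counter

# Hypothetical mini-corpus, invented for illustration; not taken from the
# CNR archive described in the abstract.
DOCS = [
    "Grid computing for digital libraries",
    "A digital archive of grid services",
]


def tokens(text):
    """Lowercase a text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_index(docs):
    """Count the occurrences of every token across all documents."""
    return Counter(tok for doc in docs for tok in tokens(doc))


def percentage(index, term):
    """Share of a term among all tokens in the index, as a percentage."""
    total = sum(index.values())
    return 100.0 * index[term] / total if total else 0.0


def concordance(docs, keyword, width=2):
    """Keyword-in-context lines: (left context, keyword, right context)."""
    lines = []
    for doc in docs:
        toks = tokens(doc)
        for i, tok in enumerate(toks):
            if tok == keyword:
                left = " ".join(toks[max(0, i - width):i])
                right = " ".join(toks[i + 1:i + 1 + width])
                lines.append((left, tok, right))
    return lines


index = frequency_index(DOCS)
kwic = concordance(DOCS, "digital")
```

Here `index` plays the role of a frequency index, `percentage` gives the kind of figure that could feed a diagram of the most used terms, and `kwic` lists each occurrence of a keyword with a window of `width` tokens on either side.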