2024
The Multi-Range Theory of Translation Quality Measurement: MQM scoring models and Statistical Quality Control
Arle Lommel | Serge Gladkoff | Alan Melby | Sue Ellen Wright | Ingemar Strandvik | Katerina Gasova | Angelika Vaasa | Andy Benzo | Romina Marazzato Sparano | Monica Foresi | Johani Innis | Lifeng Han | Goran Nenadic
Proceedings of the 16th Conference of the Association for Machine Translation in the Americas (Volume 2: Presentations)
The year 2024 marks the 10th anniversary of the Multidimensional Quality Metrics (MQM) framework for analytic translation quality evaluation. The MQM error typology has been widely used by practitioners in the translation and localization industry and has served as the basis for many derivative projects. The annual Conference on Machine Translation (WMT) shared tasks on both human and automatic translation quality evaluations used the MQM error typology. The metric stands on two pillars: error typology and the scoring model. The scoring model calculates the quality score from annotation data, detailing how to convert error type and severity counts into numeric scores to determine if the content meets specifications. Previously, only the raw scoring model had been published. This April, the MQM Council published the Linear Calibrated Scoring Model, officially presented herein, along with the Non-Linear Scoring Model, which had not been published
2018
Termbase Exchange (TBX)
Sue Wright
Proceedings of the AMTA 2018 Workshop on The Role of Authoritative Standards in the MT Environment
2008
ISOcat: Corralling Data Categories in the Wild
Marc Kemps-Snijders | Menzo Windhouwer | Peter Wittenburg | Sue Ellen Wright
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
To achieve true interoperability for valuable linguistic resources, different levels of variation need to be addressed. ISO Technical Committee 37, Terminology and other language and content resources, is developing a Data Category Registry. This registry will provide a reusable set of data categories. A new implementation of the registry, dubbed ISOcat, is currently under construction. This paper briefly describes the new data model for data categories that will be introduced in this implementation, and then sketches the standardization process. Completed data categories can be reused by the community, either by selecting data categories through the ISOcat web interface or through other tools that interact with the ISOcat system via one of its Application Programming Interfaces. Linguistic resources that use data categories from the registry should include persistent references, e.g. in the metadata or schemata of the resource, which point back to their origin. These data category references can then be used to determine whether two or more resources share common semantics, thus providing a level of interoperability close to the source data and a promising layer for semantic alignment on higher levels.
2004
A Global Data Category Registry for Interoperable Language Resources
Sue Ellen Wright
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
1999
Integrating Translation Technologies Using SALT
Gerhard Budin | Alan K. Melby | Sue Ellen Wright | Deryle Lonsdale | Arle Lommel
Proceedings of Translating and the Computer 21
1991
TEI-TERM: an SGML-based interchange format for terminology files
Alan Melby | Sue Ellen Wright
Proceedings of Translating and the Computer 13: The theory and practice of machine translation – a marriage of convenience?