Anthony Hartley

Also published as: Anthony F. Hartley, A. Hartley

2017

pdf bib abs
Consistent Classification of Translation Revisions: A Case Study of English-Japanese Student Translations
Atsushi Fujita | Kikuko Tanabe | Chiho Toyoshima | Mayuka Yamamoto | Kyo Kageura | Anthony Hartley
Proceedings of the 11th Linguistic Annotation Workshop

Consistency is a crucial requirement in text annotation. It is especially important in educational applications, as lack of consistency directly affects learners’ motivation and learning performance. This paper presents a quality assessment scheme for English-to-Japanese translations produced by learner translators at university. We constructed a revision typology and a decision tree manually through an application of the OntoNotes method, i.e., an iteration of assessing learners’ translations and hypothesizing the conditions for consistent decision making, as well as re-organizing the typology. Intrinsic evaluation of the created scheme confirmed its potential contribution to the consistent classification of identified erroneous text spans, achieving visibly higher Cohen’s kappa values, up to 0.831, than previous work. This paper also describes an application of our scheme to an English-to-Japanese translation exercise course for undergraduate students at a university in Japan.

2016

pdf bib abs
MuTUAL: A Controlled Authoring Support System Enabling Contextual Machine Translation
Rei Miyata | Anthony Hartley | Kyo Kageura | Cécile Paris | Masao Utiyama | Eiichiro Sumita
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

The paper introduces a web-based authoring support system, MuTUAL, which aims to help writers create multilingual texts. The highlighted feature of the system is that it enables machine translation (MT) to generate outputs appropriate to their functional context within the target document. Our system is operational online, implementing core mechanisms for document structuring and controlled writing. These include a topic template and a controlled language authoring assistant, linked to our statistical MT system.

We report on an on-going research project aimed at increasing the range of translation equivalents which can be automatically discovered by MT systems. The methodology is based on semi-supervised learning of indirect translation strategies from large comparable corpora and applying them in run-time to generate novel, previously unseen translation equivalents. This approach is different from methods based on parallel resources, which currently can reuse only individual translation equivalents. Instead it models translation strategies which generalise individual equivalents and can successfully generate an open class of new translation solutions. The task of the project is integration of the developed technology into open-source MT systems.

pdf bib abs
Sensitivity of Automated MT Evaluation Metrics on Higher Quality MT Output: BLEU vs Task-Based Evaluation Methods
Bogdan Babych | Anthony Hartley
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We report the results of our experiment on assessing the ability of automated MT evaluation metrics to remain sensitive to variations in MT quality as the average quality of the compared systems goes up. We compare two groups of metrics: those, which measure the proximity of MT output to some reference translation, and those which evaluate the performance of some automated process on degraded MT output. The experiment shows that proximity-based metrics (such as BLEU) loose sensitivity as the scores go up, but performance-based metrics (e.g., Named Entity recognition from MT output) remain sensitive across the scale. We suggest a model for explaining this result, which attributes stable sensitivity of performance-based metrics to measuring cumulative functional effect of different language levels, while proximity-based metrics measure structural matches on a lexical level and therefore miss higher-level errors that are more typical for better MT systems. Development of new automated metrics should take into account possible decline in sensitivity on higher-quality MT, which should be tested as part of meta-evaluation of the metrics.

pdf bib abs
Corpus-Based Tools for Computer-Assisted Acquisition of Reading Abilities in Cognate Languages
Svitlana Kurella | Serge Sharoff | Anthony Hartley
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper presents an approach to computer-assisted teaching of reading abilities using corpus data. The approach is supported by a set of tools for automatically selecting and classifying texts retrieved from the Internet. The approach is based on a linguistic model of textual cohesion which describes relations between larger textual units that go beyond the sentence level. We show that textual connectors that link such textual units reliably predict different types of texts, such as information and opinion: using only textual connectors as features, an SVM classifier achieves an F-score of between 0.85 and 0.93 for predicting these classes. The tools are used in our project on teaching reading skills in a cognate foreign language (L3) which is cognate to a known foreign language (L2).

2007

pdf bib
Translating from under-resourced languages: comparing direct transfer against pivot translation
Bogdan Babych | Anthony Hartley | Serge Sharoff
Proceedings of Machine Translation Summit XI: Papers

pdf bib
Assessing human and automated quality judgments in the French MT evaluation campaign CESTA
Olivier Hamon | Anthony Hartley | Andrei Popescu-Belis | Khalid Choukri
Proceedings of Machine Translation Summit XI: Papers

bib
Sensitivity of automated models for MT evaluation: proximity-based vs. performance-based methods
Bogdan Babych | Anthony Hartley
Proceedings of the Workshop on Automatic procedures in MT evaluation

pdf bib
A dynamic dictionary for discovering indirect translation equivalents
Bogdan Babych | Anthony Hartley | Serge Sharoff
Proceedings of Translating and the Computer 29

pdf bib
Assisting Translators in Indirect Lexical Transfer
Bogdan Babych | Anthony Hartley | Serge Sharoff | Olga Mudraya
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

pdf bib abs
Using collocations from comparable corpora to find translation equivalents
Serge Sharoff | Bogdan Babych | Anthony Hartley
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we present a tool for finding appropriate translation equivalents for words from the general lexicon using comparable corpora. For a phrase in the source language the tool suggests arange of possible expressions used in similar contexts in target language corpora. In the paper we discuss the method and present results of human evaluation of the performance of the tool.

This article outlines the evaluation protocol and provides the main results of the French Evaluation Campaign for Machine Translation Systems, CESTA. Following the initial objectives and evaluation plans, the evaluation metrics are briefly described: along with fluency and adequacy assessed by human judges, a number of recently proposed automated metrics are used. Two evaluation campaigns were organized, the first one in the general domain, and the second one in the medical domain. Up to six systems translating from English into French, and two systems translating from Arabic into French, took part in the campaign. The numerical results illustrate the differences between classes of systems, and provide interesting indications about the reliability of the automated metrics for French as a target language, both by comparison to human judges and using correlations between metrics. The corpora that were produced, as well as the information about the reliability of metrics, constitute reusable resources for MT evaluation.

pdf bib
Using Comparable Corpora to Solve Problems Difficult for Human Translators
Serge Sharoff | Bogdan Babych | Anthony Hartley
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2005

pdf bib abs
Estimating the predictive Power of N-gram MT Evaluation Metrics across Language and Text Types
Bogdan Babych | Anthony Hartley | Debbie Elliott
Proceedings of Machine Translation Summit X: Posters

The use of n-gram metrics to evaluate the output of MT systems is widespread. Typically, they are used in system development, where an increase in the score is taken to represent an improvement in the output of the system. However, purchasers of MT systems or services are more concerned to know how well a score predicts the acceptability of the output to a reader-user. Moreover, they usually want to know if these predictions will hold across a range of target languages and text types. We describe an experiment involving human and automated evaluations of four MT systems across two text types and 23 language directions. It establishes that the correlation between human and automated scores is high, but that the predictive power of these scores depends crucially on target language and text type.

2004

pdf bib abs
A fluency error categorization scheme to guide automated machine translation evaluation
Debbie Elliott | Anthony Hartley | Eric Atwell
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers

Existing automated MT evaluation methods often require expert human translations. These are produced for every language pair evaluated and, due to this expense, subsequent evaluations tend to rely on the same texts, which do not necessarily reflect real MT use. In contrast, we are designing an automated MT evaluation system, intended for use by post-editors, purchasers and developers, that requires nothing but the raw MT output. Furthermore, our research is based on texts that reflect corporate use of MT. This paper describes our first step in system design: a hierarchical classification scheme of fluency errors in English MT output, to enable us to identify error types and frequencies, and guide the selection of errors for automated detection. We present results from the statistical analysis of 20,000 words of MT output, manually annotated using our classification scheme, and describe correlations between error frequencies and human scores for fluency and adequacy.

pdf bib
Disambiguating translation strategies in MT using automatic named entity recognition
Bogdan Babych | Anthony Hartley
Proceedings of the 9th EAMT Workshop: Broadening horizons of machine translation and its applications

pdf bib
Extending MT evaluation tools with translation complexity metrics
Bogdan Babych | Debbie Elliott | Anthony Hartley
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

pdf bib
Calibrating Resource-light Automatic MT Evaluation: a Cheap Approach to Ranking MT Systems by the Usability of Their Output
Bogdan Babych | Debbie Elliott | Anthony Hartley
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
Modelling Legitimate Translation Variation for Automatic Evaluation of MT Quality
Bogdan Babych | Anthony Hartley
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

pdf bib
Multilingual generation of controlled languages
Richard Power | Donia Scott | Anthony Hartley
EAMT Workshop: Improving MT through other language technology tools: resources and tools for building MT

pdf bib
Improving Machine Translation Quality with Automatic Named Entity Recognition
Bogdan Babych | Anthony Hartley
Proceedings of the 7th International EAMT workshop on MT and other language technology tools, Improving MT through other language technology tools, Resource and tools for building MT at EACL 2003

2002

pdf bib
Automatic Ranking of MT Systems
Martin Rajman | Anthony Hartley
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2001

pdf bib abs
AGILE - a system for multilingual generation of technical instructions
Anthony Hartley | Donia Scott | John Bateman | Danail Dochev
Proceedings of Machine Translation Summit VIII

This paper presents a multilingual Natural Language Generation system that produces technical instruction texts in Bulgarian, Czech and Russian. It generates several types of texts, common for software manuals, in two styles. We illustrate the system’s functionality with examples of its input and output behaviour. We discuss the criteria and procedures adopted for evaluating the system and summarise their results. The system embodies novel approaches to providing multilingual documentation, ranging from the re-use of a large-scale, broad coverage grammar of English in order to develop the lexico-grammatical resources necessary for the generation in the three target languages, through to the adoption of a ‘knowledge editing’ approach to specifying the desired content of the texts to be generated independently of the target languages in which those texts finally appear.

pdf bib
Evaluating Text Quality: Judging Output Texts Without a Clear Source
Anthony Hartley | Donia Scott
Proceedings of the ACL 2001 Eighth European Workshop on Natural Language Generation (EWNLG)