Federico Gaspari


2021

Building MT systems in low resourced languages for Public Sector users in Croatia, Iceland, Ireland, and Norway
Róisín Moran | Carla Parra Escartín | Akshai Ramesh | Páraic Sheridan | Jane Dunne | Federico Gaspari | Sheila Castilho | Natalia Resende | Andy Way
Proceedings of Machine Translation Summit XVIII: Users and Providers Track

When developing Machine Translation engines, low resourced language pairs tend to be in a disadvantaged position: less available data means that developing robust MT models can be more challenging. The EU-funded PRINCIPLE project aims at overcoming this challenge for four low resourced European languages: Norwegian, Croatian, Irish and Icelandic. This presentation will give an overview of the project, with a focus on the set of Public Sector users and their use cases for which we have developed MT solutions. We will discuss the range of language resources that have been gathered through contributions from public sector collaborators, and present the extensive evaluations that have been undertaken, including significant user evaluation of MT systems across all of the public sector participants in each of the four countries involved.

2020

ELRI: A Decentralised Network of National Relay Stations to Collect, Prepare and Share Language Resources
Thierry Etchegoyhen | Borja Anza Porras | Andoni Azpeitia | Eva Martínez Garcia | José Luis Fonseca | Patricia Fonseca | Paulo Vale | Jane Dunne | Federico Gaspari | Teresa Lynn | Helen McHugh | Andy Way | Victoria Arranz | Khalid Choukri | Hervé Pusset | Alexandre Sicard | Rui Neto | Maite Melero | David Perez | António Branco | Ruben Branco | Luís Gomes
Proceedings of the 1st International Workshop on Language Technology Platforms

We describe the European Language Resource Infrastructure (ELRI), a decentralised network to help collect, prepare and share language resources. The infrastructure was developed within a project co-funded by the Connecting Europe Facility Programme of the European Union, and has been deployed in the four Member States participating in the project, namely France, Ireland, Portugal and Spain. ELRI provides sustainable and flexible means to collect and share language resources via National Relay Stations, to which members of public institutions can freely subscribe. The infrastructure includes fully automated data processing engines to facilitate the preparation, sharing and wider reuse of useful language resources that can help optimise human and automated translation services in the European Union.

Progress of the PRINCIPLE Project: Promoting MT for Croatian, Icelandic, Irish and Norwegian
Andy Way | Petra Bago | Jane Dunne | Federico Gaspari | Andre Kåsen | Gauti Kristmannsson | Helen McHugh | Jon Arild Olsen | Dana Davis Sheridan | Páraic Sheridan | John Tinsley
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

This paper updates the progress made on the PRINCIPLE project, a 2-year action funded by the European Commission under the Connecting Europe Facility (CEF) programme. PRINCIPLE focuses on collecting high-quality language resources for Croatian, Icelandic, Irish and Norwegian, which have been identified as low-resource languages, especially for building effective machine translation (MT) systems. We report initial achievements of the project and ongoing activities aimed at promoting the uptake of neural MT for the low-resource languages of the project.

2019

Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks
Mikel Forcada | Andy Way | John Tinsley | Dimitar Shterionov | Celia Rico | Federico Gaspari
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

PRINCIPLE: Providing Resources in Irish, Norwegian, Croatian and Icelandic for the Purposes of Language Engineering
Andy Way | Federico Gaspari
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

Large-scale Machine Translation Evaluation of the iADAATPA Project
Sheila Castilho | Natália Resende | Federico Gaspari | Andy Way | Tony O’Dowd | Marek Mazur | Manuel Herranz | Alex Helle | Gema Ramírez-Sánchez | Víctor Sánchez-Cartagena | Mārcis Pinnis | Valters Šics
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

2018

Improving Machine Translation of Educational Content via Crowdsourcing
Maximiliana Behnke | Antonio Valerio Miceli Barone | Rico Sennrich | Vilelmini Sosoni | Thanasis Naskos | Eirini Takoulidou | Maria Stasimioti | Menno van Zaanen | Sheila Castilho | Federico Gaspari | Panayota Georgakopoulou | Valia Kordoni | Markus Egg | Katia Lida Kermanidis
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Enhancing Machine Translation of Academic Course Catalogues with Terminological Resources
Randy Scansani | Silvia Bernardini | Adriano Ferraresi | Federico Gaspari | Marcello Soffritti
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology

This paper describes an approach to translating course unit descriptions from Italian and German into English, using a phrase-based machine translation (MT) system. The genre is very prominent among those requiring translation by universities in European countries in which English is a non-native language. For each language combination, an in-domain bilingual corpus including course unit and degree program descriptions is used to train an MT engine, whose output is then compared to a baseline engine trained on the Europarl corpus. In a subsequent experiment, a bilingual terminology database is added to the training sets in both engines and its impact on the output quality is evaluated based on BLEU and post-editing score. Results suggest that the use of domain-specific corpora boosts the engines' quality for both language combinations, especially for German-English, whereas adding terminological resources does not seem to bring notable benefits.

2016

Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities
Meritxell Fernández Barrera | Vladimir Popescu | Antonio Toral | Federico Gaspari | Khalid Choukri
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper discusses the role that statistical machine translation (SMT) can play in the development of cross-border EU e-commerce, by highlighting extant obstacles and identifying relevant technologies to overcome them. In this sense, it firstly proposes a typology of e-commerce static and dynamic textual genres and it identifies those that may be more successfully targeted by SMT. The specific challenges concerning the automatic translation of user-generated content are discussed in detail. Secondly, the paper highlights the risk of data sparsity inherent to e-commerce and it explores the state-of-the-art strategies to achieve domain adequacy via adaptation. Thirdly, it proposes a robust workflow for the development of SMT systems adapted to the e-commerce domain by relying on inexpensive methods. Given the scarcity of user-generated language corpora for most language pairs, the paper proposes to obtain monolingual target-language data to train language models and aligned parallel corpora to tune and evaluate MT systems by means of crowdsourcing.

TraMOOC (Translation for Massive Open Online Courses): providing reliable MT for MOOCs
Valia Kordoni | Lexi Birch | Ioana Buliga | Kostadin Cholakov | Markus Egg | Federico Gaspari | Yota Georgakopolou | Maria Gialama | Iris Hendrickx | Mitja Jermol | Katia Kermanidis | Joss Moorkens | Davor Orlic | Michael Papadopoulos | Maja Popović | Rico Sennrich | Vilelmini Sosoni | Dimitrios Tsoumakos | Antal van den Bosch | Menno van Zaanen | Andy Way
Proceedings of the 19th Annual Conference of the European Association for Machine Translation: Projects/Products

2014

Perception vs. reality: measuring machine translation post-editing productivity
Federico Gaspari | Antonio Toral | Sudip Kumar Naskar | Declan Groves | Andy Way
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas

This paper presents a study of user-perceived vs real machine translation (MT) post-editing effort and productivity gains, focusing on two bidirectional language pairs: English—German and English—Dutch. Twenty experienced media professionals post-edited statistical MT output and also manually translated comparative texts within a production environment. The paper compares the actual post-editing time against the users’ perception of the effort and time required to post-edit the MT output to achieve publishable quality, thus measuring real (vs perceived) productivity gains. Although for all the language pairs users perceived MT post-editing to be slower, in fact it proved to be a faster option than manual translation for two translation directions out of four, i.e. for Dutch to English, and (marginally) for English to German. For further objective scrutiny, the paper also checks the correlation of three state-of-the-art automatic MT evaluation metrics (BLEU, METEOR and TER) with the actual post-editing time.

2013

Meta-Evaluation of a Diagnostic Quality Metric for Machine Translation
Sudip Kumar Naskar | Antonio Toral | Federico Gaspari | Declan Groves
Proceedings of Machine Translation Summit XIV: Papers

A Web Application for the Diagnostic Evaluation of Machine Translation over Specific Linguistic Phenomena
Antonio Toral | Sudip Kumar Naskar | Joris Vreeke | Federico Gaspari | Declan Groves
Proceedings of the 2013 NAACL HLT Demonstration Session

2011

A Framework for Diagnostic Evaluation of MT Based on Linguistic Checkpoints
Sudip Kumar Naskar | Antonio Toral | Federico Gaspari | Andy Way
Proceedings of Machine Translation Summit XIII: Papers

A Comparative Evaluation of Research vs. Online MT Systems
Antonio Toral | Federico Gaspari | Sudip Kumar Naskar | Andy Way
Proceedings of the 15th Annual conference of the European Association for Machine Translation

2007

Making a sow’s ear out of a silk purse: (mis)using online MT services as bilingual dictionaries
Federico Gaspari | Harold Somers
Proceedings of Translating and the Computer 29

Online and free! Ten years of online machine translation: origins, developments, current use and future prospects
Federico Gaspari | John Hutchins
Proceedings of Machine Translation Summit XI: Papers

Using free online MT in multilingual websites
Federico Gaspari | Harold Somers
Proceedings of Machine Translation Summit XI: Tutorials

2006

The Added Value of Free Online MT Services
Federico Gaspari
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper reports on an experiment investigating how effective free online machine translation (MT) is in helping Internet users to access the contents of websites written only in languages they do not know. This study explores the extent to which using Internet-based MT tools affects the confidence of web-surfers in the reliability of the information they find on websites available only in languages unfamiliar to them. The results of a case study for the language pair Italian-English involving 101 participants show that the chances of identifying correctly basic information (i.e. understanding the nature of websites and finding contact telephone numbers from their web-pages) are consistently enhanced to varying degrees (up to nearly 20%) by translating online content into a familiar language. In addition, confidence ratings given by users to the reliability and accuracy of the information they find are significantly higher (with increases between 5 and 11%) when they translate websites into their preferred language with free online MT services.

The Added Value of Free Online MT Services
Federico Gaspari
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: User Track Presentations

The social impact of online MT
Federico Gaspari
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Panel on machine translation for social impact

Detecting Inappropriate Use of Free Online Machine Translation by Language Students. A Special Case of Plagiarism Detection
Harold Somers | Federico Gaspari | Ana Niño
Proceedings of the 11th Annual conference of the European Association for Machine Translation

Look Who’s Translating. Impersonations, Chinese Whispers and Fun with Machine Translation on the Internet
Federico Gaspari
Proceedings of the 11th Annual conference of the European Association for Machine Translation

2005

Embedding Free Online Machine Translation into Monolingual Websites for Multilingual Dissemination: a Case Study of Implementation
Federico Gaspari
Translating and the Computer 27

2004

Integrating on-line MT services into monolingual web-sites for dissemination purposes: an evaluation perspective
Federico Gaspari
Proceedings of the 9th EAMT Workshop: Broadening horizons of machine translation and its applications

Online MT services and real users’ needs: an empirical usability evaluation
Federico Gaspari
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper presents an empirical evaluation of the main usability factors that play a significant role in the interaction with on-line Machine Translation (MT) services. The investigation is carried out from the point of view of typical users with an emphasis on their real needs, and focuses on a set of key usability criteria that have an impact on the successful deployment of Internet-based MT technology. A small-scale evaluation of the performance of five popular web-based MT systems against the selected usability criteria shows that different approaches to interaction design can dramatically affect the level of user satisfaction. There are strong indications that the results of this study can be fed back into the development of on-line MT services to enhance their design, thus ensuring that they meet the requirements and expectations of a wide range of Internet users.

2002

Using free on-line services in MT training
Federico Gaspari
Proceedings of the 6th EAMT Workshop: Teaching Machine Translation

2001

Teaching machine translation to trainee translators: a survey of their knowledge and opinions
Federico Gaspari
Workshop on Teaching Machine Translation

This paper reports upon a survey carried out among thirty-eight trainee translators who took courses on machine translation. The survey was conducted asking the sample of students to fill out a questionnaire both at the beginning and at the end of the MT course. The questions aimed at assessing the degree of knowledge about MT of the respondents and the opinions and impressions that they accordingly had on it. The results of the questionnaire were elaborated so as to investigate the relationship between the increase in the knowledge about MT after the conclusion of the course, and the corresponding change in the students’ attitude towards the discipline, which became much less biased and in general fairly positive, thanks to a very successful and rewarding learning process. The paper suggests that the more the trainee translators became familiar with MT, realising its reasonable potential and current limitations, the less afraid they were of it. These findings encourage the increasing integration and introduction of technology into translation curricula, since the impact of computer technology on language translation directly affects professional human translators. As a result, exposing trainee translators to machine translation seems to raise the profile of their training.