Felix Sasaki

2018

A Framework for the Needs of Different Types of Users in Multilingual Semantic Enrichment
Jan Nehring | Felix Sasaki
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Digital curation technologies (DKT)
Georg Rehm | Felix Sasaki
Proceedings of the 19th Annual Conference of the European Association for Machine Translation: Projects/Products

How to configure statistical machine translation with linked open data resources
Ankit Srivastava | Felix Sasaki | Peter Bourgonje | Julian Moreno-Schneider | Jan Nehring | Georg Rehm
Proceedings of Translating and the Computer 38

In recent years, Linked Data and Language Technology solutions have gained popularity. Nevertheless, their coupling in real-world business is limited due to several issues. Existing products and services are developed for a particular domain, can be used only in combination with already integrated datasets or their language coverage is limited. In this paper, we present an innovative solution FREME - an open framework of e-Services for multilingual and semantic enrichment of digital content. The framework integrates six interoperable e-Services. We describe the core features of each e-Service and illustrate their usage in the context of four business cases: i) authoring and publishing; ii) translation and localisation; iii) cross-lingual access to data; and iv) personalised Web content recommendations. Business cases drive the design and development of the framework.

Processing Document Collections to Automatically Extract Linked Data: Semantic Storytelling Technologies for Smart Curation Workflows
Peter Bourgonje | Julian Moreno Schneider | Georg Rehm | Felix Sasaki
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)

2014

As language resources start to become available in linked data formats, it becomes relevant to consider how linked data interoperability can play a role in active language processing workflows as well as for more static language resource publishing. This paper proposes that linked data may have a valuable role to play in tracking the use and generation of language resources in such workflows in order to assess and improve the performance of the language technologies that use the resources, based on feedback from the human involvement typically required within such processes. We refer to this as Active Curation of the language resources, since it is performed systematically over language processing workflows to continuously improve the quality of the resource in specific applications, rather than via dedicated curation steps. We use modern localisation workflows, i.e. assisted by machine translation and text analytics services, to explain how linked data can support such active curation. By referencing how a suitable linked data vocabulary can be assembled by combining existing linked data vocabularies and meta-data from other multilingual content processing annotations and tool exchange standards we aim to demonstrate the relative ease with which active curation can be deployed more broadly.

2013

Implementing ITS 2.0 for Post-editing Purposes
Celia Rico | Pedro L. Diez Orzas | Felix Sasaki
Proceedings of Machine Translation Summit XIV: User track

MATECAT: Machine Translation Enhanced Computer Assisted Translation META - Multilingual Europe Technology Alliance
Georg Rehm | Aljoscha Burchardt | Felix Sasaki
Proceedings of Machine Translation Summit XIV: European projects

META - Multilingual Europe Technology Alliance
Georg Rehm | Aljoscha Burchardt | Felix Sasaki
Proceedings of Machine Translation Summit XIV: European projects

2012

Evaluating the Impact of Phrase Recognition on Concept Tagging
Pablo Mendes | Joachim Daiber | Rohana Rajapakse | Felix Sasaki | Christian Bizer
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We have developed DBpedia Spotlight, a flexible concept tagging system that is able to annotate entities, topics and other terms in natural language text. The system starts by recognizing phrases to annotate in the input text, and subsequently disambiguates them to a reference knowledge base extracted from Wikipedia. In this paper we evaluate the impact of the phrase recognition step on the ability of the system to correctly reproduce the annotations of a gold standard in an unsupervised setting. We argue that a combination of techniques is needed, and we evaluate a number of alternatives according to an existing evaluation set.

2006

Work within the W3C Internationalization Activity and its Benefit for the Creation and Manipulation of Language Resources
Felix Sasaki
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper introduces ongoing and current work within the Internationalization (i18n) Activity of the World Wide Web Consortium (W3C). The focus is on aspects of the W3C i18n Activity which are of benefit for the creation and manipulation of multilingual language resources. In particular, the paper deals with ongoing work concerning encoding, visualization and processing of characters; current work on language and locale identification; and current work on internationalization of markup. The main usage scenario is the design of multilingual corpora. This includes issues of corpus creation and manipulation.

Proceedings of the Workshop on Multilingual Language Resources and Interoperability
Andreas Witt | Gilles Sérasset | Susan Armstrong | Jim Breen | Ulrich Heid | Felix Sasaki
Proceedings of the Workshop on Multilingual Language Resources and Interoperability