Yong Xu


2016

pdf bib
Lecture bilingue augmentée par des alignements multi-niveaux (Augmenting bilingual reading with alignment information)
François Yvon | Yong Xu | Marianna Apidianaki | Clément Pillias | Cubaud Pierre
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 5 : Démonstrations

Le travail qui a conduit à cette démonstration combine des outils de traitement des langues multilingues, en particulier l’alignement automatique, avec des techniques de visualisation et d’interaction. Il vise à proposer des pistes pour le développement d’outils permettant de lire simultanément les différentes versions d’un texte disponible en plusieurs langues, avec des applications en lecture de loisir ou en lecture professionnelle.

pdf bib
Novel elicitation and annotation schemes for sentential and sub-sentential alignments of bitexts
Yong Xu | François Yvon
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Resources for evaluating sentence-level and word-level alignment algorithms are unsatisfactory. Regarding sentence alignments, the existing data is too scarce, especially when it comes to difficult bitexts, containing instances of non-literal translations. Regarding word-level alignments, most available hand-aligned data provide a complete annotation at the level of words that is difficult to exploit, for lack of a clear semantics for alignment links. In this study, we propose new methodologies for collecting human judgements on alignment links, which have been used to annotate 4 new data sets, at the sentence and at the word level. These will be released online, with the hope that they will prove useful to evaluate alignment software and quality estimation tools for automatic alignment. Keywords: Parallel corpora, Sentence Alignments, Word Alignments, Confidence Estimation

pdf bib
TransRead: Designing a Bilingual Reading Experience with Machine Translation Technologies
François Yvon | Yong Xu | Marianna Apidianaki | Clément Pillias | Pierre Cubaud
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

2015

pdf bib
Sentence alignment for literary texts: The state-of-the-art and beyond
Yong Xu | Aurélien Max | François Yvon
Linguistic Issues in Language Technology, Volume 12, 2015 - Literature Lifts up Computational Linguistics

Literary works are becoming increasingly available in electronic formats, thus quickly transforming editorial processes and reading habits. In the context of the global enthusiasm for multilingualism, the rapid spread of e-book readers, such as Amazon Kindle R or Kobo Touch R , fosters the development of a new generation of reading tools for bilingual books. In particular, literary works, when available in several languages, offer an attractive perspective for self-development or everyday leisure reading, but also for activities such as language learning, translation or literary studies. An important issue in the automatic processing of multilingual e-books is the alignment between textual units. Alignment could help identify corresponding text units in different languages, which would be particularly beneficial to bilingual readers and translation professionals. Computing automatic alignments for literary works, however, is a task more challenging than in the case of better behaved corpora such as parliamentary proceedings or technical manuals. In this paper, we revisit the problem of computing high-quality. alignment for literary works. We first perform a large-scale evaluation of automatic alignment for literary texts, which provides a fair assessment of the actual difficulty of this task. We then introduce a two-pass approach, based on a maximum entropy model. Experimental results for novels available in English and French or in English and Spanish demonstrate the effectiveness of our method.

2002

pdf bib
基於詞彙語義的百科辭典知識提取實驗 (An Experiment on Knowledge Extraction from an Encyclopedia Based on Lexicon Semantics) [In Chinese]
Rou Song | Yong Xu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational Chinese Lexical Semantics