Bianka Buschbeck

Also published as: B. Buschbeck, Bianka Buschbeck-Wolf


2020

Incorporating External Annotation to improve Named Entity Translation in NMT
Maciej Modrzejewski | Miriam Exel | Bianka Buschbeck | Thanh-Le Ha | Alexander Waibel
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

The correct translation of named entities (NEs) still poses a challenge for conventional neural machine translation (NMT) systems. This study explores methods for incorporating named entity recognition (NER) into NMT with the aim of improving named entity translation. It proposes an annotation method that integrates named entities and inside–outside–beginning (IOB) tagging into the neural network input with the use of source factors. Our experiments on English→German and English→Chinese show that just by including different NE classes and IOB tagging, we can increase the BLEU score by around 1 point on the standard test set from WMT2019 and achieve up to a 12% increase in NE translation rates over a strong baseline.

Terminology-Constrained Neural Machine Translation at SAP
Miriam Exel | Bianka Buschbeck | Lauritz Brandt | Simona Doneva
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

This paper examines approaches to bias a neural machine translation model to adhere to terminology constraints in an industrial setup. In particular, we investigate variations of the approach by Dinu et al. (2019), which uses inline annotation of the target terms in the source segment plus source factor embeddings during training and inference, and compare them to constrained decoding. We describe the challenges with respect to terminology in our usage scenario at SAP and show how far the investigated methods can help to overcome them. We extend the original study to a new language pair and provide an in-depth evaluation including an error classification and a human evaluation.

A Parallel Evaluation Data Set of Software Documentation with Document Structure Annotation
Bianka Buschbeck | Miriam Exel
Proceedings of the 7th Workshop on Asian Translation

This paper accompanies the software documentation data set for machine translation, a parallel evaluation data set originating from the SAP Help Portal, which we released to the machine translation community for research purposes. It offers the possibility to tune and evaluate machine translation systems in the domain of corporate software documentation and contributes to the availability of a wider range of evaluation scenarios. The data set comprises the language pairs English to Hindi, Indonesian, Malay and Thai, and thus also increases the test coverage for the many low-resource language pairs. Unlike most evaluation data sets, which consist of plain parallel text, the segments in this data set come with additional metadata that describes structural information of the document context. We provide insights into the origin and creation, the particularities and characteristics of the data set as well as machine translation results.

2013

Joint WMT 2013 Submission of the QUAERO Project
Stephan Peitz | Saab Mansour | Matthias Huck | Markus Freitag | Hermann Ney | Eunah Cho | Teresa Herrmann | Mohammed Mediani | Jan Niehues | Alex Waibel | Alexandre Allauzen | Quoc Khanh Do | Bianka Buschbeck | Tonio Wandmacher
Proceedings of the Eighth Workshop on Statistical Machine Translation

2012

Joint WMT 2012 Submission of the QUAERO Project
Markus Freitag | Stephan Peitz | Matthias Huck | Hermann Ney | Jan Niehues | Teresa Herrmann | Alex Waibel | Hai-son Le | Thomas Lavergne | Alexandre Allauzen | Bianka Buschbeck | Josep Maria Crego | Jean Senellart
Proceedings of the Seventh Workshop on Statistical Machine Translation

2011

Joint WMT Submission of the QUAERO Project
Markus Freitag | Gregor Leusch | Joern Wuebker | Stephan Peitz | Hermann Ney | Teresa Herrmann | Jan Niehues | Alex Waibel | Alexandre Allauzen | Gilles Adda | Josep Maria Crego | Bianka Buschbeck | Tonio Wandmacher | Jean Senellart
Proceedings of the Sixth Workshop on Statistical Machine Translation

Advances on spoken language translation in the Quaero program
Karim Boudahmane | Bianka Buschbeck | Eunah Cho | Josep Maria Crego | Markus Freitag | Thomas Lavergne | Hermann Ney | Jan Niehues | Stephan Peitz | Jean Senellart | Artem Sokolov | Alex Waibel | Tonio Wandmacher | Joern Wuebker | François Yvon
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign

The Quaero program is an international project promoting research and industrial innovation on technologies for automatic analysis and classification of multimedia and multilingual documents. Within the program framework, research organizations and industrial partners collaborate to develop prototypes of innovative applications and services for access and usage of multimedia data. One of the topics addressed is the translation of spoken language. Each year, a project-internal evaluation is conducted by DGA to monitor the technological advances. This work describes the design and results of the 2011 evaluation campaign. The participating partners were RWTH, KIT, LIMSI and SYSTRAN. Their approaches are compared on both ASR output and reference transcripts of speech data for the translation between French and German. The results show that the developed techniques further the state of the art and improve translation quality.

1998

Managing information at linguistic interfaces
Johan Bos | C.J. Rupp | Bianka Buschbeck-Wolf | Michael Dorna
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

Managing Information at Linguistic Interfaces
Johan Bos | C.J. Rupp | Bianka Buschbeck-Wolf | Michael Dorna
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Quality and robustness in MT - A balancing act
Bianka Buschbeck-Wolf | Michael Dorna
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers

The speech-to-speech translation system Verbmobil integrates deep and shallow analysis modules that produce linguistic representations in parallel. Thus, the input representations for the transfer module differ with respect to their depth and quality. This gives rise to two problems: (i) the transfer database has to be adjusted according to input quality, and (ii) translations produced have to be ranked with respect to their quality in order to select the most appropriate result. This paper presents an operationalized solution to both problems.

1996

Abstraction and underspecification in semantic transfer
Bernd Abb | Bianka Buschbeck-Wolf | Christel Tschernitschek
Conference of the Association for Machine Translation in the Americas

1991

Limits of a Sentence Based Procedural Approach for Aspect Choice in German-Russian MT
Bianka Buschbeck | Renate Henschel | Iris Hoser | Gerda Klimonow | Andreas Kustner | Ingrid Starke
Fifth Conference of the European Chapter of the Association for Computational Linguistics

1990

VIRTEX - a German-Russian Translation Experiment
B. Buschbeck | R. Henschel | I. Hoser | G. Klimonow | A. Kustner | I. Starke
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics