Proceedings of Machine Translation Summit XIII: Tutorial Abstracts
- Anthology ID: 2011.mtsummit-tutorials
- Month: September 19
- Year: 2011
- Address: Xiamen, China
- Venue: MTSummit
- URL: https://aclanthology.org/2011.mtsummit-tutorials/
Syntactic SMT and Semantic SMT
Dekai Wu
Over the past twenty years, we have attacked the historical methodological barriers between statistical machine translation and traditional models of syntax, semantics, and structure. In this tutorial, we will survey some of the central issues and techniques from each of these aspects, with an emphasis on 'deeply theoretically integrated' models, rather than hybrid approaches such as superficial statistical aggregation or system combination of outputs produced by traditional symbolic components. On syntactic SMT, we will explore the trade-offs for SMT between learnability and representational expressiveness. After establishing a foundation in the theory and practice of stochastic transduction grammars, we will examine very recent approaches to the automatic unsupervised induction of various classes of transduction grammars. We will show why stochastic linear transduction grammars (LTGs and LITGs) and their preterminalized variants (PLITGs) are proving to be particularly intriguing models for bootstrapping the induction of full-fledged stochastic inversion transduction grammars (ITGs). On semantic SMT, we will explore the trade-offs for SMT involved in applying various lexical semantics models. We will first examine word sense disambiguation, and discuss why traditional WSD models that are not deeply integrated within the SMT model tend, surprisingly, to fail. In contrast, we will show how a deeply embedded phrase sense disambiguation (PSD) approach succeeds where traditional WSD does not. We will then turn to semantic role labeling, and discuss the challenges faced by early approaches to applying SRL models to SMT. Finally, on semantic MT evaluation, we will explore some very new human and semi-automatic metrics based on semantic frame agreement. We show that by keeping the metrics deeply grounded within the theoretical framework of semantic frames, the new HMEANT and MEANT metrics can significantly outperform even the state-of-the-art expensive HTER and TER metrics, while at the same time maintaining the desirable characteristics of simplicity, inexpensiveness, and representational transparency.
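To make the expressiveness side of the trade-off concrete, the following is a minimal sketch, not taken from the tutorial itself. It checks whether a word reordering can be produced by combining adjacent spans in only straight or inverted order, which is the reordering constraint of binary ITGs. The function name `is_itg_reorderable` and the encoding of a reordering as a permutation of 1..n are assumptions made for illustration.

```python
def is_itg_reorderable(perm):
    """Return True if `perm` (a permutation of 1..n) can be built by
    repeatedly joining two adjacent spans in straight or inverted order.

    Illustrative helper (hypothetical name), using a shift-reduce pass.
    """
    stack = []  # each entry is a (low, high) range of target positions
    for value in perm:
        low, high = value, value
        # Merge with the span on top of the stack while the two spans
        # cover adjacent ranges: straight order (top ends just below
        # `low`) or inverted order (top starts just above `high`).
        while stack and (stack[-1][1] + 1 == low or high + 1 == stack[-1][0]):
            prev_low, prev_high = stack.pop()
            low, high = min(low, prev_low), max(high, prev_high)
        stack.append((low, high))
    # The reordering is reachable only if everything merged into one span.
    return len(stack) == 1


if __name__ == "__main__":
    for candidate in ([1, 2, 3, 4], [3, 4, 1, 2], [2, 4, 1, 3], [3, 1, 4, 2]):
        print(candidate, is_itg_reorderable(candidate))
```

Under this sketch, the two "inside-out" orders `[2, 4, 1, 3]` and `[3, 1, 4, 2]` are rejected, while monotone, fully inverted, and block-swapped orders are accepted. Restricting the reordering space in this way is one reason constrained transduction grammars can be easier to learn than unrestricted alignment models.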
From the Confidence Estimation of Machine Translation to the Integration of MT and Translation Memory
Yanjun Ma | Yifan He | Josef van Genabith
In this tutorial, we cover techniques that facilitate the integration of Machine Translation (MT) and Translation Memory (TM), which can help the adoption of MT technology in the localisation industry. The tutorial covers five parts: i) a brief introduction to MT and TM systems, ii) MT confidence estimation measures tailored for the TM environment, iii) segment-level MT and TM integration, iv) sub-segment-level MT and TM integration, and v) human evaluation of MT and TM integration. We will first briefly describe and compare how translations are generated in MT and TM systems, and suggest possible avenues to combine these two systems. We will also cover current quality and cost estimation measures applied in MT and TM systems, such as the fuzzy-match score in the TM, and the evaluation and confidence metrics used to judge MT outputs. We then move on to introduce the recent developments in the field of MT confidence estimation tailored towards predicting post-editing effort. We will especially focus on the confidence metrics proposed by Specia et al., which are shown to have high correlation with human preference, as well as with post-editing time. For segment-level MT and TM integration, we present translation recommendation and translation re-ranking models, where the integration happens at the 1-best or the N-best level, respectively. Given an input to be translated, MT-TM recommendation compares the output from the MT and the TM systems, and presents the better one to the post-editor. MT-TM re-ranking, on the other hand, combines N-best lists from both systems, and generates a new list according to estimated post-editing effort. We observe high precision of these models in automatic and human evaluations, indicating that they can be integrated into TM environments without the risk of deteriorating the quality of the post-editing candidate. For sub-segment-level MT and TM integration, we try to reuse high-quality TM chunks to improve the quality of MT systems. We can also predict whether phrase pairs derived from fuzzy matches should be used to constrain the translation of an input segment. Using a series of linguistically motivated features, our constraints lead both to more consistent translation output and to improved translation quality, as measured by automatic evaluation scores. Finally, we present several methodologies that can be used to track post-editing effort, perform human evaluation of MT-TM integration, or help translators to access MT outputs in a TM environment.
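As a rough illustration of two ideas mentioned in this abstract, the sketch below computes a word-level fuzzy-match score and picks between an MT and a TM candidate. It is not the authors' method. The score definition (one minus normalised word edit distance) is one common convention and is assumed here, and the names `fuzzy_match_score` and `recommend`, the effort estimates passed in, and the tie-breaking in favour of the TM are all hypothetical.

```python
def word_edit_distance(a, b):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_match_score(source, tm_source):
    """Assumed definition: 1 - edit distance / length of the longer segment."""
    a, b = source.split(), tm_source.split()
    if not a and not b:
        return 1.0
    return 1.0 - word_edit_distance(a, b) / max(len(a), len(b))


def recommend(mt_output, tm_output, mt_effort, tm_effort):
    """Return the candidate with the lower estimated post-editing effort.

    The effort values are hypothetical inputs, e.g. from a confidence
    estimator; ties go to the TM candidate (an arbitrary choice).
    """
    if mt_effort < tm_effort:
        return ("MT", mt_output)
    return ("TM", tm_output)


if __name__ == "__main__":
    print(fuzzy_match_score("click the save button", "click the cancel button"))
    print(recommend("mt candidate", "tm candidate", 0.2, 0.5))
```

In a real setting the effort estimates would come from a trained model rather than being supplied by hand; the sketch only shows where such estimates would plug into a 1-best recommendation step.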
Evaluating the Output of Machine Translation Systems
Alon Lavie
This half-day tutorial provides a broad overview of how to evaluate translations that are produced by machine translation systems. The issues covered include a broad survey of both human evaluation measures and commonly used automated metrics, and a review of how these are used for various types of evaluation tasks, such as assessing the translation quality of MT-translated sentences, comparing the performance of alternative MT systems, or measuring the productivity gains of incorporating MT into translation workflows.
Productive Use of MT in Localization
Mirko Plitt
Localization is a term mainly used in the software industry to designate the adaptation of products to meet local market needs. At the center of this process lies the translation of the most visible part of the product – the user interface – and the product documentation. Not surprisingly, the localization industry has therefore long been an extensive consumer of translation technology and a key contributor to its progress. Software products are typically released in recurrent cycles, with large amounts of content remaining unchanged or undergoing only minor modifications from one release to the next. In addition, software development cycles are short, forcing translation to start while the product is still undergoing changes, so that localized products can reach global markets in a timely fashion. These two aspects result in a heavy dependency on the efficient handling of translation updates. It is only natural that the software industry turned to software-based productivity tools to automate the recycling of translations (through translation memories) and to support the management of the translation workflow (through translation management systems). Machine translation is a relatively recent addition to the localization technology mix, and not yet as widely adopted as one would expect. Its initial use in the software industry was for more accessory content that is otherwise often left untranslated, e.g. product support articles and antivirus alerts with their short lifecycle. The expectation, however, had always been that MT could one day be deployed on the bulk of user interface and product documentation, due to the expected process efficiencies and cost savings. While MT is generally still not considered “good” enough to be used raw on this type of content, it has now become an integral part of translation productivity environments, thereby transforming translators into post-editors.
The tutorial will provide an overview of current localization practices and challenges, with a special focus on the role of translation memory and translation management technologies. As a use case of the integration of MT in such an environment, we will then present the approach taken by Autodesk with its large set of Moses engines trained on custom data. Finally, we will explore typical scenarios in which machine translation is employed in the localization industry, using practical examples and data gathered in different productivity and usability tests.