This paper describes AISP-SJTU’s participation in the WMT 2022 general MT shared task. We participated in four translation directions: English-Chinese, Chinese-English, English-Japanese and Japanese-English. Our systems are based on the Transformer architecture, with several novel and effective variants that differ in network depth and internal structure. In our experiments, we employ data filtering, large-scale back-translation, knowledge distillation, forward-translation, iterative in-domain knowledge fine-tuning and model ensembling. The constrained systems achieve case-sensitive BLEU scores of 48.8, 29.7, 39.3 and 22.0 on EN-ZH, ZH-EN, EN-JA and JA-EN, respectively.
In logographic languages like Chinese, word meanings are constructed using specific character formations, which can help to disambiguate word senses and benefit sentiment classification. However, such knowledge has rarely been explored in previous sentiment analysis methods. In this paper, we explore logographic information for aspect-based sentiment classification in Chinese text. Specifically, we employ a logographic image to capture the internal morphological structure of the character sequence. The logographic image is also used to learn the external relations between context and aspect words. Furthermore, we propose a multimodal language model that explicitly incorporates the logographic image with the review text for aspect-based sentiment classification in Chinese. Experimental results show that our method brings substantial performance improvements over strong baselines. The results also indicate that the logographic image plays an important role in capturing the internal structure and external relations of the character sequence.