Ziming Cheng


pdf bib
Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training
Yan Zeng | Wangchunshu Zhou | Ao Luo | Ziming Cheng | Xinsong Zhang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we introduce Cross-View Language Modeling, a simple and effective pre-training framework that unifies cross-lingual and cross-modal pre-training with shared architectures and objectives. Our approach is motivated by a key observation that cross-lingual and cross-modal pre-training share the same goal of aligning two different views of the same object into a common semantic space. To this end, the cross-view language modeling framework considers both multi-modal data (i.e., image-caption pairs) and multi-lingual data (i.e., parallel sentence pairs) as two different views of the same object, and trains the model to align the two views by maximizing the mutual information between them with conditional masked language modeling and contrastive learning. We pre-train CCLM, a Cross-lingual Cross-modal Language Model, with the cross-view language modeling framework. Empirical results on IGLUE, a multi-lingual multi-modal benchmark, and two multi-lingual image-text retrieval datasets show that while conceptually simpler, CCLM significantly outperforms the prior state-of-the-art with an average absolute improvement of over 10%. Moreover, CCLM is the first multi-lingual multi-modal pre-trained model that surpasses the translate-test performance of representative English vision-language models by zero-shot cross-lingual transfer.


pdf bib
BiBL: AMR Parsing and Generation with Bidirectional Bayesian Learning
Ziming Cheng | Zuchao Li | Hai Zhao
Proceedings of the 29th International Conference on Computational Linguistics

Abstract Meaning Representation (AMR) offers a unified semantic representation for natural language sentences. Thus transformation between AMR and text yields two transition tasks in opposite directions, i.e., Text-to-AMR parsing and AMR-to-Text generation. Existing AMR studies only focus on one-side improvements despite the duality of the two tasks, and their improvements are greatly attributed to the inclusion of large extra training data or complex structure modifications which harm the inference speed. Instead, we propose data-efficient Bidirectional Bayesian learning (BiBL) to facilitate bidirectional information transition by adopting a single-stage multitasking strategy so that the resulting model may enjoy much lighter training at the same time. Evaluation on benchmark datasets shows that our proposed BiBL outperforms strong previous seq2seq refinements without the help of extra data which is indispensable in existing counterpart models. We release the codes of BiBL at: https://github.com/KHAKhazeus/BiBL.