Yang Zhou

2024

pdf bib
Diffusion Models for Sign Language Video Anonymization
Zhaoyang Xia | Yang Zhou | Ligong Han | Carol Neidle | Dimitris N. Metaxas
Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources

pdf bib
A Multimodal Spatio-Temporal GCN Model with Enhancements for Isolated Sign Recognition
Yang Zhou | Zhaoyang Xia | Yuxiao Chen | Carol Neidle | Dimitris N. Metaxas
Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources

2023

pdf bib abs
Exploring the Effectiveness of Multi-Lingual Commonsense Knowledge-Aware Open-Domain Dialogue Response Generation
Sixing Wu | Jiong Yu | Tianshi Che | Yang Zhou | Wei Zhou
Findings of the Association for Computational Linguistics: EMNLP 2023

Prior works have shown the promising results of commonsense knowledge-aware models in improving informativeness while reducing the hallucination issue. Nonetheless, prior works often can only use monolingual knowledge whose language is consistent with the dialogue context. Except for a few high-resource languages, such as English and Chinese, most languages suffer from insufficient knowledge issues, especially minority languages. To this end, this work proposes a new task, Multi-Lingual Commonsense Knowledge-Aware Response Generation (MCKRG), which tries to use commonsense knowledge in other languages to enhance the current dialogue generation. Then, we construct a MCKRG dataset MCK-Dialog of seven languages with multiple alignment methods. Finally, we verify the effectiveness of using multi-lingual commonsense knowledge with a proposed MCK-T5 model. Extensive experimental results demonstrate the great potential of using multi-lingual commonsense knowledge in high-resource and low-resource languages. To the best of our knowledge, this work is the first to explore Multi-Lingual Commonsense Knowledge-Aware Response Generation.

Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data. However, the training process of Large Language Models (LLMs) generally incurs the update of significant parameters, which limits the applicability of FL techniques to tackle the LLMs in real scenarios. Prompt tuning can significantly reduce the number of parameters to update, but it either incurs performance degradation or low training efficiency. The straightforward utilization of prompt tuning in the FL often raises non-trivial communication costs and dramatically degrades performance. In addition, the decentralized data is generally non-Independent and Identically Distributed (non-IID), which brings client drift problems and thus poor performance. This paper proposes a Parameter-efficient prompt Tuning approach with Adaptive Optimization, i.e., FedPepTAO, to enable efficient and effective FL of LLMs. First, an efficient partial prompt tuning approach is proposed to improve performance and efficiency simultaneously. Second, a novel adaptive optimization method is developed to address the client drift problems on both the device and server sides to enhance performance further. Extensive experiments based on 10 datasets demonstrate the superb performance (up to 60.8% in terms of accuracy) and efficiency (up to 97.59% in terms of training time) of FedPepTAO compared with 9 baseline approaches. Our code is available at https://github.com/llm-eff/FedPepTAO.

pdf bib abs
Tell2Design: A Dataset for Language-Guided Floor Plan Generation
Sicong Leng | Yang Zhou | Mohammed Haroon Dupty | Wee Sun Lee | Sam Joyce | Wei Lu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We consider the task of generating designs directly from natural language descriptions, and consider floor plan generation as the initial research area. Language conditional generative models have recently been very successful in generating high-quality artistic images. However, designs must satisfy different constraints that are not present in generating artistic images, particularly spatial and relational constraints. We make multiple contributions to initiate research on this task. First, we introduce a novel dataset, Tell2Design (T2D), which contains more than 80k floor plan designs associated with natural language instructions. Second, we propose a Sequence-to-Sequence model that can serve as a strong baseline for future research. Third, we benchmark this task with several text-conditional image generation models. We conclude by conducting human evaluations on the generated samples and providing an analysis of human performance. We hope our contributions will propel the research on language-guided design generation forward.

2022

This paper describes AISP-SJTU’s submissions for the IWSLT 2022 Simultaneous Translation task. We participate in the text-to-text and speech-to-text simultaneous translation from English to Mandarin Chinese. The training of the CAAT is improved by training across multiple values of right context window size, which achieves good online performance without setting a prior right context window size for training. For speech-to-text task, the best model we submitted achieves 25.87, 26.21, 26.45 BLEU in low, medium and high regimes on tst-COMMON, corresponding to 27.94, 28.31, 28.43 BLEU in text-to-text task.

2021

pdf bib abs
More is Better: Enhancing Open-Domain Dialogue Generation via Multi-Source Heterogeneous Knowledge
Sixing Wu | Ying Li | Minghui Wang | Dawei Zhang | Yang Zhou | Zhonghai Wu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Despite achieving remarkable performance, previous knowledge-enhanced works usually only use a single-source homogeneous knowledge base of limited knowledge coverage. Thus, they often degenerate into traditional methods because not all dialogues can be linked with knowledge entries. This paper proposes a novel dialogue generation model, MSKE-Dialog, to solve this issue with three unique advantages: (1) Rather than only one, MSKE-Dialog can simultaneously leverage multiple heterogeneous knowledge sources (it includes but is not limited to commonsense knowledge facts, text knowledge, infobox knowledge) to improve the knowledge coverage; (2) To avoid the topic conflict among the context and different knowledge sources, we propose a Multi-Reference Selection to better select context/knowledge; (3) We propose a Multi-Reference Generation to generate informative responses by referring to multiple generation references at the same time. Extensive evaluations on a Chinese dataset show the superior performance of this work against various state-of-the-art approaches. To our best knowledge, this work is the first to use the multi-source heterogeneous knowledge in the open-domain knowledge-enhanced dialogue generation.

Recent literatures have shown that knowledge graph (KG) learning models are highly vulnerable to adversarial attacks. However, there is still a paucity of vulnerability analyses of cross-lingual entity alignment under adversarial attacks. This paper proposes an adversarial attack model with two novel attack techniques to perturb the KG structure and degrade the quality of deep cross-lingual entity alignment. First, an entity density maximization method is employed to hide the attacked entities in dense regions in two KGs, such that the derived perturbations are unnoticeable. Second, an attack signal amplification method is developed to reduce the gradient vanishing issues in the process of adversarial attacks for further improving the attack effectiveness.

2020

pdf bib abs
A Probabilistic Model with Commonsense Constraints for Pattern-based Temporal Fact Extraction
Yang Zhou | Tong Zhao | Meng Jiang
Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER)

Textual patterns (e.g., Country’s president Person) are specified and/or generated for extracting factual information from unstructured data. Pattern-based information extraction methods have been recognized for their efficiency and transferability. However, not every pattern is reliable: A major challenge is to derive the most complete and accurate facts from diverse and sometimes conflicting extractions. In this work, we propose a probabilistic graphical model which formulates fact extraction in a generative process. It automatically infers true facts and pattern reliability without any supervision. It has two novel designs specially for temporal facts: (1) it models pattern reliability on two types of time signals, including temporal tag in text and text generation time; (2) it models commonsense constraints as observable variables. Experimental results demonstrate that our model significantly outperforms existing methods on extracting true temporal facts from news data.

pdf bib abs
Diverse and Informative Dialogue Generation with Context-Specific Commonsense Knowledge Awareness
Sixing Wu | Ying Li | Dawei Zhang | Yang Zhou | Zhonghai Wu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Generative dialogue systems tend to produce generic responses, which often leads to boring conversations. For alleviating this issue, Recent studies proposed to retrieve and introduce knowledge facts from knowledge graphs. While this paradigm works to a certain extent, it usually retrieves knowledge facts only based on the entity word itself, without considering the specific dialogue context. Thus, the introduction of the context-irrelevant knowledge facts can impact the quality of generations. To this end, this paper proposes a novel commonsense knowledge-aware dialogue generation model, ConKADI. We design a Felicitous Fact mechanism to help the model focus on the knowledge facts that are highly relevant to the context; furthermore, two techniques, Context-Knowledge Fusion and Flexible Mode Fusion are proposed to facilitate the integration of the knowledge in the ConKADI. We collect and build a large-scale Chinese dataset aligned with the commonsense knowledge for dialogue generation. Extensive evaluations over both an open-released English dataset and our Chinese dataset demonstrate that our approach ConKADI outperforms the state-of-the-art approach CCM, in most experiments.