Jiliang Tang


pdf bib
Evaluating and Mitigating Inherent Linguistic Bias of African American English through Inference
Jamell Dacon | Haochen Liu | Jiliang Tang
Proceedings of the 29th International Conference on Computational Linguistics

Recent studies show that NLP models trained on standard English texts tend to produce biased outcomes against underrepresented English varieties. In this work, we conduct a pioneering study of the English variety use of African American English (AAE) in NLI task. First, we propose CodeSwitch, a greedy unidirectional morphosyntactically-informed rule-based translation method for data augmentation. Next, we use CodeSwitch to present a preliminary study to determine if demographic language features do in fact influence models to produce false predictions. Then, we conduct experiments on two popular datasets and propose two simple, yet effective and generalizable debiasing methods. Our findings show that NLI models (e.g. BERT) trained under our proposed frameworks outperform traditional large language models while maintaining or even improving the prediction performance. In addition, we intend to release CodeSwitch, in hopes of promoting dialectal language diversity in training data to both reduce the discriminatory societal impacts and improve model robustness of downstream NLP tasks.

pdf bib
Toward Annotator Group Bias in Crowdsourcing
Haochen Liu | Joseph Thekinen | Sinem Mollaoglu | Da Tang | Ji Yang | Youlong Cheng | Hui Liu | Jiliang Tang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Crowdsourcing has emerged as a popular approach for collecting annotated data to train supervised machine learning models. However, annotator bias can lead to defective annotations. Though there are a few works investigating individual annotator bias, the group effects in annotators are largely overlooked. In this work, we reveal that annotators within the same demographic group tend to show consistent group bias in annotation tasks and thus we conduct an initial study on annotator group bias. We first empirically verify the existence of annotator group bias in various real-world crowdsourcing datasets. Then, we develop a novel probabilistic graphical framework GroupAnno to capture annotator group bias with an extended Expectation Maximization (EM) algorithm. We conduct experiments on both synthetic and real-world datasets. Experimental results demonstrate the effectiveness of our model in modeling annotator group bias in label aggregation and model learning over competitive baselines.


pdf bib
The Authors Matter: Understanding and Mitigating Implicit Bias in Deep Text Classification
Haochen Liu | Wei Jin | Hamid Karimi | Zitao Liu | Jiliang Tang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021


pdf bib
Personalized Multimodal Feedback Generation in Education
Haochen Liu | Zitao Liu | Zhongqin Wu | Jiliang Tang
Proceedings of the 28th International Conference on Computational Linguistics

The automatic feedback of school assignments is an important application of AI in education. In this work, we focus on the task of personalized multimodal feedback generation, which aims to generate personalized feedback for teachers to evaluate students’ assignments involving multimodal inputs such as images, audios, and texts. This task involves the representation and fusion of multimodal information and natural language generation, which presents the challenges from three aspects: (1) how to encode and integrate multimodal inputs; (2) how to generate feedback specific to each modality; and (3) how to fulfill personalized feedback generation. In this paper, we propose a novel Personalized Multimodal Feedback Generation Network (PMFGN) armed with a modality gate mechanism and a personalized bias mechanism to address these challenges. Extensive experiments on real-world K-12 education data show that our model significantly outperforms baselines by generating more accurate and diverse feedback. In addition, detailed ablation experiments are conducted to deepen our understanding of the proposed framework.

pdf bib
Does Gender Matter? Towards Fairness in Dialogue Systems
Haochen Liu | Jamell Dacon | Wenqi Fan | Hui Liu | Zitao Liu | Jiliang Tang
Proceedings of the 28th International Conference on Computational Linguistics

Recently there are increasing concerns about the fairness of Artificial Intelligence (AI) in real-world applications such as computer vision and recommendations. For example, recognition algorithms in computer vision are unfair to black people such as poorly detecting their faces and inappropriately identifying them as “gorillas”. As one crucial application of AI, dialogue systems have been extensively applied in our society. They are usually built with real human conversational data; thus they could inherit some fairness issues which are held in the real world. However, the fairness of dialogue systems has not been well investigated. In this paper, we perform a pioneering study about the fairness issues in dialogue systems. In particular, we construct a benchmark dataset and propose quantitative measures to understand fairness in dialogue models. Our studies demonstrate that popular dialogue models show significant prejudice towards different genders and races. Besides, to mitigate the bias in dialogue systems, we propose two simple but effective debiasing methods. Experiments show that our methods can reduce the bias in dialogue systems significantly. The dataset and the implementation are released to foster fairness research in dialogue systems.

pdf bib
Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning
Haochen Liu | Wentao Wang | Yiqi Wang | Hui Liu | Zitao Liu | Jiliang Tang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Dialogue systems play an increasingly important role in various aspects of our daily life. It is evident from recent research that dialogue systems trained on human conversation data are biased. In particular, they can produce responses that reflect people’s gender prejudice. Many debiasing methods have been developed for various NLP tasks, such as word embedding. However, they are not directly applicable to dialogue systems because they are likely to force dialogue models to generate similar responses for different genders. This greatly degrades the diversity of the generated responses and immensely hurts the performance of the dialogue models. In this paper, we propose a novel adversarial learning framework Debiased-Chat to train dialogue models free from gender bias while keeping their performance. Extensive experiments on two real-world conversation datasets show that our framework significantly reduces gender bias in dialogue models while maintaining the response quality.


pdf bib
Learning Hierarchical Discourse-level Structure for Fake News Detection
Hamid Karimi | Jiliang Tang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

On the one hand, nowadays, fake news articles are easily propagated through various online media platforms and have become a grand threat to the trustworthiness of information. On the other hand, our understanding of the language of fake news is still minimal. Incorporating hierarchical discourse-level structure of fake and real news articles is one crucial step toward a better understanding of how these articles are structured. Nevertheless, this has rarely been investigated in the fake news detection domain and faces tremendous challenges. First, existing methods for capturing discourse-level structure rely on annotated corpora which are not available for fake news datasets. Second, how to extract out useful information from such discovered structures is another challenge. To address these challenges, we propose Hierarchical Discourse-level Structure for Fake news detection. HDSF learns and constructs a discourse-level structure for fake/real news articles in an automated and data-driven manner. Moreover, we identify insightful structure-related properties, which can explain the discovered structures and boost our understating of fake news. Conducted experiments show the effectiveness of the proposed approach. Further structural analysis suggests that real and fake news present substantial differences in the hierarchical discourse-level structures.


pdf bib
Multi-Source Multi-Class Fake News Detection
Hamid Karimi | Proteek Roy | Sari Saba-Sadiya | Jiliang Tang
Proceedings of the 27th International Conference on Computational Linguistics

Fake news spreading through media outlets poses a real threat to the trustworthiness of information and detecting fake news has attracted increasing attention in recent years. Fake news is typically written intentionally to mislead readers, which determines that fake news detection merely based on news content is tremendously challenging. Meanwhile, fake news could contain true evidence to mock true news and presents different degrees of fakeness, which further exacerbates the detection difficulty. On the other hand, the spread of fake news produces various types of data from different perspectives. These multiple sources provide rich contextual information about fake news and offer unprecedented opportunities for advanced fake news detection. In this paper, we study fake news detection with different degrees of fakeness by integrating multiple sources. In particular, we introduce approaches to combine information from multiple sources and to discriminate between different degrees of fakeness, and propose a Multi-source Multi-class Fake news Detection framework MMFD, which combines automated feature extraction, multi-source fusion and automated degrees of fakeness detection into a coherent and interpretable model. Experimental results on the real-world data demonstrate the effectiveness of the proposed framework and extensive experiments are further conducted to understand the working of the proposed framework.