2022
Upstream Mitigation Is Not All You Need: Testing the Bias Transfer Hypothesis in Pre-Trained Language Models
Ryan Steed | Swetasudha Panda | Ari Kobren | Michael Wick
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
A few large, homogenous, pre-trained models undergird many machine learning systems — and often, these models contain harmful stereotypes learned from the internet. We investigate the bias transfer hypothesis: the theory that social biases (such as stereotypes) internalized by large language models during pre-training transfer into harmful task-specific behavior after fine-tuning. For two classification tasks, we find that reducing intrinsic bias with controlled interventions before fine-tuning does little to mitigate the classifier’s discriminatory behavior after fine-tuning. Regression analysis suggests that downstream disparities are better explained by biases in the fine-tuning dataset. Still, pre-training plays a role: simple alterations to co-occurrence rates in the fine-tuning dataset are ineffective when the model has been pre-trained. Our results encourage practitioners to focus more on dataset quality and context-specific harms.
Don’t Just Clean It, Proxy Clean It: Mitigating Bias by Proxy in Pre-Trained Models
Swetasudha Panda | Ari Kobren | Michael Wick | Qinlan Shen
Findings of the Association for Computational Linguistics: EMNLP 2022
Transformer-based pre-trained models are known to encode societal biases not only in their contextual representations, but also in downstream predictions when fine-tuned on task-specific data. We present D-Bias, an approach that selectively eliminates stereotypical associations (e.g., co-occurrence statistics) at fine-tuning, such that the model doesn’t learn to excessively rely on those signals. D-Bias attenuates biases from both identity words and frequently co-occurring proxies, which we select using pointwise mutual information. We apply D-Bias to a) occupation classification, and b) toxicity classification and find that our approach substantially reduces downstream biases (e.g., by > 60% in toxicity classification, for identities that are most frequently flagged as toxic on online platforms). In addition, we show that D-Bias dramatically improves upon scrubbing, i.e., removing only the identity words in question. We also demonstrate that D-Bias easily extends to multiple identities, and achieves competitive performance with two recently proposed debiasing approaches: R-LACE and INLP.
2012
Monte Carlo MCMC: Efficient Inference by Approximate Sampling
Sameer Singh | Michael Wick | Andrew McCallum
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
A Discriminative Hierarchical Model for Fast Coreference at Large Scale
Michael Wick | Sameer Singh | Andrew McCallum
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Human-Machine Cooperation: Supporting User Corrections to Automatically Constructed KBs
Michael Wick | Karl Schultz | Andrew McCallum
Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction (AKBC-WEKEX)
Monte Carlo MCMC: Efficient Inference by Sampling Factors
Sameer Singh | Michael Wick | Andrew McCallum
Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction (AKBC-WEKEX)
2008
A Corpus for Cross-Document Co-reference
David Day | Janet Hitzeman | Michael Wick | Keith Crouch | Massimo Poesio
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
This paper describes a newly created text corpus of news articles that has been annotated for cross-document co-reference. Being able to robustly resolve references to entities across document boundaries will provide a useful capability for a variety of tasks, ranging from practical information retrieval applications to challenging research in information extraction and natural language understanding. This annotated corpus is intended to encourage the development of systems that can more accurately address this problem. A manual annotation tool was developed that allowed the complete corpus to be searched for likely co-referring entity mentions. This corpus of 257K words links mentions of co-referent people, locations and organizations (subject to some additional constraints). Each of the documents had already been annotated for within-document co-reference by the LDC as part of the ACE series of evaluations. The annotation process was bootstrapped with a string-matching-based linking procedure, and we report on some initial experimentation with the data. The cross-document linking information will be made publicly available.
2007
First-Order Probabilistic Models for Coreference Resolution
Aron Culotta | Michael Wick | Andrew McCallum
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference
2006
Learning Field Compatibilities to Extract Database Records from Unstructured Text
Michael Wick | Aron Culotta | Andrew McCallum
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing