What Happens to a Dataset Transformed by a Projection-based Concept Removal Method?

Richard Johansson

Abstract

We investigate the behavior of methods using linear projections to remove information about a concept from a language representation, and we consider the question of what happens to a dataset transformed by such a method. A theoretical analysis and experiments on real-world and synthetic data show that these methods inject strong statistical dependencies into the transformed datasets. After applying such a method, the representation space is highly structured: in the transformed space, an instance tends to be located near instances of the opposite label. As a consequence, the original labeling can in some cases be reconstructed by applying an anti-clustering method.
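One way to probe the neighborhood claim in the abstract is to remove a single linear direction from a dataset and then measure how often an instance's nearest neighbor carries the opposite label. The following is a minimal sketch under stated assumptions, not the paper's experimental setup: the data are synthetic Gaussian vectors with random labels, the removed direction is fitted by ordinary least squares (a stand-in chosen here for illustration), and the helper names `project_out` and `opposite_label_nn_rate` are hypothetical. Whether and how much the rate changes depends on the removal method, the number of removed directions, and the ratio of dimensions to instances, so no particular output is claimed.

```python
import numpy as np

def project_out(X, u):
    """Remove the component of every row of X along the direction u."""
    u = u / np.linalg.norm(u)
    return X - np.outer(X @ u, u)

def opposite_label_nn_rate(X, y):
    """Fraction of instances whose Euclidean nearest neighbor has a different label."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                      # an instance is not its own neighbor
    nn = np.argmin(d2, axis=1)
    return float(np.mean(y[nn] != y))

# Synthetic data (assumption): features carry no real information about the labels.
rng = np.random.default_rng(0)
n, d = 200, 100
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)

# Fit a linear direction that predicts the labels from the centered features.
# Ordinary least squares is an illustrative choice, not the method studied in the paper.
Xc = X - X.mean(axis=0)
t = 2.0 * y - 1.0
w, *_ = np.linalg.lstsq(Xc, t - t.mean(), rcond=None)

# Project the fitted direction out of the representation.
X_proj = project_out(Xc, w)

before = opposite_label_nn_rate(Xc, y)
after = opposite_label_nn_rate(X_proj, y)
print(f"opposite-label nearest-neighbor rate before: {before:.3f}, after: {after:.3f}")
```

Comparing `before` and `after` over several seeds, and over repeated projection rounds, gives a rough picture of how label-dependent structure can appear in the transformed space.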

Anthology ID: 2024.lrec-main.1520
Volume: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month: May
Year: 2024
Address: Torino, Italia
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues: LREC | COLING
Publisher: ELRA and ICCL
Pages: 17486–17492
URL: https://aclanthology.org/2024.lrec-main.1520/
Cite (ACL): Richard Johansson. 2024. What Happens to a Dataset Transformed by a Projection-based Concept Removal Method? In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 17486–17492, Torino, Italia. ELRA and ICCL.
Cite (Informal): What Happens to a Dataset Transformed by a Projection-based Concept Removal Method? (Johansson, LREC-COLING 2024)
PDF: https://aclanthology.org/2024.lrec-main.1520.pdf
