Mitchell L Gordon

2024

StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements
Jillian Fisher | Skyler Hallinan | Ximing Lu | Mitchell L Gordon | Zaid Harchaoui | Yejin Choi
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Authorship obfuscation, rewriting a text to intentionally obscure the identity of the author, is important yet challenging. Current methods using large language models (LLMs) lack interpretability and controllability, often ignoring author-specific stylistic features, resulting in less robust performance overall. To address this, we develop StyleRemix, an adaptive and interpretable obfuscation method that perturbs specific, fine-grained style elements of the original input text. StyleRemix uses pre-trained Low Rank Adaptation (LoRA) modules to rewrite inputs along various stylistic axes (e.g., formality, length) while maintaining low computational costs. StyleRemix outperforms state-of-the-art baselines and much larger LLMs on an array of domains on both automatic and human evaluation. Additionally, we release AuthorMix, a large set of 30K high-quality, long-form texts from a diverse set of 14 authors and 4 domains, and DiSC, a parallel corpus of 1,500 texts spanning seven style axes in 16 unique directions.

