Authorship obfuscation techniques hold the promise of helping people protect their privacy in online communications by automatically rewriting text to hide the identity of the original author. However, obfuscation has been evaluated in narrow settings in the NLP literature and has primarily been addressed with superficial edit operations that can lead to unnatural outputs. In this work, we introduce an automatic text privatization framework that fine-tunes a large language model via reinforcement learning to produce rewrites that balance soundness, sense, and privacy. We evaluate it extensively on a large-scale test set of short- to medium-length English Reddit posts by 68k authors. We study how performance varies across evaluation conditions, including authorial profile length and authorship detection strategy. Our method maintains high text quality according to both automated metrics and human evaluation, and successfully evades several automated authorship attacks.
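To make the idea of balancing soundness, sense, and privacy concrete, the sketch below folds three per-rewrite scores into a single scalar reward. The abstract does not say how the three objectives are combined, so the weighted average, the `RewardWeights` class, and the `combined_reward` function are all hypothetical; the three scores are assumed to come from external scorers and to lie in [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical relative importance of each objective (not from the paper)."""
    soundness: float = 1.0
    sense: float = 1.0
    privacy: float = 1.0


def combined_reward(soundness: float, sense: float, privacy: float,
                    weights: RewardWeights = RewardWeights()) -> float:
    """Return the weighted average of three scores, each expected in [0, 1].

    This is an illustrative assumption, not the paper's reward function.
    """
    scores = (("soundness", soundness), ("sense", sense), ("privacy", privacy))
    for name, value in scores:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score must be in [0, 1], got {value}")
    total = weights.soundness + weights.sense + weights.privacy
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = (weights.soundness * soundness
                + weights.sense * sense
                + weights.privacy * privacy)
    return weighted / total
```

With equal weights, a fluent rewrite that barely hides its author is penalized as much as a private but garbled one; raising the privacy weight shifts that trade-off toward evasion.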
As the quality of AI-generated text increases with the development of new Large Language Models, people increasingly use these models to write in a variety of contexts. Human-AI collaborative writing poses a potential challenge for existing AI analysis techniques, which have been primarily tested either on human-written text only, or on samples independently generated by humans and AI. In this work, we investigate the extent to which existing AI detection and authorship analysis models can perform classification on data generated in human-AI collaborative writing sessions. Results show that, for AI text detection in the co-writing setting, classifiers based on authorship embeddings (Rivera-Soto et al., 2021) outperform classifiers used in prior work to distinguish independently generated AI and human text. However, these embeddings are not optimal for finer-grained authorship identification tasks: for authorship verification, n-gram-based models are more robust to human-AI co-written text, and authorship attribution performance degrades compared to baselines that use human-written text only. Taken together, these results suggest that the rise of human-AI co-written text will require adapting AI detection tools and authorship analysis techniques in the near future. We release our code at https://github.com/AARichburg/Human-AI_Authorship_Analysis.
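As a rough illustration of what an n-gram-based authorship verifier can look like, the sketch below compares two texts by the cosine similarity of their character trigram counts. It is not the model evaluated in the paper: the function names, the trigram size, and the 0.5 decision threshold are assumptions chosen for brevity, and a real system would tune the threshold on held-out data.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_author(text_a: str, text_b: str, n: int = 3, threshold: float = 0.5) -> bool:
    """Hypothetical verification rule: similar n-gram profiles imply the same author."""
    return cosine_similarity(char_ngrams(text_a, n), char_ngrams(text_b, n)) >= threshold
```

Surface-level features of this kind depend on spelling and punctuation habits rather than on a learned representation of style.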
This paper describes the University of Maryland’s submissions to the WMT20 Shared Task on Chat Translation. We focus on translating agent-side utterances from English to German. We start from an off-the-shelf BPE-based standard Transformer model trained on WMT17 news data and fine-tune it on the provided in-domain training data. In addition, we augment the training set with its best matches in the WMT19 news dataset. Our primary submission uses a standard Transformer, while our contrastive submissions use multi-encoder Transformers to attend to previous utterances. Our primary submission achieves 56.7 BLEU on the agent side (en→de), outperforming a baseline system provided by the task organizers by more than 13 BLEU points. Moreover, according to an evaluation on a set of carefully designed examples, the multi-encoder architecture is able to generate more coherent translations.
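The augmentation step, which adds the best out-of-domain matches for the in-domain data, can be sketched as a nearest-neighbour lookup. The abstract does not say how "best matches" are scored, so the token-level Jaccard overlap used here is only a stand-in, and `tokenize`, `jaccard`, `best_matches`, and the `min_score` cutoff are hypothetical names introduced for illustration.

```python
def tokenize(sentence: str) -> set:
    """Lowercased whitespace tokens as a set (a deliberately crude tokenizer)."""
    return set(sentence.lower().split())


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two token sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_matches(in_domain: list, pool: list, min_score: float = 0.0) -> list:
    """For each in-domain sentence, return (query, best pool sentence, score).

    Queries whose best score does not exceed min_score are skipped.
    The similarity measure is an assumption, not the paper's method.
    """
    pool_tokens = [(sentence, tokenize(sentence)) for sentence in pool]
    matches = []
    for query in in_domain:
        query_tokens = tokenize(query)
        best_sentence, best_score = None, min_score
        for sentence, tokens in pool_tokens:
            score = jaccard(query_tokens, tokens)
            if score > best_score:
                best_sentence, best_score = sentence, score
        if best_sentence is not None:
            matches.append((query, best_sentence, best_score))
    return matches
```

In a translation setting the matched source sentences would be added together with their reference translations; the exhaustive scan is quadratic, so a corpus the size of a news dataset would call for an index rather than this loop.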