Alexander Tuzhilin


2025

pdf bib
SENTRA: Selected-Next-Token Transformer for LLM Text Detection
Mitchell Plyler | Yilun Zhang | Alexander Tuzhilin | Saoud Khalifah | Sen Tian
Findings of the Association for Computational Linguistics: EMNLP 2025

LLMs are becoming increasingly capable and widespread. Consequently, the potential and reality of their misuse is also growing. In this work, we address the problem of detecting LLM-generated text that is not explicitly declared as such. We present a novel, general-purpose, and supervised LLM text detector, SElected-Next-Token tRAnsformer (SENTRA). SENTRA is a Transformer-based encoder leveraging selected-next-token-probability sequences and utilizing contrastive pre-training on large amounts of unlabeled data. Our experiments on three popular public datasets across 24 domains of text demonstrate SENTRA is a general-purpose classifier that significantly outperforms popular baselines in the out-of-domain setting.

2019

pdf bib
Towards Controllable and Personalized Review Generation
Pan Li | Alexander Tuzhilin
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In this paper, we propose a novel model RevGAN that automatically generates controllable and personalized user reviews based on the arbitrarily given sentimental and stylistic information. RevGAN utilizes the combination of three novel components, including self-attentive recursive autoencoders, conditional discriminators, and personalized decoders. We test its performance on the several real-world datasets, where our model significantly outperforms state-of-the-art generation models in terms of sentence quality, coherence, personalization, and human evaluations. We also empirically show that the generated reviews could not be easily distinguished from the organically produced reviews and that they follow the same statistical linguistics laws.