Hoang-Quoc Nguyen-Son

2026

SearchLLM: Detecting LLM Paraphrased Text by Measuring the Similarity with Regeneration of the Candidate Source via Search Engine
Hoang-Quoc Nguyen-Son | Minh-Son Dao | Koji Zettsu
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

With the advent of large language models (LLMs), it has become common practice for users to draft text and utilize LLMs to enhance its quality through paraphrasing. However, this process can sometimes result in the loss or distortion of the original intended meaning. Due to the human-like quality of LLM-generated text, traditional detection methods often fail, particularly when text is paraphrased to closely mimic original content. In response to these challenges, we propose a novel approach named SearchLLM, designed to identify LLM-paraphrased text by leveraging search engine capabilities to locate potential original text sources. By analyzing similarities between the input and regenerated versions of candidate sources, SearchLLM effectively distinguishes LLM-paraphrased content. SearchLLM is designed as a proxy layer, allowing seamless integration with existing detectors to enhance their performance. Experimental results across various LLMs demonstrate that SearchLLM consistently enhances the accuracy of recent detectors in detecting LLM-paraphrased text that closely mimics original content. Furthermore, SearchLLM also helps the detectors prevent paraphrasing attacks.

2024

pdf bib abs

SimLLM: Detecting Sentences Generated by Large Language Models Using Similarity between the Generation and its Re-generation
Hoang-Quoc Nguyen-Son | Minh-Son Dao | Koji Zettsu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Large language models have emerged as a significant phenomenon due to their ability to produce natural text across various applications. However, the proliferation of generated text raises concerns regarding its potential misuse in fraudulent activities such as academic dishonesty, spam dissemination, and misinformation propagation. Prior studies have detected the generation of non-analogous text, which manifests numerous differences between original and generated text. We have observed that the similarity between the original text and its generation is notably higher than that between the generated text and its subsequent regeneration. To address this, we propose a novel approach named SimLLM, aimed at estimating the similarity between an input sentence and its generated counterpart to detect analogous machine-generated sentences that closely mimic human-written ones. Our empirical analysis demonstrates SimLLM’s superior performance compared to existing methods.

2023

pdf bib abs

VoteTRANS: Detecting Adversarial Text without Training by Voting on Hard Labels of Transformations
Hoang-Quoc Nguyen-Son | Seira Hidano | Kazuhide Fukushima | Shinsaku Kiyomoto | Isao Echizen
Findings of the Association for Computational Linguistics: ACL 2023

Adversarial attacks reveal serious flaws in deep learning models. More dangerously, these attacks preserve the original meaning and escape human recognition. Existing methods for detecting these attacks need to be trained using original/adversarial data. In this paper, we propose detection without training by voting on hard labels from predictions of transformations, namely, VoteTRANS. Specifically, VoteTRANS detects adversarial text by comparing the hard labels of input text and its transformation. The evaluation demonstrates that VoteTRANS effectively detects adversarial text across various state-of-the-art attacks, models, and datasets.

2022

pdf bib abs

CheckHARD: Checking Hard Labels for Adversarial Text Detection, Prediction Correction, and Perturbed Word Suggestion
Hoang-Quoc Nguyen-Son | Huy Quang Ung | Seira Hidano | Kazuhide Fukushima | Shinsaku Kiyomoto
Findings of the Association for Computational Linguistics: EMNLP 2022

An adversarial attack generates harmful text that fools a target model. More dangerously, this text is unrecognizable by humans. Existing work detects adversarial text and corrects a target’s prediction by identifying perturbed words and changing them into their synonyms, but many benign words are also changed. In this paper, we directly detect adversarial text, correct the prediction, and suggest perturbed words by checking the change in the hard labels from the target’s predictions after replacing a word with its transformation using a model that we call CheckHARD. The experiments demonstrate that CheckHARD outperforms existing work on various attacks, models, and datasets.

2021

pdf bib

SEPP: Similarity Estimation of Predicted Probabilities for Defending and Detecting Adversarial Text
Hoang-Quoc Nguyen-Son | Seira Hidano | Kazuhide Fukushima | Shinsaku Kiyomoto
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

pdf bib abs

Machine Translated Text Detection Through Text Similarity with Round-Trip Translation
Hoang-Quoc Nguyen-Son | Tran Thao | Seira Hidano | Ishita Gupta | Shinsaku Kiyomoto
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Translated texts have been used for malicious purposes, i.e., plagiarism or fake reviews. Existing detectors have been built around a specific translator (e.g., Google) but fail to detect a translated text from a strange translator. If we use the same translator, the translated text is similar to its round-trip translation, which is when text is translated into another language and translated back into the original language. However, a round-trip translated text is significantly different from the original text or a translated text using a strange translator. Hence, we propose a detector using text similarity with round-trip translation (TSRT). TSRT achieves 86.9% accuracy in detecting a translated text from a strange translator. It outperforms existing detectors (77.9%) and human recognition (53.3%).

2019

pdf bib abs

Detecting Machine-Translated Text using Back Translation
Hoang-Quoc Nguyen-Son | Thao Tran Phuong | Seira Hidano | Shinsaku Kiyomoto
Proceedings of the 12th International Conference on Natural Language Generation

Machine-translated text plays a crucial role in the communication of people using different languages. However, adversaries can use such text for malicious purposes such as plagiarism and fake review. The existing methods detected a machine-translated text only using the text’s intrinsic content, but they are unsuitable for classifying the machine-translated and human-written texts with the same meanings. We have proposed a method to extract features used to distinguish machine/human text based on the similarity between the intrinsic text and its back-translation. The evaluation of detecting translated sentences with French shows that our method achieves 75.0% of both accuracy and F-score. It outperforms the existing methods whose the best accuracy is 62.8% and the F-score is 62.7%. The proposed method even detects more efficiently the back-translated text with 83.4% of accuracy, which is higher than 66.7% of the best previous accuracy. We also achieve similar results not only with F-score but also with similar experiments related to Japanese. Moreover, we prove that our detector can recognize both machine-translated and machine-back-translated texts without the language information which is used to generate these machine texts. It demonstrates the persistence of our method in various applications in both low- and rich-resource languages.