2024
pdf
bib
abs
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
Mervat Abassy
|
Kareem Elozeiri
|
Alexander Aziz
|
Minh Ngoc Ta
|
Raj Vardhan Tomar
|
Bimarsha Adhikari
|
Saad El Dine Ahmed
|
Yuxia Wang
|
Osama Mohammed Afzal
|
Zhuohan Xie
|
Jonibek Mansurov
|
Ekaterina Artemova
|
Vladislav Mikhailov
|
Rui Xing
|
Jiahui Geng
|
Hasan Iqbal
|
Zain Muhammad Mujahid
|
Tarek Mahmoud
|
Akim Tsvigun
|
Alham Fikri Aji
|
Artem Shelmanov
|
Nizar Habash
|
Iryna Gurevych
|
Preslav Nakov
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
The ease of access to large language models (LLMs) has enabled a widespread of machine-generated texts, and now it is often hard to tell whether a piece of text was human-written or machine-generated. This raises concerns about potential misuse, particularly within educational and academic domains. Thus, it is important to develop practical systems that can automate the process. Here, we present one such system, LLM-DetectAIve, designed for fine-grained detection. Unlike most previous work on machine-generated text detection, which focused on binary classification, LLM-DetectAIve supports four categories: (i) human-written, (ii) machine-generated, (iii) machine-written, then machine-humanized, and (iv) human-written, then machine-polished. Category (iii) aims to detect attempts to obfuscate the fact that a text was machine-generated, while category (iv) looks for cases where the LLM was used to polish a human-written text, which is typically acceptable in academic writing, but not in education. Our experiments show that LLM-DetectAIve can effectively identify the above four categories, which makes it a potentially useful tool in education, academia, and other domains.LLM-DetectAIve is publicly accessible at https://github.com/mbzuai-nlp/LLM-DetectAIve. The video describing our system is available at https://youtu.be/E8eT_bE7k8c.
2022
pdf
bib
abs
Automatic Explanation Generation For Climate Science Claims
Rui Xing
|
Shraey Bhatia
|
Timothy Baldwin
|
Jey Han Lau
Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association
Climate change is an existential threat to humanity, the proliferation of unsubstantiated claims relating to climate science is manipulating public perception, motivating the need for fact-checking in climate science. In this work, we draw on recent work that uses retrieval-augmented generation for veracity prediction and explanation generation, in framing explanation generation as a query-focused multi-document summarization task. We adapt PRIMERA to the climate science domain by adding additional global attention on claims. Through automatic evaluation and qualitative analysis, we demonstrate that our method is effective at generating explanations.
2019
pdf
bib
abs
Distant Supervised Relation Extraction with Separate Head-Tail CNN
Rui Xing
|
Jie Luo
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)
Distant supervised relation extraction is an efficient and effective strategy to find relations between entities in texts. However, it inevitably suffers from mislabeling problem and the noisy data will hinder the performance. In this paper, we propose the Separate Head-Tail Convolution Neural Network (SHTCNN), a novel neural relation extraction framework to alleviate this issue. In this method, we apply separate convolution and pooling to the head and tail entity respectively for extracting better semantic features of sentences, and coarse-to-fine strategy to filter out instances which do not have actual relations in order to alleviate noisy data issues. Experiments on a widely used dataset show that our model achieves significant and consistent improvements in relation extraction compared to statistical and vanilla CNN-based methods.