Jason Lucas


2025

Semantic Captioning: Benchmark Dataset and Graph-Aware Few-Shot In-Context Learning for SQL2Text
Ali Al Lawati | Jason Lucas | Prasenjit Mitra
Proceedings of the 31st International Conference on Computational Linguistics

Large Language Models (LLMs) have demonstrated remarkable performance on various NLP tasks, including semantic parsing, which translates natural language into formal code representations. However, the reverse operation, translating code into natural language, termed semantic captioning, has received less attention. This task is increasingly important as LLMs are integrated into platforms for code generation, security analysis, and educational purposes. In this paper, we focus on the captioning of SQL queries (SQL2Text) to address the critical need for understanding and explaining SQL queries in an era where LLM-generated code poses potential security risks. We repurpose semantic parsing datasets for semantic captioning, specifically SQL2Text. To overcome the limited robustness of Text2SQL datasets for the reverse task, we introduce an iterative ICL prompt leveraging GPT-4o to generate multiple additional utterances. We conduct experiments across multiple in-context learning (ICL) methods, emphasizing smaller, more computationally efficient LLMs. Our findings demonstrate that leveraging the inherent graph properties of SQL for few-shot ICL sample selection significantly outperforms random selection by up to 39% on BLEU score and provides better results than alternative approaches. The dataset and code are publicly accessible.
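The abstract does not detail how SQL's graph properties drive sample selection, but a minimal sketch of the general idea (choosing few-shot demonstrations whose structural footprint overlaps most with the target query, rather than sampling at random) might look like the following Python. The feature extraction and Jaccard ranking here are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch: select few-shot ICL examples for SQL2Text by
# structural similarity to the target query instead of at random.
import re

SQL_KEYWORDS = {"select", "from", "where", "join", "group", "order",
                "having", "union", "limit", "on", "and", "or"}

def structural_features(sql: str) -> set[str]:
    """Crude proxy for a query's graph structure: the SQL keywords it
    uses plus the table/column identifiers it touches."""
    tokens = re.findall(r"[a-zA-Z_][a-zA-Z0-9_]*", sql.lower())
    keywords = {t for t in tokens if t in SQL_KEYWORDS}
    idents = {f"ident:{t}" for t in tokens if t not in SQL_KEYWORDS}
    return keywords | idents

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def select_examples(target_sql: str, pool: list[tuple[str, str]], k: int = 3):
    """Rank candidate (sql, caption) pairs by structural overlap with
    the target query and keep the top k as ICL demonstrations."""
    target = structural_features(target_sql)
    ranked = sorted(pool,
                    key=lambda ex: jaccard(target, structural_features(ex[0])),
                    reverse=True)
    return ranked[:k]
```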

2023

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark
Dominik Macko | Robert Moro | Adaku Uchendu | Jason Lucas | Michiharu Yamashita | Matúš Pikuliak | Ivan Srba | Thai Le | Dongwon Lee | Jakub Simko | Maria Bielikova
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

There is a lack of research into the capabilities of recent LLMs to generate convincing text in languages other than English and into the performance of machine-generated-text detectors in multilingual settings. This is also reflected in the available benchmarks, which lack authentic texts in languages other than English and predominantly cover older generators. To fill this gap, we introduce MULTITuDE, a novel benchmarking dataset for multilingual machine-generated text detection comprising 74,081 authentic and machine-generated texts in 11 languages (ar, ca, cs, de, en, es, nl, pt, ru, uk, and zh) generated by 8 multilingual LLMs. Using this benchmark, we compare the performance of zero-shot (statistical and black-box) and fine-tuned detectors. Considering the multilinguality, we evaluate 1) how these detectors generalize to unseen languages (linguistically similar as well as dissimilar) and unseen LLMs and 2) whether the detectors improve their performance when trained on multiple languages.
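As a rough illustration of the generalization protocol described above (train a detector on one language, score it on unseen ones), a hedged sketch with a simple stand-in statistical detector could look like this; the file name, column names, and the TF-IDF plus logistic-regression detector are assumptions, not the benchmark's actual artifacts.

```python
# Illustrative cross-lingual generalization test: fit a detector on one
# language's texts, then score it on every other language in the data.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Assumed layout: columns text, label (0=human, 1=machine), lang.
df = pd.read_csv("multitude.csv")

train = df[df.lang == "en"]
vec = TfidfVectorizer(max_features=50_000, ngram_range=(1, 2))
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(train.text),
                                            train.label)

for lang, grp in df[df.lang != "en"].groupby("lang"):
    # How well does an en-trained detector transfer to this language?
    scores = clf.predict_proba(vec.transform(grp.text))[:, 1]
    print(f"{lang}: AUC = {roc_auc_score(grp.label, scores):.3f}")
```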

Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation
Jason Lucas | Adaku Uchendu | Michiharu Yamashita | Jooyoung Lee | Shaurya Rohatgi | Dongwon Lee
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

The recent ubiquity and disruptive impact of large language models (LLMs) have raised concerns about their potential to be misused (*i.e., generating large-scale harmful and misleading content*). To combat this emerging risk of LLMs, we propose a novel “***Fighting Fire with Fire***” (F3) strategy that harnesses modern LLMs’ generative and emergent reasoning capabilities to counter human-written and LLM-generated disinformation. First, we leverage GPT-3.5-turbo to synthesize authentic and deceptive LLM-generated content through paraphrase-based and perturbation-based prefix-style prompts, respectively. Second, we apply zero-shot in-context semantic reasoning techniques with cloze-style prompts to discern genuine from deceptive posts and news articles. In our extensive experiments, we observe GPT-3.5-turbo’s zero-shot superiority on both in-distribution and out-of-distribution datasets, where it consistently achieved 68-72% accuracy, unlike the decline observed in previous customized and fine-tuned disinformation detectors. Our codebase and dataset are available at https://github.com/mickeymst/F3.
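A minimal sketch of the second step, zero-shot cloze-style veracity prompting, is shown below, assuming the current OpenAI chat API; the prompt wording and label mapping are illustrative, not the paper's exact templates.

```python
# Hedged sketch of zero-shot cloze-style veracity prompting in the
# spirit of F3; the template below is an assumption, not the paper's.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CLOZE_TEMPLATE = (
    "Read the following post and fill in the blank with exactly one "
    "word, 'true' or 'false'.\n\n"
    "Post: {post}\n\n"
    "The claim made in this post is ____."
)

def classify(post: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,  # deterministic labeling
        messages=[{"role": "user",
                   "content": CLOZE_TEMPLATE.format(post=post)}],
    )
    answer = resp.choices[0].message.content.strip().lower()
    return "deceptive" if "false" in answer else "genuine"
```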

2022

Detecting False Claims in Low-Resource Regions: A Case Study of Caribbean Islands
Jason Lucas | Limeng Cui | Thai Le | Dongwon Lee
Proceedings of the Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situations

The COVID-19 pandemic has created threats to global health control. Misinformation circulated on social media and news outlets has undermined public trust in governments and health agencies. This problem is further exacerbated in developing countries or low-resource regions, where abundant English fact-checking information is not available. In this paper, we make the first attempt to detect COVID-19 misinformation (in English, Spanish, and Haitian French) circulating in the Caribbean region, using fact-checked claims from the US (in English). We first collected a dataset of real and fake Caribbean claims. We then trained several classification and language models on COVID-19 data from high-resource language regions and transferred the knowledge to the Caribbean claim dataset. The experimental results reveal the limitations of current fake claim detection in low-resource regions and encourage further research on multilingual detection.
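A hedged sketch of this transfer setup (fit a classifier on the English US claims over multilingual sentence embeddings, then evaluate it unchanged on the Caribbean claims) might look like this; the file names, label scheme, and embedding model are illustrative assumptions.

```python
# Illustrative cross-lingual transfer: train on English fact-checked
# claims, evaluate zero-shot on multilingual Caribbean claims.
import pandas as pd
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Assumed layout: columns claim, label (0=real, 1=fake).
us = pd.read_csv("us_claims.csv")            # English, high-resource
carib = pd.read_csv("caribbean_claims.csv")  # en/es/ht, mixed

clf = LogisticRegression(max_iter=1000).fit(
    encoder.encode(list(us.claim)), us.label)

# Zero-shot transfer: no Caribbean data is seen during training.
preds = clf.predict(encoder.encode(list(carib.claim)))
print(classification_report(carib.label, preds))
```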