Dongji Feng
2023
Zero-Shot Multi-Label Topic Inference with Sentence Encoders and LLMs
Souvika Sarkar
|
Dongji Feng
|
Shubhra Kanti Karmaker Santu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
In this paper, we conducted a comprehensive study with the latest Sentence Encoders and Large Language Models (LLMs) on the challenging task of “definition-wild zero-shot topic inference”, where users define or provide the topics of interest in real-time. Through extensive experimentation on seven diverse data sets, we observed that LLMs, such as ChatGPT-3.5 and PaLM, demonstrated superior generality compared to other LLMs, e.g., BLOOM and GPT-NeoX. Furthermore, Sentence-BERT, a BERT-based classical sentence encoder, outperformed PaLM and achieved performance comparable to ChatGPT-3.5.
TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks
Shubhra Kanti Karmaker Santu
|
Dongji Feng
Findings of the Association for Computational Linguistics: EMNLP 2023
While LLMs have shown great success in understanding and generating text in traditional conversational settings, their potential for performing ill-defined complex tasks is largely under-studied and yet to be benchmarked. However, conducting such benchmarking studies is challenging because of the large variations in LLMs’ performance when different prompt types/styles are used and different degrees of detail are provided in the prompts. To address this issue, this paper proposes a general taxonomy that can be used to design prompts with specific properties in order to perform a wide range of complex tasks. This taxonomy will allow future benchmarking studies to report the specific categories of prompts used as part of the study, enabling meaningful comparisons across different studies. Also, by establishing a common standard through this taxonomy, researchers will be able to draw more accurate conclusions about LLMs’ performance on a specific complex task.
2022
Exploring Universal Sentence Encoders for Zero-shot Text Classification
Souvika Sarkar
|
Dongji Feng
|
Shubhra Kanti Karmaker Santu
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Universal Sentence Encoder (USE) has gained much popularity recently as a general-purpose sentence encoding technique. As the name suggests, USE is designed to be fairly general and has indeed been shown to achieve superior performances for many downstream NLP tasks. In this paper, we present an interesting “negative” result on USE in the context of zero-shot text classification, a challenging task, which has recently gained much attraction. More specifically, we found some interesting cases of zero-shot text classification, where topic based inference outperformed USE-based inference in terms of F1 score. Further investigation revealed that USE struggles to perform well on data-sets with a large number of labels with high semantic overlaps, while topic-based classification works well for the same.