Charaka Vinayak Kumar
2025
No Size Fits All: The Perils and Pitfalls of Leveraging LLMs Vary with Company Size
Ashok Urlana
|
Charaka Vinayak Kumar
|
Bala Mallikarjunarao Garlapati
|
Ajeet Kumar Singh
|
Rahul Mishra
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Large language models (LLMs) are playing a pivotal role in deploying strategic use cases across a range of organizations, from large pan-continental companies to emerging startups. The issues and challenges involved in the successful utilization of LLMs can vary significantly depending on the size of the organization. It is important to study and discuss these pertinent issues of LLM adaptation with a focus on the scale of the industrial concerns and brainstorm possible solutions and prospective directions. Such a study has not been prominently featured in the current research literature. In this study, we adopt a threefold strategy: first, we conduct a case study with industry practitioners to formulate the key research questions; second, we examine existing industrial publications to address these questions; and finally, we provide a practical guide for industries to utilize LLMs more efficiently. We release the GitHub repository with the most recent papers in the field.
2024
TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques
Ashok Urlana
|
Aditya Saibewar
|
Bala Mallikarjunarao Garlapati
|
Charaka Vinayak Kumar
|
Ajeet Singh
|
Srinivasa Rao Chalamala
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
The Large Language Models (LLMs) exhibit remarkable ability to generate fluent content across a wide spectrum of user queries. However, this capability has raised concerns regarding misinformation and personal information leakage. In this paper, we present our methods for the SemEval2024 Task8, aiming to detect machine-generated text across various domains in both mono-lingual and multi-lingual contexts. Our study comprehensively analyzes various methods to detect machine-generated text, including statistical, neural, and pre-trained model approaches. We also detail our experimental setup and perform a in-depth error analysis to evaluate the effectiveness of these methods. Our methods obtain an accuracy of 86.9% on the test set of subtask-A mono and 83.7% for subtask-B. Furthermore, we also highlight the challenges and essential factors for consideration in future studies.