Title: IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages
Authors: Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad B, Varun G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M Khapra
Date: 2024-08
Type: text (conference publication)
Published in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Place: Bangkok, Thailand
Pages: 15831-15879
Identifier: khan-etal-2024-indicllmsuite
DOI: 10.18653/v1/2024.acl-long.843
URL: https://aclanthology.org/2024.acl-long.843/