Counting-Stars: A Multi-evidence, Position-aware, and Scalable Benchmark for Evaluating Long-Context Large Language Models

Counting-Stars: A Multi-evidence, Position-aware, and Scalable Benchmark for Evaluating Long-Context Large Language Models Mingyang Song author Mao Zheng author Xuan Luo author 2025-01 text Proceedings of the 31st International Conference on Computational Linguistics Owen Rambow editor Leo Wanner editor Marianna Apidianaki editor Hend Al-Khalifa editor Barbara Di Eugenio editor Steven Schockaert editor Association for Computational Linguistics Abu Dhabi, UAE conference publication song-etal-2025-counting https://aclanthology.org/2025.coling-main.253/ 2025-01 3753 3763