Martín Santillán Cooper
Also published as: Martin Santillan Cooper
2024
Human-Centered Design Recommendations for LLM-as-a-judge
Qian Pan | Zahra Ashktorab | Michael Desmond | Martín Santillán Cooper | James Johnson | Rahul Nair | Elizabeth Daly | Werner Geyer
Proceedings of the 1st Human-Centered Large Language Modeling Workshop
Traditional reference-based metrics, such as BLEU and ROUGE, are less effective for assessing outputs from Large Language Models (LLMs) that produce highly creative or superior-quality text, or in situations where reference outputs are unavailable. While human evaluation remains an option, it is costly and difficult to scale. Recent work using LLMs as evaluators (LLM-as-a-judge) is promising, but trust and reliability remain a significant concern. Integrating human input is crucial to ensure that the evaluation criteria are aligned with the human’s intent, and that evaluations are robust and consistent. This paper presents a user study of a design exploration called EvaluLLM that enables users to leverage LLMs as customizable judges, promoting human involvement to balance trust and cost-saving potential with caution. Through interviews with eight domain experts, we identified the need for assistance in developing effective evaluation criteria that align the LLM-as-a-judge with practitioners’ preferences and expectations. We offer findings and design recommendations to optimize human-assisted LLM-as-a-judge systems.
2022
Label Sleuth: From Unlabeled Text to a Classifier in a Few Hours
Eyal Shnarch | Alon Halfon | Ariel Gera | Marina Danilevsky | Yannis Katsis | Leshem Choshen | Martin Santillan Cooper | Dina Epelboim | Zheng Zhang | Dakuo Wang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Label Sleuth is an open source platform for building text classifiers which does not require coding skills nor machine learning knowledge.

- Project website: [https://www.label-sleuth.org/](https://www.label-sleuth.org/)
- Link to screencast video: [https://vimeo.com/735675461](https://vimeo.com/735675461)

### Abstract

Text classification can be useful in many real-world scenarios, saving a lot of time for end users. However, building a classifier generally requires coding skills and ML knowledge, which poses a significant barrier for many potential users. To lift this barrier we introduce *Label Sleuth*, a free open source system for labeling and creating text classifiers. This system is unique for:

- being a no-code system, making NLP accessible for non-experts.
- guiding its users throughout the entire labeling process until they obtain their desired classifier, making the process efficient - from cold start to a classifier in a few hours.
- being open for configuration and extension by developers.

By open sourcing Label Sleuth we hope to build a community of users and developers that will widen the utilization of NLP models.