Niklas Muennighoff
2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Holy Lovenia | Rahmad Mahendra | Salsabil Maulana Akbar | Lester James Validad Miranda | Jennifer Santoso | Elyanah Aco | Akhdan Fadhilah | Jonibek Mansurov | Joseph Marvin Imperial | Onno P. Kampman | Joel Ruben Antony Moniz | Muhammad Ravi Shulthan Habibi | Frederikus Hudi | Jann Railey Montalan | Ryan Ignatius Hadiwijaya | Joanito Agili Lopo | William Nixon | Börje F. Karlsson | James Jaya | Ryandito Diandaru | Yuze Gao | Patrick Amadeus Irawan | Bin Wang | Jan Christian Blaise Cruz | Chenxi Whitehouse | Ivan Halim Parmonangan | Maria Khelli | Wenyu Zhang | Lucky Susanto | Reynard Adha Ryanda | Sonny Lazuardi Hermawan | Dan John Velasco | Muhammad Dehan Al Kautsar | Willy Fitra Hendria | Yasmin Moslem | Noah Flynn | Muhammad Farid Adilazuarda | Haochen Li | Johanes Lee | R. Damanhuri | Shuo Sun | Muhammad Reza Qorib | Amirbek Djanibekov | Wei Qi Leong | Quyet V. Do | Niklas Muennighoff | Tanrada Pansuwan | Ilham Firdausi Putra | Yan Xu | Tai Ngee Chia | Ayu Purwarianti | Sebastian Ruder | William Chandra Tjhi | Peerat Limkonchotiwat | Alham Fikri Aji | Sedrick Keh | Genta Indra Winata | Ruochen Zhang | Fajri Koto | Zheng Xin Yong | Samuel Cahyawijaya
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
Shivalika Singh | Freddie Vargus | Daniel D’souza | Börje Karlsson | Abinaya Mahendiran | Wei-Yin Ko | Herumb Shandilya | Jay Patel | Deividas Mataciunas | Laura O’Mahony | Mike Zhang | Ramith Hettiarachchi | Joseph Wilson | Marina Machado | Luisa Moura | Dominik Krzemiński | Hakimeh Fadaei | Irem Ergun | Ifeoma Okoh | Aisha Alaagib | Oshan Mudannayake | Zaid Alyafeai | Vu Chien | Sebastian Ruder | Surya Guthikonda | Emad Alghamdi | Sebastian Gehrmann | Niklas Muennighoff | Max Bartolo | Julia Kreutzer | Ahmet Üstün | Marzieh Fadaee | Sara Hooker
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Luca Soldaini | Rodney Kinney | Akshita Bhagia | Dustin Schwenk | David Atkinson | Russell Authur | Ben Bogin | Khyathi Chandu | Jennifer Dumas | Yanai Elazar | Valentin Hofmann | Ananya Jha | Sachin Kumar | Li Lucy | Xinxi Lyu | Nathan Lambert | Ian Magnusson | Jacob Morrison | Niklas Muennighoff | Aakanksha Naik | Crystal Nam | Matthew Peters | Abhilasha Ravichander | Kyle Richardson | Zejiang Shen | Emma Strubell | Nishant Subramani | Oyvind Tafjord | Evan Walsh | Luke Zettlemoyer | Noah Smith | Hannaneh Hajishirzi | Iz Beltagy | Dirk Groeneveld | Jesse Dodge | Kyle Lo
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld | Iz Beltagy | Evan Walsh | Akshita Bhagia | Rodney Kinney | Oyvind Tafjord | Ananya Jha | Hamish Ivison | Ian Magnusson | Yizhong Wang | Shane Arora | David Atkinson | Russell Authur | Khyathi Chandu | Arman Cohan | Jennifer Dumas | Yanai Elazar | Yuling Gu | Jack Hessel | Tushar Khot | William Merrill | Jacob Morrison | Niklas Muennighoff | Aakanksha Naik | Crystal Nam | Matthew Peters | Valentina Pyatkin | Abhilasha Ravichander | Dustin Schwenk | Saurabh Shah | William Smith | Emma Strubell | Nishant Subramani | Mitchell Wortsman | Pradeep Dasigi | Nathan Lambert | Kyle Richardson | Luke Zettlemoyer | Jesse Dodge | Kyle Lo | Luca Soldaini | Noah Smith | Hannaneh Hajishirzi
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Ahmet Üstün | Viraat Aryabumi | Zheng Yong | Wei-Yin Ko | Daniel D’souza | Gbemileke Onilude | Neel Bhandari | Shivalika Singh | Hui-Lee Ooi | Amr Kayid | Freddie Vargus | Phil Blunsom | Shayne Longpre | Niklas Muennighoff | Marzieh Fadaee | Julia Kreutzer | Sara Hooker
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2023
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
Zheng Xin Yong | Hailey Schoelkopf | Niklas Muennighoff | Alham Fikri Aji | David Ifeoluwa Adelani | Khalid Almubarak | M Saiful Bari | Lintang Sutawika | Jungo Kasai | Ahmed Baruwa | Genta Winata | Stella Biderman | Edward Raff | Dragomir Radev | Vassilina Nikoulina
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Crosslingual Generalization through Multitask Finetuning
Niklas Muennighoff | Thomas Wang | Lintang Sutawika | Adam Roberts | Stella Biderman | Teven Le Scao | M Saiful Bari | Sheng Shen | Zheng Xin Yong | Hailey Schoelkopf | Xiangru Tang | Dragomir Radev | Alham Fikri Aji | Khalid Almubarak | Samuel Albanie | Zaid Alyafeai | Albert Webson | Edward Raff | Colin Raffel
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
FinGPT: Large Generative Models for a Small Language
Risto Luukkonen | Ville Komulainen | Jouni Luoma | Anni Eskelinen | Jenna Kanerva | Hanna-Mari Kupari | Filip Ginter | Veronika Laippala | Niklas Muennighoff | Aleksandra Piktus | Thomas Wang | Nouamane Tazi | Teven Scao | Thomas Wolf | Osma Suominen | Samuli Sairanen | Mikko Merioksa | Jyrki Heinonen | Aija Vahtola | Samuel Antao | Sampo Pyysalo
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
MTEB: Massive Text Embedding Benchmark
Niklas Muennighoff | Nouamane Tazi | Loic Magne | Nils Reimers
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
2022
What Language Model to Train if You Have One Million GPU Hours?
Teven Le Scao | Thomas Wang | Daniel Hesslow | Stas Bekman | M Saiful Bari | Stella Biderman | Hady Elsahar | Niklas Muennighoff | Jason Phang | Ofir Press | Colin Raffel | Victor Sanh | Sheng Shen | Lintang Sutawika | Jaesung Tae | Zheng Xin Yong | Julien Launay | Iz Beltagy
Findings of the Association for Computational Linguistics: EMNLP 2022
Co-authors
- Zheng Xin Yong 4
- Alham Fikri Aji 3
- M Saiful Bari 3
- Lintang Sutawika 3
- Stella Biderman 3
- Thomas Wang 3
- Iz Beltagy 3
- Sebastian Ruder 2
- Genta Indra Winata 2
- Hailey Schoelkopf 2
- Khalid Almubarak 2
- Edward Raff 2
- Dragomir Radev 2
- Teven Le Scao 2
- Sheng Shen 2
- Zaid Alyafeai 2
- Colin Raffel 2
- Nouamane Tazi 2
- Shivalika Singh 2
- Freddie Vargus 2
- Daniel D’souza 2
- Wei-Yin Ko 2
- Julia Kreutzer 2
- Ahmet Üstün 2
- Marzieh Fadaee 2
- Sara Hooker 2
- Luca Soldaini 2
- Rodney Kinney 2
- Akshita Bhagia 2
- Dustin Schwenk 2
- David Atkinson 2
- Russell Authur 2
- Khyathi Chandu 2
- Jennifer Dumas 2
- Yanai Elazar 2
- Ananya Jha 2
- Nathan Lambert 2
- Ian Magnusson 2
- Jacob Morrison 2
- Aakanksha Naik 2
- Crystal Nam 2
- Matthew E. Peters 2
- Abhilasha Ravichander 2
- Kyle Richardson 2
- Emma Strubell 2
- Nishant Subramani 2
- Oyvind Tafjord 2
- Evan Walsh 2
- Luke Zettlemoyer 2
- Noah A. Smith 2
- Hannaneh Hajishirzi 2
- Dirk Groeneveld 2
- Jesse Dodge 2
- Kyle Lo 2
- Holy Lovenia 1
- Rahmad Mahendra 1
- Salsabil Maulana Akbar 1
- Lester James Validad Miranda 1
- Jennifer Santoso 1
- Elyanah Aco 1
- Akhdan Fadhilah 1
- Jonibek Mansurov 1
- Joseph Marvin Imperial 1
- Onno P. Kampman 1
- Joel Ruben Antony Moniz 1
- Muhammad Ravi Shulthan Habibi 1
- Frederikus Hudi 1
- Jann Railey Montalan 1
- Ryan Ignatius Hadiwijaya 1
- Joanito Agili Lopo 1
- William Nixon 1
- Börje F. Karlsson 1
- James Jaya 1
- Ryandito Diandaru 1
- Yuze Gao 1
- Patrick Amadeus Irawan 1
- Bin Wang 1
- Jan Christian Blaise Cruz 1
- Chenxi Whitehouse 1
- Ivan Halim Parmonangan 1
- Maria Khelli 1
- Wenyu Zhang 1
- Lucky Susanto 1
- Reynard Adha Ryanda 1
- Sonny Lazuardi Hermawan 1
- Dan John Velasco 1
- Muhammad Dehan Al Kautsar 1
- Willy Fitra Hendria 1
- Yasmin Moslem 1
- Noah Flynn 1
- Muhammad Farid Adilazuarda 1
- Haochen Li 1
- Johanes Lee 1
- R. Damanhuri 1
- Shuo Sun 1
- Muhammad Reza Qorib 1
- Amirbek Djanibekov 1
- Wei Qi Leong 1
- Quyet V. Do 1
- Tanrada Pansuwan 1
- Ilham Firdausi Putra 1
- Yan Xu 1
- Tai Ngee Chia 1
- Ayu Purwarianti 1
- William Chandra Tjhi 1
- Peerat Limkonchotiwat 1
- Sedrick Keh 1
- Ruochen Zhang 1
- Fajri Koto 1
- Samuel Cahyawijaya 1
- David Ifeoluwa Adelani 1
- Jungo Kasai 1
- Ahmed Baruwa 1
- Vassilina Nikoulina 1
- Adam Roberts 1
- Xiangru Tang 1
- Samuel Albanie 1
- Albert Webson 1
- Risto Luukkonen 1
- Ville Komulainen 1
- Jouni Luoma 1
- Anni Eskelinen 1
- Jenna Kanerva 1
- Hanna-Mari Kupari 1
- Filip Ginter 1
- Veronika Laippala 1
- Aleksandra Piktus 1
- Teven Scao 1
- Thomas Wolf 1
- Osma Suominen 1
- Samuli Sairanen 1
- Mikko Merioksa 1
- Jyrki Heinonen 1
- Aija Vahtola 1
- Samuel Antao 1
- Sampo Pyysalo 1
- Loic Magne 1
- Nils Reimers 1
- Daniel Hesslow 1
- Stas Bekman 1
- Hady Elsahar 1
- Jason Phang 1
- Ofir Press 1
- Victor Sanh 1
- Jaesung Tae 1
- Julien Launay 1
- Börje Karlsson 1
- Abinaya Mahendiran 1
- Herumb Shandilya 1
- Jay Patel 1
- Deividas Mataciunas 1
- Laura O’Mahony 1
- Mike Zhang 1
- Ramith Hettiarachchi 1
- Joseph Wilson 1
- Marina Machado 1
- Luisa Moura 1
- Dominik Krzemiński 1
- Hakimeh Fadaei 1
- Irem Ergun 1
- Ifeoma Okoh 1
- Aisha Alaagib 1
- Oshan Mudannayake 1
- Vu Chien 1
- Surya Guthikonda 1
- Emad Alghamdi 1
- Sebastian Gehrmann 1
- Max Bartolo 1
- Ben Bogin 1
- Valentin Hofmann 1
- Sachin Kumar 1
- Li Lucy 1
- Xinxi Lyu 1
- Zejiang Shen 1
- Hamish Ivison 1
- Yizhong Wang 1
- Shane Arora 1
- Arman Cohan 1
- Yuling Gu 1
- Jack Hessel 1
- Tushar Khot 1
- William Merrill 1
- Valentina Pyatkin 1
- Saurabh Shah 1
- William Smith 1
- Mitchell Wortsman 1
- Pradeep Dasigi 1
- Viraat Aryabumi 1
- Zheng Yong 1
- Gbemileke Onilude 1
- Neel Bhandari 1
- Hui-Lee Ooi 1
- Amr Kayid 1
- Phil Blunsom 1
- Shayne Longpre 1