Niklas Muennighoff
2024
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
Shivalika Singh | Freddie Vargus | Daniel D’souza | Börje Karlsson | Abinaya Mahendiran | Wei-Yin Ko | Herumb Shandilya | Jay Patel | Deividas Mataciunas | Laura O’Mahony | Mike Zhang | Ramith Hettiarachchi | Joseph Wilson | Marina Machado | Luisa Moura | Dominik Krzemiński | Hakimeh Fadaei | Irem Ergun | Ifeoma Okoh | Aisha Alaagib | Oshan Mudannayake | Zaid Alyafeai | Vu Chien | Sebastian Ruder | Surya Guthikonda | Emad Alghamdi | Sebastian Gehrmann | Niklas Muennighoff | Max Bartolo | Julia Kreutzer | Ahmet Üstün | Marzieh Fadaee | Sara Hooker
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Luca Soldaini | Rodney Kinney | Akshita Bhagia | Dustin Schwenk | David Atkinson | Russell Authur | Ben Bogin | Khyathi Chandu | Jennifer Dumas | Yanai Elazar | Valentin Hofmann | Ananya Jha | Sachin Kumar | Li Lucy | Xinxi Lyu | Nathan Lambert | Ian Magnusson | Jacob Morrison | Niklas Muennighoff | Aakanksha Naik | Crystal Nam | Matthew Peters | Abhilasha Ravichander | Kyle Richardson | Zejiang Shen | Emma Strubell | Nishant Subramani | Oyvind Tafjord | Evan Walsh | Luke Zettlemoyer | Noah Smith | Hannaneh Hajishirzi | Iz Beltagy | Dirk Groeneveld | Jesse Dodge | Kyle Lo
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld | Iz Beltagy | Evan Walsh | Akshita Bhagia | Rodney Kinney | Oyvind Tafjord | Ananya Jha | Hamish Ivison | Ian Magnusson | Yizhong Wang | Shane Arora | David Atkinson | Russell Authur | Khyathi Chandu | Arman Cohan | Jennifer Dumas | Yanai Elazar | Yuling Gu | Jack Hessel | Tushar Khot | William Merrill | Jacob Morrison | Niklas Muennighoff | Aakanksha Naik | Crystal Nam | Matthew Peters | Valentina Pyatkin | Abhilasha Ravichander | Dustin Schwenk | Saurabh Shah | William Smith | Emma Strubell | Nishant Subramani | Mitchell Wortsman | Pradeep Dasigi | Nathan Lambert | Kyle Richardson | Luke Zettlemoyer | Jesse Dodge | Kyle Lo | Luca Soldaini | Noah Smith | Hannaneh Hajishirzi
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Ahmet Üstün | Viraat Aryabumi | Zheng Yong | Wei-Yin Ko | Daniel D’souza | Gbemileke Onilude | Neel Bhandari | Shivalika Singh | Hui-Lee Ooi | Amr Kayid | Freddie Vargus | Phil Blunsom | Shayne Longpre | Niklas Muennighoff | Marzieh Fadaee | Julia Kreutzer | Sara Hooker
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Holy Lovenia | Rahmad Mahendra | Salsabil Maulana Akbar | Lester James Validad Miranda | Jennifer Santoso | Elyanah Aco | Akhdan Fadhilah | Jonibek Mansurov | Joseph Marvin Imperial | Onno P. Kampman | Joel Ruben Antony Moniz | Muhammad Ravi Shulthan Habibi | Frederikus Hudi | Jann Railey Montalan | Ryan Ignatius Hadiwijaya | Joanito Agili Lopo | William Nixon | Börje F. Karlsson | James Jaya | Ryandito Diandaru | Yuze Gao | Patrick Amadeus Irawan | Bin Wang | Jan Christian Blaise Cruz | Chenxi Whitehouse | Ivan Halim Parmonangan | Maria Khelli | Wenyu Zhang | Lucky Susanto | Reynard Adha Ryanda | Sonny Lazuardi Hermawan | Dan John Velasco | Muhammad Dehan Al Kautsar | Willy Fitra Hendria | Yasmin Moslem | Noah Flynn | Muhammad Farid Adilazuarda | Haochen Li | Johanes Lee | R. Damanhuri | Shuo Sun | Muhammad Reza Qorib | Amirbek Djanibekov | Wei Qi Leong | Quyet V. Do | Niklas Muennighoff | Tanrada Pansuwan | Ilham Firdausi Putra | Yan Xu | Tai Ngee Chia | Ayu Purwarianti | Sebastian Ruder | William Chandra Tjhi | Peerat Limkonchotiwat | Alham Fikri Aji | Sedrick Keh | Genta Indra Winata | Ruochen Zhang | Fajri Koto | Zheng Xin Yong | Samuel Cahyawijaya
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
2023
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
Zheng Xin Yong | Hailey Schoelkopf | Niklas Muennighoff | Alham Fikri Aji | David Ifeoluwa Adelani | Khalid Almubarak | M Saiful Bari | Lintang Sutawika | Jungo Kasai | Ahmed Baruwa | Genta Winata | Stella Biderman | Edward Raff | Dragomir Radev | Vassilina Nikoulina
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Crosslingual Generalization through Multitask Finetuning
Niklas Muennighoff | Thomas Wang | Lintang Sutawika | Adam Roberts | Stella Biderman | Teven Le Scao | M Saiful Bari | Sheng Shen | Zheng Xin Yong | Hailey Schoelkopf | Xiangru Tang | Dragomir Radev | Alham Fikri Aji | Khalid Almubarak | Samuel Albanie | Zaid Alyafeai | Albert Webson | Edward Raff | Colin Raffel
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
MTEB: Massive Text Embedding Benchmark
Niklas Muennighoff | Nouamane Tazi | Loic Magne | Nils Reimers
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
FinGPT: Large Generative Models for a Small Language
Risto Luukkonen | Ville Komulainen | Jouni Luoma | Anni Eskelinen | Jenna Kanerva | Hanna-Mari Kupari | Filip Ginter | Veronika Laippala | Niklas Muennighoff | Aleksandra Piktus | Thomas Wang | Nouamane Tazi | Teven Scao | Thomas Wolf | Osma Suominen | Samuli Sairanen | Mikko Merioksa | Jyrki Heinonen | Aija Vahtola | Samuel Antao | Sampo Pyysalo
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
2022
What Language Model to Train if You Have One Million GPU Hours?
Teven Le Scao | Thomas Wang | Daniel Hesslow | Stas Bekman | M Saiful Bari | Stella Biderman | Hady Elsahar | Niklas Muennighoff | Jason Phang | Ofir Press | Colin Raffel | Victor Sanh | Sheng Shen | Lintang Sutawika | Jaesung Tae | Zheng Xin Yong | Julien Launay | Iz Beltagy
Findings of the Association for Computational Linguistics: EMNLP 2022
Co-authors
- Zheng Xin Yong 4
- Alham Fikri Aji 3
- M Saiful Bari 3
- Iz Beltagy 3
- Stella Biderman 3
- Lintang Sutawika 3
- Thomas Wang 3
- Khalid Almubarak 2
- Zaid Alyafeai 2
- David Atkinson 2
- Russell Authur 2
- Akshita Bhagia 2
- Khyathi Chandu 2
- Jesse Dodge 2
- Jennifer Dumas 2
- Daniel D’souza 2
- Yanai Elazar 2
- Marzieh Fadaee 2
- Dirk Groeneveld 2
- Hannaneh Hajishirzi 2
- Sara Hooker 2
- Ananya Jha 2
- Rodney Kinney 2
- Wei-Yin Ko 2
- Julia Kreutzer 2
- Nathan Lambert 2
- Teven Le Scao 2
- Kyle Lo 2
- Ian Magnusson 2
- Jacob Morrison 2
- Aakanksha Naik 2
- Crystal Nam 2
- Matthew E. Peters 2
- Dragomir Radev 2
- Edward Raff 2
- Colin Raffel 2
- Abhilasha Ravichander 2
- Kyle Richardson 2
- Sebastian Ruder 2
- Hailey Schoelkopf 2
- Dustin Schwenk 2
- Sheng Shen 2
- Shivalika Singh 2
- Noah A. Smith 2
- Luca Soldaini 2
- Emma Strubell 2
- Nishant Subramani 2
- Oyvind Tafjord 2
- Nouamane Tazi 2
- Freddie Vargus 2
- Evan Walsh 2
- Genta Indra Winata 2
- Luke Zettlemoyer 2
- Ahmet Üstün 2
- Elyanah Aco 1
- David Ifeoluwa Adelani 1
- Muhammad Farid Adilazuarda 1
- Salsabil Maulana Akbar 1
- Muhammad Dehan Al Kautsar 1
- Aisha Alaagib 1
- Samuel Albanie 1
- Emad Alghamdi 1
- Samuel Antao 1
- Shane Arora 1
- Viraat Aryabumi 1
- Max Bartolo 1
- Ahmed Baruwa 1
- Stas Bekman 1
- Neel Bhandari 1
- Phil Blunsom 1
- Ben Bogin 1
- Samuel Cahyawijaya 1
- Tai Ngee Chia 1
- Vu Chien 1
- Arman Cohan 1
- Jan Christian Blaise Cruz 1
- R. Damanhuri 1
- Pradeep Dasigi 1
- Ryandito Diandaru 1
- Amirbek Djanibekov 1
- Quyet V. Do 1
- Hady Elsahar 1
- Irem Ergun 1
- Anni Eskelinen 1
- Hakimeh Fadaei 1
- Akhdan Fadhilah 1
- Noah Flynn 1
- Yuze Gao 1
- Sebastian Gehrmann 1
- Filip Ginter 1
- Yuling Gu 1
- Surya Guthikonda 1
- Muhammad Ravi Shulthan Habibi 1
- Ryan Ignatius Hadiwijaya 1
- Jyrki Heinonen 1
- Willy Fitra Hendria 1
- Sonny Lazuardi Hermawan 1
- Jack Hessel 1
- Daniel Hesslow 1
- Ramith Hettiarachchi 1
- Valentin Hofmann 1
- Frederikus Hudi 1
- Joseph Marvin Imperial 1
- Patrick Amadeus Irawan 1
- Hamish Ivison 1
- James Jaya 1
- Onno P. Kampman 1
- Jenna Kanerva 1
- Börje Karlsson 1
- Börje F. Karlsson 1
- Jungo Kasai 1
- Amr Kayid 1
- Sedrick Keh 1
- Maria Khelli 1
- Tushar Khot 1
- Ville Komulainen 1
- Fajri Koto 1
- Dominik Krzemiński 1
- Sachin Kumar 1
- Hanna-Mari Kupari 1
- Veronika Laippala 1
- Julien Launay 1
- Johanes Lee 1
- Wei Qi Leong 1
- Haochen Li 1
- Peerat Limkonchotiwat 1
- Shayne Longpre 1
- Joanito Agili Lopo 1
- Holy Lovenia 1
- Li Lucy 1
- Jouni Luoma 1
- Risto Luukkonen 1
- Xinxi Lyu 1
- Marina Machado 1
- Loic Magne 1
- Abinaya Mahendiran 1
- Rahmad Mahendra 1
- Jonibek Mansurov 1
- Deividas Mataciunas 1
- Mikko Merioksa 1
- William Merrill 1
- Lester James Validad Miranda 1
- Joel Ruben Antony Moniz 1
- Jann Railey Montalan 1
- Yasmin Moslem 1
- Luisa Moura 1
- Oshan Mudannayake 1
- Vassilina Nikoulina 1
- William Nixon 1
- Ifeoma Okoh 1
- Gbemileke Onilude 1
- Hui-Lee Ooi 1
- Laura O’Mahony 1
- Tanrada Pansuwan 1
- Ivan Halim Parmonangan 1
- Jay Patel 1
- Jason Phang 1
- Aleksandra Piktus 1
- Ofir Press 1
- Ayu Purwarianti 1
- Ilham Firdausi Putra 1
- Valentina Pyatkin 1
- Sampo Pyysalo 1
- Muhammad Reza Qorib 1
- Nils Reimers 1
- Adam Roberts 1
- Reynard Adha Ryanda 1
- Samuli Sairanen 1
- Victor Sanh 1
- Jennifer Santoso 1
- Teven Scao 1
- Saurabh Shah 1
- Herumb Shandilya 1
- Zejiang Shen 1
- William Smith 1
- Shuo Sun 1
- Osma Suominen 1
- Lucky Susanto 1
- Jaesung Tae 1
- Xiangru Tang 1
- William Chandra Tjhi 1
- Aija Vahtola 1
- Dan John Velasco 1
- Yizhong Wang 1
- Bin Wang 1
- Albert Webson 1
- Chenxi Whitehouse 1
- Joseph Wilson 1
- Thomas Wolf 1
- Mitchell Wortsman 1
- Yan Xu 1
- Zheng Yong 1
- Mike Zhang 1
- Wenyu Zhang 1
- Ruochen Zhang 1