Cross-Task Generalization Abilities of Large Language Models

Qinyuan Ye

doi:10.18653/v1/2024.naacl-srw.27

Abstract

Humans can learn a new language task efficiently from only a few examples by leveraging the knowledge and experience they gained when learning prior tasks. Enabling similar cross-task generalization abilities in NLP systems is fundamental to approaching the goal of general intelligence and expanding the reach of language technology in the future. In this thesis proposal, I will present my work on (1) benchmarking cross-task generalization abilities with diverse NLP tasks; (2) developing model architectures for improving cross-task generalization abilities; and (3) analyzing and predicting the generalization landscape of current state-of-the-art large language models. Additionally, I will outline future research directions, along with preliminary thoughts on addressing them.

Anthology ID: 2024.naacl-srw.27
Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Yang (Trista) Cao, Isabel Papadimitriou, Anaelia Ovalle, Marcos Zampieri, Francis Ferraro, Swabha Swayamdipta
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 255–262
URL: https://aclanthology.org/2024.naacl-srw.27/
DOI: 10.18653/v1/2024.naacl-srw.27
Cite (ACL): Qinyuan Ye. 2024. Cross-Task Generalization Abilities of Large Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop), pages 255–262, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): Cross-Task Generalization Abilities of Large Language Models (Ye, NAACL 2024)
PDF: https://aclanthology.org/2024.naacl-srw.27.pdf
