Title: Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability
Authors: Tyler A. Chang, Zhuowen Tu, Benjamin K. Bergen
Year: 2024
Type: journal article
Journal: Transactions of the Association for Computational Linguistics (academic journal, continuing periodical)
Publisher: MIT Press, Cambridge, MA
Volume: 12
Pages: 1346-1362
Citation key: chang-etal-2024-characterizing
DOI: 10.1162/tacl_a_00708
URL: https://aclanthology.org/2024.tacl-1.74/