Time is Encoded in the Weights of Finetuned Language Models

Kai Nylund; Suchin Gururangan; Noah A. Smith

doi:10.18653/v1/2024.acl-long.141

Abstract

We present time vectors, a simple tool to customize language models to new time periods. Time vectors are created by finetuning a language model on data from a single time (e.g., a year or month), and then subtracting the weights of the original pretrained model. This vector specifies a direction in weight space that, as our experiments show, improves performance on text from that time period. Time vectors specialized to adjacent time periods appear to be positioned closer together in a manifold. Using this structure, we interpolate between time vectors to induce new models that perform better on intervening and future time periods, without any additional training. We demonstrate the consistency of our findings across different tasks, domains, model sizes, and time scales. Our results suggest that time is encoded in the weight space of finetuned models.
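The weight arithmetic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`time_vector`, `interpolate`, `apply_vector`), the toy parameter values, and the linear mixing coefficient `alpha` are all assumptions made here for illustration, and plain Python lists stand in for real model tensors.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def time_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Subtract the pretrained weights from the finetuned weights, per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def interpolate(tau_a: Weights, tau_b: Weights, alpha: float) -> Weights:
    """Linear mix alpha * tau_a + (1 - alpha) * tau_b (assumed interpolation form)."""
    return {
        name: [alpha * a + (1.0 - alpha) * b for a, b in zip(tau_a[name], tau_b[name])]
        for name in tau_a
    }


def apply_vector(pretrained: Weights, tau: Weights) -> Weights:
    """Add a time vector back onto the pretrained weights to induce a new model."""
    return {
        name: [p + t for p, t in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Hypothetical weights for a pretrained model and two single-year finetunes.
pretrained = {"layer.weight": [1.0, 2.0, 3.0]}
finetuned_2015 = {"layer.weight": [1.5, 2.0, 2.0]}
finetuned_2017 = {"layer.weight": [2.5, 2.0, 4.0]}

tau_2015 = time_vector(finetuned_2015, pretrained)
tau_2017 = time_vector(finetuned_2017, pretrained)

# Induce a model for the intervening year without any further training.
tau_mid = interpolate(tau_2015, tau_2017, alpha=0.5)
model_2016 = apply_vector(pretrained, tau_mid)
```

With real checkpoints the same three operations would be applied to each tensor in the model's parameter dictionary; whether an even mix is the best choice for a given intervening period is an empirical question the sketch does not address.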

Anthology ID: 2024.acl-long.141
Volume: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 2571–2587
URL: https://aclanthology.org/2024.acl-long.141/
DOI: 10.18653/v1/2024.acl-long.141
Cite (ACL): Kai Nylund, Suchin Gururangan, and Noah Smith. 2024. Time is Encoded in the Weights of Finetuned Language Models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2571–2587, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Time is Encoded in the Weights of Finetuned Language Models (Nylund et al., ACL 2024)
PDF: https://aclanthology.org/2024.acl-long.141.pdf
