2026
Compressing Language Models for Specialized Domains
Miles Williams | George Chrysostomou | Vitor Amancio Jeronymo | Nikolaos Aletras
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Language models (LMs) excel at tasks across diverse domains, yet demand substantial computational resources during inference. Compression techniques such as pruning and quantization offer a practical path towards efficient LM deployment, as evidenced by their ability to preserve performance on general-purpose benchmarks. However, general-purpose LM compression methods can degrade performance in specialized domains (e.g., biomedical or legal). Recent work has sought to address this issue, but relies on a computationally expensive full-parameter fine-tuning pipeline. To address this limitation, we propose MixCal, a novel calibration method designed to improve the in-domain performance of compressed LMs in a post-training setting. Through extensive experiments, we demonstrate that MixCal substantially outperforms existing approaches on domain-specific tasks while preserving general performance. Notably, these gains are achieved while also reducing the computational cost of LM compression.
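The abstract does not describe how MixCal itself works, so the following is not an implementation of the paper's method. It is only a general illustration of the kind of calibration-dependent post-training compression the abstract refers to: a Wanda-style pruning sketch in which weight importance is scored by weight magnitude times the norm of calibration activations, so the choice of calibration data directly shapes which weights survive. All names (`prune_layer`, `calib_acts`) and the toy sizes are hypothetical.

```python
import numpy as np

def prune_layer(weight, calib_acts, sparsity=0.5):
    """Wanda-style unstructured pruning sketch (not the paper's MixCal method).

    weight:     (out_features, in_features) weight matrix of a linear layer.
    calib_acts: (n_samples, in_features) input activations collected by
                running calibration data through the model.
    sparsity:   fraction of weights to zero out.
    """
    # Per-input-feature activation norm over the calibration samples.
    act_norm = np.linalg.norm(calib_acts, axis=0)          # (in_features,)
    # Importance score: |W_ij| * ||x_j||_2 (broadcast across output rows).
    importance = np.abs(weight) * act_norm
    # Zero out the k least important weights globally in this layer.
    k = int(weight.size * sparsity)
    threshold = np.partition(importance.ravel(), k)[k]
    mask = importance >= threshold
    return weight * mask, mask

# Toy usage: the calibration set (here random, in practice domain text
# activations) determines act_norm and hence the surviving weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(32, 16))  # stand-in for calibration activations
W_pruned, mask = prune_layer(W, X, sparsity=0.5)
```

Because the importance scores depend on `calib_acts`, swapping general-purpose calibration text for in-domain text changes the pruning decisions; improving how that calibration data is chosen or mixed is the lever a method like MixCal operates on.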