oBERTa: Improving Sparse Transfer Learning via improved initialization, distillation, and pruning regimes
Daniel Campos | Alexandre Marques | Mark Kurtz | Cheng Xiang Zhai
Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP)