@inproceedings{thompson-etal-2018-freezing,
    title = "Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation",
    author = "Thompson, Brian  and
      Khayrallah, Huda  and
      Anastasopoulos, Antonios  and
      McCarthy, Arya D.  and
      Duh, Kevin  and
      Marvin, Rebecca  and
      McNamee, Paul  and
      Gwinnup, Jeremy  and
      Anderson, Tim  and
      Koehn, Philipp",
    booktitle = "Proceedings of the Third Conference on Machine Translation: Research Papers",
    month = oct,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W18-6313",
    doi = "10.18653/v1/W18-6313",
    pages = "124--132",
    abstract = "To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component{'}s contribution to, and capacity for, domain adaptation. We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed. We also find that continued training does not move the model very far from the out-of-domain model, compared to a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain.",
}
