Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity
Po-Nien Kung | Sheng-Siang Yin | Yi-Cheng Chen | Tse-Hsuan Yang | Yun-Nung Chen
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Multi-task auxiliary learning utilizes a set of relevant auxiliary tasks to improve the performance of a primary task. A common usage is to manually select multiple auxiliary tasks for multi-task learning on all data, which raises two issues: (1) selecting beneficial auxiliary tasks for a primary task is nontrivial; (2) when the auxiliary datasets are large, training on all data becomes time-expensive and impractical. Therefore, this paper focuses on addressing these problems and proposes a time-efficient sampling method to select the data that is most relevant to the primary task. The proposed method allows us to only train on the most beneficial sub-datasets from the auxiliary tasks, achieving efficient multi-task auxiliary learning. The experiments on three benchmark datasets (RTE, MRPC, STS-B) show that our method significantly outperforms random sampling and ST-DNN. Also, by applying our method, the model can surpass fully-trained MT-DNN on RTE, MRPC, STS-B, using only 50%, 66%, and 1% of data, respectively.
Zero-Shot Rationalization by Multi-Task Transfer Learning from Question Answering
Po-Nien Kung | Tse-Hsuan Yang | Yi-Cheng Chen | Sheng-Siang Yin | Yun-Nung Chen
Findings of the Association for Computational Linguistics: EMNLP 2020
Extracting rationales can help human understand which information the model utilizes and how it makes the prediction towards better interpretability. However, annotating rationales requires much effort and only few datasets contain such labeled rationales, making supervised learning for rationalization difficult. In this paper, we propose a novel approach that leverages the benefits of both multi-task learning and transfer learning for generating rationales through question answering in a zero-shot fashion. For two benchmark rationalization datasets, the proposed method achieves comparable or even better performance of rationalization without any supervised signal, demonstrating the great potential of zero-shot rationalization for better interpretability.