Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification

Yingbo Gao, Christian Herold, Weiyue Wang, Hermann Ney


Abstract
Prominently used in support vector machines and logistic regressions, kernel functions (kernels) can implicitly map data points into high-dimensional spaces and make it easier to learn complex decision boundaries. In this work, by replacing the inner product function in the softmax layer, we explore the use of kernels for contextual word classification. In order to compare the individual kernels, experiments are conducted on standard language modeling and machine translation tasks. We observe a wide range of performances across different kernel settings. Extending the results, we look at the gradient properties, investigate various mixture strategies and examine the disambiguation abilities.
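The idea of swapping the inner product in the softmax layer for a kernel can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the choice of polynomial and RBF kernels, and the hyperparameters `gamma`, `degree` and `c` are assumptions made for illustration, since the abstract does not list the kernels or settings that were evaluated.

```python
import numpy as np


def linear_kernel(h, W):
    # Standard softmax logits: inner product of the context vector h
    # (shape [d]) with every word vector in W (shape [V, d]).
    return W @ h


def polynomial_kernel(h, W, degree=2, c=1.0):
    # Hypothetical polynomial kernel: (<w, h> + c) ** degree.
    return (W @ h + c) ** degree


def rbf_kernel(h, W, gamma=1.0):
    # Hypothetical RBF kernel: exp(-gamma * ||w - h||^2) for each word vector w.
    sq_dist = np.sum((W - h) ** 2, axis=1)
    return np.exp(-gamma * sq_dist)


def kernel_softmax(h, W, kernel=linear_kernel, **kernel_args):
    # Compute kernel scores in place of inner-product logits,
    # then normalize them with a numerically stable softmax.
    scores = kernel(h, W, **kernel_args)
    scores = scores - np.max(scores)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.normal(size=8)          # context vector (assumed dimension 8)
    W = rng.normal(size=(5, 8))     # word vectors for a toy vocabulary of 5
    for k in (linear_kernel, polynomial_kernel, rbf_kernel):
        print(k.__name__, kernel_softmax(h, W, kernel=k))
```

With `linear_kernel` the function reduces to the ordinary softmax layer, so any other kernel can be compared against it by changing only the `kernel` argument.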
Anthology ID:
2019.iwslt-1.24
Volume:
Proceedings of the 16th International Conference on Spoken Language Translation
Month:
November 2-3
Year:
2019
Address:
Hong Kong
Editors:
Jan Niehues, Rolando Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loic Barrault, Lucia Specia, Marcello Federico
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2019.iwslt-1.24
Cite (ACL):
Yingbo Gao, Christian Herold, Weiyue Wang, and Hermann Ney. 2019. Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification. In Proceedings of the 16th International Conference on Spoken Language Translation, Hong Kong. Association for Computational Linguistics.
Cite (Informal):
Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification (Gao et al., IWSLT 2019)
PDF:
https://aclanthology.org/2019.iwslt-1.24.pdf