An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models

Shudong Hao, Michael J. Paul


Abstract
Probabilistic topic modeling is a common first step in crosslingual tasks to enable knowledge transfer and extract multilingual features. Although many multilingual topic models have been developed, their assumptions about the training corpus are quite varied, and it is not clear how well the different models can be utilized under various training conditions. In this article, the knowledge transfer mechanisms behind different multilingual topic models are systematically studied, and through a broad set of experiments with four models on ten languages, we provide empirical insights that can inform the selection and future development of multilingual topic models.
Anthology ID:
2020.cl-1.3
Volume:
Computational Linguistics, Volume 46, Issue 1 - March 2020
Year:
2020
Address:
Cambridge, MA
Venue:
CL
Publisher:
MIT Press
Pages:
95–134
URL:
https://aclanthology.org/2020.cl-1.3
DOI:
10.1162/coli_a_00369
Cite (ACL):
Shudong Hao and Michael J. Paul. 2020. An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models. Computational Linguistics, 46(1):95–134.
Cite (Informal):
An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models (Hao & Paul, CL 2020)
PDF:
https://aclanthology.org/2020.cl-1.3.pdf