Deep Generalized Canonical Correlation Analysis

Adrian Benton, Huda Khayrallah, Biman Gujral, Dee Ann Reisinger, Sheng Zhang, Raman Arora


Abstract
We present Deep Generalized Canonical Correlation Analysis (DGCCA) – a method for learning nonlinear transformations of arbitrarily many views of data, such that the resulting transformations are maximally informative of each other. While methods exist for nonlinear two-view representation learning (Deep CCA; Andrew et al., 2013) and for linear many-view representation learning (Generalized CCA; Horst, 1961), DGCCA combines the flexibility of nonlinear (deep) representation learning with the statistical power of incorporating information from many sources, or views. We present the DGCCA formulation as well as an efficient stochastic optimization algorithm for solving it. We learn and evaluate DGCCA representations on three downstream tasks: phonetic transcription from acoustic and articulatory measurements, hashtag recommendation, and friend recommendation on a dataset of Twitter users.
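As a rough illustration of the linear many-view baseline mentioned in the abstract, the sketch below implements one common formulation of Generalized CCA (often called MAX-VAR) in NumPy. It is not the authors' code (their implementation is in adrianbenton/dgcca-py3). The function name `linear_gcca`, the ridge term `reg`, and the synthetic three-view data are assumptions made for this example. The sketch finds a shared representation `G` with orthonormal columns and per-view linear maps `U[j]` such that each projected view `X_j @ U[j]` is close to `G`. A deep variant would presumably apply the same idea to the outputs of per-view neural networks rather than to the raw views.

```python
import numpy as np


def linear_gcca(views, k, reg=1e-8):
    """Linear GCCA (MAX-VAR style) sketch.

    views: list of arrays, each of shape (n_samples, d_j)
    k:     dimensionality of the shared representation
    reg:   small ridge term for numerical stability (assumed value)

    Returns G of shape (n_samples, k) with orthonormal columns and a
    list U of per-view maps, U[j] of shape (d_j, k).
    """
    n = views[0].shape[0]
    # Center every view column-wise.
    centered = [X - X.mean(axis=0, keepdims=True) for X in views]

    M = np.zeros((n, n))
    solves = []
    for X in centered:
        C = X.T @ X + reg * np.eye(X.shape[1])  # regularized scatter matrix
        A = np.linalg.solve(C, X.T)             # (d_j, n): C^{-1} X^T
        solves.append(A)
        M += X @ A                              # projection onto the view's column space

    M = (M + M.T) / 2.0                         # remove floating-point asymmetry
    _, eigvecs = np.linalg.eigh(M)              # eigenvalues in ascending order
    G = eigvecs[:, ::-1][:, :k]                 # top-k eigenvectors
    U = [A @ G for A in solves]                 # least-squares map from each view to G
    return G, U


# Synthetic example: three views generated from a shared 2-d latent signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
dims = (5, 7, 4)
views = [z @ rng.normal(size=(2, d)) + 0.01 * rng.normal(size=(200, d)) for d in dims]
G, U = linear_gcca(views, k=2)
```

This version builds an n-by-n matrix, so it only suits small sample sizes; it is meant to show the shape of the computation, not the stochastic optimization algorithm the abstract refers to.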
Anthology ID:
W19-4301
Volume:
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Johannes Welbl, Alexis Conneau, Xiang Ren, Marek Rei
Venue:
RepL4NLP
SIG:
SIGREP
Publisher:
Association for Computational Linguistics
Pages:
1–6
URL:
https://aclanthology.org/W19-4301
DOI:
10.18653/v1/W19-4301
Cite (ACL):
Adrian Benton, Huda Khayrallah, Biman Gujral, Dee Ann Reisinger, Sheng Zhang, and Raman Arora. 2019. Deep Generalized Canonical Correlation Analysis. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), pages 1–6, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Deep Generalized Canonical Correlation Analysis (Benton et al., RepL4NLP 2019)
PDF:
https://aclanthology.org/W19-4301.pdf
Code:
adrianbenton/dgcca-py3