Marcello Hasegawa
2021
Privacy Regularization: Joint Privacy-Utility Optimization in Language Models
Fatemehsadat Mireshghallah | Huseyin Inan | Marcello Hasegawa | Victor Rühle | Taylor Berg-Kirkpatrick | Robert Sim
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Neural language models are known to have a high capacity for memorization of training samples. This may have serious privacy implications when training models on user content such as email correspondence. Differential privacy (DP), a popular choice to train models with privacy guarantees, comes with significant costs in terms of utility degradation and disparate impact on subgroups of users. In this work, we introduce two privacy-preserving regularization methods for training language models that enable joint optimization of utility and privacy through (1) the use of a discriminator and (2) the inclusion of a novel triplet-loss term. We compare our methods with DP through extensive evaluation. We show the advantages of our regularizers: a favorable utility-privacy trade-off, faster training with the ability to tap into existing optimization approaches, and uniform treatment of under-represented subgroups.
2020
Smart To-Do: Automatic Generation of To-Do Items from Emails
Sudipto Mukherjee | Subhabrata Mukherjee | Marcello Hasegawa | Ahmed Hassan Awadallah | Ryen White
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Intelligent features in email service applications aim to increase productivity by helping people organize their folders, compose their emails and respond to pending tasks. In this work, we explore a new application, Smart-To-Do, that helps users with task management over emails. We introduce a new task and dataset for automatically generating To-Do items from emails where the sender has promised to perform an action. We design a two-stage process leveraging recent advances in neural text generation and sequence-to-sequence learning, obtaining BLEU and ROUGE scores of 0.23 and 0.63 for this task. To the best of our knowledge, this is the first work to address the problem of composing To-Do items from emails.