Junyi Chai


2022

DeepGen: Diverse Search Ad Generation and Real-Time Customization
Konstantin Golobokov | Junyi Chai | Victor Ye Dong | Mandy Gu | Bingyu Chi | Jie Cao | Yulan Yan | Yi Liu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Demo: https://youtu.be/WQLL93TPB-c

Abstract: We present DeepGen, a system deployed at web scale for automatically creating sponsored search advertisements (ads) for BingAds customers. We leverage state-of-the-art natural language generation (NLG) models to generate fluent ads from advertiser’s web pages in an abstractive fashion and solve practical issues such as factuality and inference speed. In addition, our system creates a customized ad in real-time in response to the user’s search query, therefore highlighting different aspects of the same product based on what the user is looking for. To achieve this, our system generates a diverse choice of smaller pieces of the ad ahead of time and, at query time, selects the most relevant ones to be stitched into a complete ad. We improve generation diversity by training a controllable NLG model to generate multiple ads for the same web page highlighting different selling points. Our system design further improves diversity horizontally by first running an ensemble of generation models trained with different objectives and then using a diversity sampling algorithm to pick a diverse subset of generation results for online selection. Experimental results show the effectiveness of our proposed system design. Our system is currently deployed in production, serving ~4% of global ads served in Bing.

2021

Automatic Construction of Enterprise Knowledge Base
Junyi Chai | Yujie He | Homa Hashemi | Bing Li | Daraksha Parveen | Ranganath Kondapally | Wenjin Xu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

In this paper, we present a system that automatically constructs a knowledge base from large-scale enterprise documents with minimal human intervention. In designing and deploying such a knowledge mining system for enterprise use, we faced several challenges, including data distributional shift, performance evaluation, compliance requirements, and other practical issues. We leveraged state-of-the-art deep learning models to extract information (named entities and definitions) at the per-document level, then applied classical machine learning techniques to process global statistical information and improve the knowledge base. Experimental results are reported on actual enterprise documents. This system is currently serving as part of a Microsoft 365 service.