A Novel Matching Paradigm: Unified Generative and Discriminative LLM with Prompt Compression for Relevance Learning

Guoliang Zhao; Zixin Cui; Chao Ye; Dengwu He; Fei Huang; Yubo Liu; Shuanglong Li; Tzungren Kuo; Bin Ding; Shuang Zhang; KunhongZhu; Zhi Guo; Liu Lin

Abstract

The matching paradigm is fundamental to large-scale information retrieval and is widely used in industrial search and advertising systems. Existing approaches employ Large Language Models (LLMs) primarily as feature extractors, underutilizing their full modeling capabilities. To address this limitation, we propose a novel matching paradigm, termed the Unified Generative and Discriminative LLM (UGD). It integrates two-tower, single-tower, and generative tasks within a unified LLM framework via attention-mask partitioning, enabling generative tasks to serve as auxiliary supervision for discriminative learning and facilitating distillation from single-tower to two-tower architectures through a multi-task fine-tuning mechanism. To satisfy online latency constraints, we further introduce a self-distillation variant of UGD with a KMeans-enhanced linearized RQVAE for prompt compression and quantization. This design compresses and quantizes landing-page documents during inference, improving serving efficiency and reducing storage overhead. Extensive experiments show that UGD achieves superior performance and strong practical value. The framework has been deployed in an industrial search engine serving hundreds of millions of users and hundreds of thousands of advertisers, significantly enhancing the search experience. Open access upon publication.

Anthology ID: 2026.acl-industry.122
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Yunyao Li, Georg Rehm, Mei Tu
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 1783–1794
URL: https://aclanthology.org/2026.acl-industry.122/
Cite (ACL): Guoliang Zhao, Zixin Cui, Chao Ye, Dengwu He, Fei Huang, Yubo Liu, Shuanglong Li, Tzungren Kuo, Bin Ding, Shuang Zhang, KunhongZhu, Zhi Guo, and Liu Lin. 2026. A Novel Matching Paradigm: Unified Generative and Discriminative LLM with Prompt Compression for Relevance Learning. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 1783–1794, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): A Novel Matching Paradigm: Unified Generative and Discriminative LLM with Prompt Compression for Relevance Learning (Zhao et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-industry.122.pdf
