Shuxi Guo


2025

RecStream: Graph-aware Stream Management for Concurrent Recommendation Model Online Serving
Shuxi Guo | Qi Qi | Haifeng Sun | Jianxin Liao | Jingyu Wang
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track

Recommendation Models (RMs) are crucial for predicting user preferences and enhancing personalized experiences on large-scale platforms. As the application of recommendation models grows, optimizing their online serving performance has become a significant challenge. However, current serving systems perform poorly in highly concurrent scenarios. To address this, we introduce RecStream, a system that optimizes stream configurations based on model characteristics to handle highly concurrent requests. We employ a hybrid Graph Neural Network architecture to determine the best configurations for various RMs. Experimental results demonstrate that RecStream achieves significant performance improvements, reducing latency by up to 74%.