Shuxi Guo
2025
RecStream: Graph-aware Stream Management for Concurrent Recommendation Model Online Serving
Shuxi Guo | Qi Qi | Haifeng Sun | Jianxin Liao | Jingyu Wang
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Recommendation Models (RMs) are crucial for predicting user preferences and enhancing personalized experiences on large-scale platforms. As RMs are deployed more widely, optimizing their online serving performance has become a significant challenge, and current serving systems perform poorly under high concurrency. To address this, we introduce RecStream, a system that optimizes stream configurations based on model characteristics to handle high-concurrency requests. We employ a hybrid Graph Neural Network architecture to determine the best configurations for various RMs. Experimental results demonstrate that RecStream achieves significant performance improvements, reducing latency by up to 74%.