Peek2: Regex-free Byte-level Byte-Pair Encoding Pretokenizer for LLM Inference on Edge Devices

Liu Zai; Iraklis A. Klampanos


Abstract

Pretokenization is a crucial, sequential pass in Byte-level BPE tokenizers, yet little work has been done to optimize it for edge-side inference. We propose Peek2, a drop-in replacement for the cl100k-like pretokenizers used in GPT-3, LLaMa-3, and Qwen-2.5. After breaking down and analyzing the logic of the original cl100k pretokenizer, we introduce a new pretokenization algorithm with linear time complexity and constant, negligible memory usage, suited to edge scenarios. Test results show that, depending on the dataset, it increases microbenchmark throughput by up to 2.48× and delivers a 1.14× improvement in overall throughput across the entire Byte-level BPE encoding process, while producing results identical to those of the baseline regex-based tokenizer.
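To make the idea of a regex-free pretokenizer concrete, the following is a minimal sketch and not the Peek2 algorithm itself, whose details are not given on this page. It assumes a simplified, ASCII-only variant of the cl100k split pattern with the contraction rule omitted; the names `REFERENCE`, `kind`, and `pretokenize` are hypothetical. The scanner looks at the current character and at most one character ahead, keeps only a few integer indices as state, and is checked against the regular expression it stands in for.

```python
import re

# Assumed reference: an ASCII-only simplification of the cl100k split pattern
# (contractions omitted). Not the exact pattern used by any production tokenizer.
REFERENCE = re.compile(
    r"[^\r\nA-Za-z0-9]?[A-Za-z]+|[0-9]{1,3}| ?[^\sA-Za-z0-9]+[\r\n]*"
    r"|\s*[\r\n]+|\s+(?!\S)|\s+",
    re.ASCII,
)

LETTER, DIGIT, NEWLINE, SPACE, OTHER, END = range(6)


def kind(c):
    """Classify one character; the empty string stands for end of input."""
    if c == "":
        return END
    if "a" <= c <= "z" or "A" <= c <= "Z":
        return LETTER
    if "0" <= c <= "9":
        return DIGIT
    if c in ("\r", "\n"):
        return NEWLINE
    if c in (" ", "\t", "\f", "\v"):
        return SPACE
    return OTHER


def pretokenize(s):
    """Yield pretokens in one left-to-right pass using index-only state."""
    i, n = 0, len(s)
    while i < n:
        k0 = kind(s[i])
        k1 = kind(s[i + 1]) if i + 1 < n else END
        j = i + 1
        if k0 == LETTER or (k1 == LETTER and k0 in (SPACE, OTHER)):
            # Optional one-character prefix followed by a run of letters.
            while j < n and kind(s[j]) == LETTER:
                j += 1
        elif k0 == DIGIT:
            # Digits are emitted in groups of at most three.
            while j < n and j - i < 3 and kind(s[j]) == DIGIT:
                j += 1
        elif k0 == OTHER or (s[i] == " " and k1 == OTHER):
            # Optional space, a run of punctuation, then trailing newlines.
            while j < n and kind(s[j]) == OTHER:
                j += 1
            while j < n and kind(s[j]) == NEWLINE:
                j += 1
        else:
            # Whitespace run: cut after its last newline if it has one;
            # otherwise leave the final space for the following token.
            j, last_newline = i, -1
            while j < n and kind(s[j]) in (SPACE, NEWLINE):
                if kind(s[j]) == NEWLINE:
                    last_newline = j
                j += 1
            if last_newline >= 0:
                j = last_newline + 1
            elif j < n and j - i > 1:
                j -= 1
        yield s[i:j]
        i = j


if __name__ == "__main__":
    text = "Hello, world!  123456\n\n  foo"
    print(list(pretokenize(text)))
    print(list(pretokenize(text)) == REFERENCE.findall(text))
```

Because the generator yields slices as it goes, memory use beyond the input and the current token stays constant, and each whitespace run is rescanned at most a bounded number of times, so the pass remains linear in the input length under these assumptions.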

Anthology ID: 2026.acl-srw.10
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 123–128
URL: https://aclanthology.org/2026.acl-srw.10/
Cite (ACL): Liu Zai and Iraklis A. Klampanos. 2026. Peek2: Regex-free Byte-level Byte-Pair Encoding Pretokenizer for LLM Inference on Edge Devices. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 123–128, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Peek2: Regex-free Byte-level Byte-Pair Encoding Pretokenizer for LLM Inference on Edge Devices (Zai & Klampanos, ACL 2026)
PDF: https://aclanthology.org/2026.acl-srw.10.pdf
