Byte Pair Encoding for Symbolic Music

Nathan Fradet; Nicolas Gutowski; Fabien Chhel; Jean-Pierre Briot

doi:10.18653/v1/2023.emnlp-main.123

Abstract

When used with deep learning, the symbolic music modality is often coupled with language model architectures. To do so, the music needs to be tokenized, i.e. converted into a sequence of discrete tokens. This can be achieved in different ways, as music can consist of simultaneous tracks and of simultaneous notes, each with several attributes. Until now, the proposed tokenizations have relied on small vocabularies of tokens describing the note attributes and time events, resulting in fairly long token sequences and a sub-optimal use of the embedding space of language models. Recent research has sought to reduce the overall sequence length by merging embeddings or combining tokens. In this paper, we show that Byte Pair Encoding (BPE), a compression technique widely used for natural language, significantly decreases the sequence length while increasing the vocabulary size. By doing so, we leverage the embedding capabilities of such models with more expressive tokens, resulting in both better results and faster inference in generation and classification tasks. The [source code is shared on GitHub](https://github.com/Natooz/bpe-symbolic-music), along with a [companion website](https://Natooz.github.io/BPE-Symbolic-Music). Finally, BPE is directly implemented in [MidiTok](https://github.com/Natooz/MidiTok), allowing the reader to easily benefit from this method.
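
To make the trade-off between sequence length and vocabulary size concrete, the following is a minimal sketch of BPE applied to a toy token sequence. It is not the authors' implementation and does not use the MidiTok API: the token names (`Pitch_60`, `Vel_80`, `Dur_1`, `Pitch_64`), the helper functions, and the `+` joiner for merged tokens are all hypothetical and chosen for illustration only.

```python
from collections import Counter


def most_frequent_pair(seq):
    """Return the most frequent adjacent token pair, or None if no pair repeats."""
    counts = Counter(zip(seq, seq[1:]))
    if not counts:
        return None
    pair, freq = counts.most_common(1)[0]
    return pair if freq > 1 else None


def merge_pair(seq, pair, new_token):
    """Replace each occurrence of `pair` with `new_token`, scanning left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(seq, num_merges):
    """Learn up to `num_merges` merges; return the compressed sequence and the merges."""
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(seq)
        if pair is None:  # nothing left worth merging
            break
        seq = merge_pair(seq, pair, pair[0] + "+" + pair[1])
        merges.append(pair)
    return seq, merges


# Hypothetical note-attribute tokens: three notes, two of them identical.
tokens = [
    "Pitch_60", "Vel_80", "Dur_1",
    "Pitch_64", "Vel_80", "Dur_1",
    "Pitch_60", "Vel_80", "Dur_1",
]
base_vocab = set(tokens)

compressed, merges = learn_bpe(tokens, num_merges=10)

print(len(tokens), "->", len(compressed))                     # 9 -> 4
print(len(base_vocab), "->", len(base_vocab) + len(merges))   # 4 -> 6
```

In this toy case two merges shorten the sequence from 9 to 4 tokens while the vocabulary grows from 4 to 6 entries, which is the effect the abstract describes at a much larger scale.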

Anthology ID: 2023.emnlp-main.123
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 2001–2020
URL: https://aclanthology.org/2023.emnlp-main.123
DOI: 10.18653/v1/2023.emnlp-main.123
Cite (ACL): Nathan Fradet, Nicolas Gutowski, Fabien Chhel, and Jean-Pierre Briot. 2023. Byte Pair Encoding for Symbolic Music. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 2001–2020, Singapore. Association for Computational Linguistics.
Cite (Informal): Byte Pair Encoding for Symbolic Music (Fradet et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.123.pdf
Video: https://aclanthology.org/2023.emnlp-main.123.mp4
