Title: Matrix and Double-Array Representations for Efficient Finite State Tokenization
Author: Nils Diewald
Date: 2022-06
Type: text (conference publication)
Proceedings: Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-10)
Editors: Piotr Banski, Adrien Barbaresi, Simon Clematide, Marc Kupietz, Harald Lüngen
Publisher: European Language Resources Association
Location: Marseille, France
Identifier: diewald-2022-matrix
URL: https://aclanthology.org/2022.cmlc-1.4/
Pages: 20-26