Improving Tokenisation by Alternative Treatment of Spaces

Edward Gow-Smith, Harish Tayyar Madabushi, Carolina Scarton, Aline Villavicencio


Anthology ID:
2022.emnlp-main.786
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
11430–11443
URL:
https://aclanthology.org/2022.emnlp-main.786
DOI:
10.18653/v1/2022.emnlp-main.786
Cite (ACL):
Edward Gow-Smith, Harish Tayyar Madabushi, Carolina Scarton, and Aline Villavicencio. 2022. Improving Tokenisation by Alternative Treatment of Spaces. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11430–11443, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Improving Tokenisation by Alternative Treatment of Spaces (Gow-Smith et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.786.pdf