Title: Improving Document Clustering by Removing Unnatural Language
Authors: Myungha Jang, Jinho D Choi, James Allan
In: Proceedings of the 3rd Workshop on Noisy User-generated Text
Editors: Leon Derczynski, Wei Xu, Alan Ritter, Tim Baldwin
Publisher: Association for Computational Linguistics
Location: Copenhagen, Denmark
Date: 2017-09
Type: text (conference publication)
Pages: 122-130
Anthology ID: jang-etal-2017-improving
DOI: 10.18653/v1/W17-4416
URL: https://aclanthology.org/W17-4416/