%0 Conference Proceedings
%T Multiclass Text Classification on Unbalanced, Sparse and Noisy Data
%A Damaschk, Matthias
%A Dönicke, Tillmann
%A Lux, Florian
%Y Nivre, Joakim
%Y Derczynski, Leon
%Y Ginter, Filip
%Y Lindi, Bjørn
%Y Oepen, Stephan
%Y Søgaard, Anders
%Y Tiedemann, Jörg
%S Proceedings of the First NLPL Workshop on Deep Learning for Natural Language Processing
%D 2019
%8 September
%I Linköping University Electronic Press
%C Turku, Finland
%F damaschk-etal-2019-multiclass
%X This paper discusses methods to improve the performance of text classification on data that is difficult to classify due to a large number of unbalanced classes with noisy examples. A variety of features are tested, in combination with three different neural-network-based methods with increasing complexity. The classifiers are applied to a songtext–artist dataset which is large, unbalanced and noisy. We come to the conclusion that substantial improvement can be obtained by removing unbalancedness and sparsity from the data. This fulfils a classification task unsatisfactorily; however, with contemporary methods, it is a practical step towards fairly satisfactory results.
%U https://aclanthology.org/W19-6207
%P 58-65