Huangpan Zhang
2019
ltl.uni-due at SemEval-2019 Task 5: Simple but Effective Lexico-Semantic Features for Detecting Hate Speech in Twitter
Huangpan Zhang | Michael Wojatzki | Tobias Horsmann | Torsten Zesch
Proceedings of the 13th International Workshop on Semantic Evaluation
In this paper, we present our contribution to SemEval 2019 Task 5, Multilingual Detection of Hate, specifically to Subtask A (English and Spanish). We compare different configurations of shallow and deep learning approaches on the English data and use the best-performing system in both sub-tasks. The resulting SVM-based system with lexico-semantic features (n-grams and embeddings) is ranked 23rd out of 69 on the English data and beats the baseline system. On the Spanish data, our system is ranked 25th out of 39.
2017
Projection of Argumentative Corpora from Source to Target Languages
Ahmet Aker | Huangpan Zhang
Proceedings of the 4th Workshop on Argument Mining
Argumentative corpora are costly to create and are available in only a few languages, with English dominating the area. In this paper, we release the first publicly available Mandarin argumentative corpus. The corpus is created by exploiting the idea of comparable corpora from Statistical Machine Translation. We use existing corpora in English and manually map the claims and premises to comparable corpora in Mandarin. We also implement a simple solution to automate this approach, with a view to creating argumentative corpora in other less-resourced languages. In this way, we introduce a new task of multilingual argument mapping that can be evaluated using our English-Mandarin argumentative corpus. The preliminary results of our automatic argument mapper mirror the simplicity of our approach but provide a baseline for further improvements.