Piotr Andruszkiewicz


Multilingual Entity and Relation Extraction Dataset and Model
Alessandro Seganti | Klaudia Firląg | Helena Skowronska | Michał Satława | Piotr Andruszkiewicz
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

We present a novel dataset and model for Joint Entity and Relation Extraction in a multilingual setting. The SMiLER dataset consists of 1.1M annotated sentences, representing 36 relations and 14 languages. To the best of our knowledge, this is currently both the largest and the most comprehensive dataset of this type. We introduce HERBERTa, a pipeline that combines two independent BERT models: one for sequence classification and the other for entity tagging. The model achieves a micro F1 of 81.49 for English on this dataset, which is close to SpERT, the current SOTA on CoNLL.

SRPOL DIALOGUE SYSTEMS at SemEval-2021 Task 5: Automatic Generation of Training Data for Toxic Spans Detection
Michał Satława | Katarzyna Zamłyńska | Jarosław Piersa | Joanna Kolis | Klaudia Firląg | Katarzyna Beksa | Zuzanna Bordzicka | Christian Goltz | Paweł Bujnowski | Piotr Andruszkiewicz
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper presents a system used for SemEval-2021 Task 5: Toxic Spans Detection. Our system is an ensemble of BERT-based models for binary word classification, trained on a dataset extended by toxic comments modified and generated by two language models. For the toxic word classification, the prediction threshold value was optimized separately for every comment, in order to maximize the expected F1 value.


WUT at SemEval-2019 Task 9: Domain-Adversarial Neural Networks for Domain Adaptation in Suggestion Mining
Mateusz Klimaszewski | Piotr Andruszkiewicz
Proceedings of the 13th International Workshop on Semantic Evaluation

We present a system for cross-domain suggestion mining, prepared for SemEval-2019 Task 9: Suggestion Mining from Online Reviews and Forums (Subtask B). Our submitted solution for this text classification problem treats the different sources of suggestions as a Domain Adaptation setting of Transfer Learning. Our experiments show that, without any labeled target-domain examples during training, our system, based on Target Preserving Domain Adversarial Neural Networks, reaches an F1 score of up to 0.778 on the test dataset.


Annotated Corpus of Scientific Conference’s Homepages for Information Extraction
Piotr Andruszkiewicz | Rafał Hazan
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)


Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity.
Barbara Rychalska | Katarzyna Pakulska | Krystyna Chodorowska | Wojciech Walczak | Piotr Andruszkiewicz
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)