Sebastian Ebert


2021

pdf bib
We Need To Talk About Random Splits
Anders Søgaard | Sebastian Ebert | Jasmijn Bastings | Katja Filippova
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

(CITATION) argued for using random splits rather than standard splits in NLP experiments. We argue that random splits, like standard splits, lead to overly optimistic performance estimates. We can also split data in biased or adversarial ways, e.g., training on short sentences and evaluating on long ones. Biased sampling has been used in domain adaptation to simulate real-world drift; this is known as the covariate shift assumption. In NLP, however, even worst-case splits, maximizing bias, often under-estimate the error observed on new samples of in-domain data, i.e., the data that models should minimally generalize to at test time. This invalidates the covariate shift assumption. Instead of using multiple random splits, future benchmarks should ideally include multiple, independent test sets instead; if infeasible, we argue that multiple biased splits leads to more realistic performance estimates than multiple random splits.

pdf bib
Don’t Search for a Search Method — Simple Heuristics Suffice for Adversarial Text Attacks
Nathaniel Berger | Stefan Riezler | Sebastian Ebert | Artem Sokolov
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recently more attention has been given to adversarial attacks on neural networks for natural language processing (NLP). A central research topic has been the investigation of search algorithms and search constraints, accompanied by benchmark algorithms and tasks. We implement an algorithm inspired by zeroth order optimization-based attacks and compare with the benchmark results in the TextAttack framework. Surprisingly, we find that optimization-based methods do not yield any improvement in a constrained setup and slightly benefit from approximate gradient information only in unconstrained setups where search spaces are larger. In contrast, simple heuristics exploiting nearest neighbors without querying the target function yield substantial success rates in constrained setups, and nearly full success rate in unconstrained setups, at an order of magnitude fewer queries. We conclude from these results that current TextAttack benchmark tasks are too easy and constraints are too strict, preventing meaningful research on black-box adversarial text attacks.

2016

pdf bib
LAMB: A Good Shepherd of Morphologically Rich Languages
Sebastian Ebert | Thomas Müller | Hinrich Schütze
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Ultradense Word Embeddings by Orthogonal Transformation
Sascha Rothe | Sebastian Ebert | Hinrich Schütze
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Attention-Based Convolutional Neural Network for Machine Comprehension
Wenpeng Yin | Sebastian Ebert | Hinrich Schütze
Proceedings of the Workshop on Human-Computer Question Answering

2015

pdf bib
A Linguistically Informed Convolutional Neural Network
Sebastian Ebert | Ngoc Thang Vu | Hinrich Schütze
Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

pdf bib
CIS-positive: A Combination of Convolutional Neural Networks and Support Vector Machines for Sentiment Analysis in Twitter
Sebastian Ebert | Ngoc Thang Vu | Hinrich Schütze
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

pdf bib
Fine-Grained Contextual Predictions for Hard Sentiment Words
Sebastian Ebert | Hinrich Schütze
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)