Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop

Ionut-Teodor Sorodoc, Madhumita Sushil, Ece Takmaz, Eneko Agirre (Editors)

Anthology ID:
Association for Computational Linguistics
Bib Export formats:

pdf bib
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Ionut-Teodor Sorodoc | Madhumita Sushil | Ece Takmaz | Eneko Agirre

pdf bib
Computationally Efficient Wasserstein Loss for Structured Labels
Ayato Toyokuni | Sho Yokoi | Hisashi Kashima | Makoto Yamada

The problem of estimating the probability distribution of labels has been widely studied as a label distribution learning (LDL) problem, whose applications include age estimation, emotion analysis, and semantic segmentation. We propose a tree-Wasserstein distance regularized LDL algorithm, focusing on hierarchical text classification tasks. We propose predicting the entire label hierarchy using neural networks, where the similarity between predicted and true labels is measured using the tree-Wasserstein distance. Through experiments using synthetic and real-world datasets, we demonstrate that the proposed method successfully considers the structure of labels during training, and it compares favorably with the Sinkhorn algorithm in terms of computation time and memory usage.

pdf bib
Have Attention Heads in BERT Learned Constituency Grammar?
Ziyang Luo

With the success of pre-trained language models in recent years, more and more researchers focus on opening the “black box” of these models. Following this interest, we carry out a qualitative and quantitative analysis of constituency grammar in attention heads of BERT and RoBERTa. We employ the syntactic distance method to extract implicit constituency grammar from the attention weights of each head. Our results show that there exist heads that can induce some grammar types much better than baselines, suggesting that some heads act as a proxy for constituency grammar. We also analyze how attention heads’ constituency grammar inducing (CGI) ability changes after fine-tuning with two kinds of tasks, including sentence meaning similarity (SMS) tasks and natural language inference (NLI) tasks. Our results suggest that SMS tasks decrease the average CGI ability of upper layers, while NLI tasks increase it. Lastly, we investigate the connections between CGI ability and natural language understanding ability on QQP and MNLI tasks.

pdf bib
Do we read what we hear? Modeling orthographic influences on spoken word recognition
Nicole Macher | Badr M. Abdullah | Harm Brouwer | Dietrich Klakow

Theories and models of spoken word recognition aim to explain the process of accessing lexical knowledge given an acoustic realization of a word form. There is consensus that phonological and semantic information is crucial for this process. However, there is accumulating evidence that orthographic information could also have an impact on auditory word recognition. This paper presents two models of spoken word recognition that instantiate different hypotheses regarding the influence of orthography on this process. We show that these models reproduce human-like behavior in different ways and provide testable hypotheses for future research on the source of orthographic effects in spoken word recognition.

pdf bib
PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation
Dimitris Papadopoulos | Nikolaos Papadakis | Nikolaos Matsatsinis

In this work, we present a methodology that aims at bridging the gap between high and low-resource languages in the context of Open Information Extraction, showcasing it on the Greek language. The goals of this paper are twofold: First, we build Neural Machine Translation (NMT) models for English-to-Greek and Greek-to-English based on the Transformer architecture. Second, we leverage these NMT models to produce English translations of Greek text as input for our NLP pipeline, to which we apply a series of pre-processing and triple extraction tasks. Finally, we back-translate the extracted triples to Greek. We conduct an evaluation of both our NMT and OIE methods on benchmark datasets and demonstrate that our approach outperforms the current state-of-the-art for the Greek natural language.

pdf bib
A Computational Analysis of Vagueness in Revisions of Instructional Texts
Alok Debnath | Michael Roth

WikiHow is an open-domain repository of instructional articles for a variety of tasks, which can be revised by users. In this paper, we extract pairwise versions of an instruction before and after a revision was made. Starting from a noisy dataset of revision histories, we specifically extract and analyze edits that involve cases of vagueness in instructions. We further investigate the ability of a neural model to distinguish between two versions of an instruction in our data by adopting a pairwise ranking task from previous work and showing improvements over existing baselines.

pdf bib
A reproduction of Apple’s bi-directional LSTM models for language identification in short strings
Mads Toftrup | Søren Asger Sørensen | Manuel R. Ciosici | Ira Assent

Language Identification is the task of identifying a document’s language. For applications like automatic spell checker selection, language identification must use very short strings such as text message fragments. In this work, we reproduce a language identification architecture that Apple briefly sketched in a blog post. We confirm the bi-LSTM model’s performance and find that it outperforms current open-source language identifiers. We further find that its language identification mistakes are due to confusion between related languages.

pdf bib
Automatically Cataloging Scholarly Articles using Library of Congress Subject Headings
Nazmul Kazi | Nathaniel Lane | Indika Kahanda

Institutes are required to catalog their articles with proper subject headings so that the users can easily retrieve relevant articles from the institutional repositories. However, due to the rate of proliferation of the number of articles in these repositories, it is becoming a challenge to manually catalog the newly added articles at the same pace. To address this challenge, we explore the feasibility of automatically annotating articles with Library of Congress Subject Headings (LCSH). We first use web scraping to extract keywords for a collection of articles from the Repository Analytics and Metrics Portal (RAMP). Then, we map these keywords to LCSH names for developing a gold-standard dataset. As a case study, using the subset of Biology-related LCSH concepts, we develop predictive models by formulating this task as a multi-label classification problem. Our experimental results demonstrate the viability of this approach for predicting LCSH for scholarly articles.

pdf bib
Model Agnostic Answer Reranking System for Adversarial Question Answering
Sagnik Majumder | Chinmoy Samant | Greg Durrett

While numerous methods have been proposed as defenses against adversarial examples in question answering (QA), these techniques are often model specific, require retraining of the model, and give only marginal improvements in performance over vanilla models. In this work, we present a simple model-agnostic approach to this problem that can be applied directly to any QA model without any retraining. Our method employs an explicit answer candidate reranking mechanism that scores candidate answers on the basis of their content overlap with the question before making the final prediction. Combined with a strong base QAmodel, our method outperforms state-of-the-art defense techniques, calling into question how well these techniques are actually doing and strong these adversarial testbeds are.

pdf bib
BERT meets Cranfield: Uncovering the Properties of Full Ranking on Fully Labeled Data
Negin Ghasemi | Djoerd Hiemstra

Recently, various information retrieval models have been proposed based on pre-trained BERT models, achieving outstanding performance. The majority of such models have been tested on data collections with partial relevance labels, where various potentially relevant documents have not been exposed to the annotators. Therefore, evaluating BERT-based rankers may lead to biased and unfair evaluation results, simply because a relevant document has not been exposed to the annotators while creating the collection. In our work, we aim to better understand a BERT-based ranker’s strengths compared to a BERT-based re-ranker and the initial ranker. To this aim, we investigate BERT-based rankers performance on the Cranfield collection, which comes with full relevance judgment on all documents in the collection. Our results demonstrate the BERT-based full ranker’s effectiveness, as opposed to the BERT-based re-ranker and BM25. Also, analysis shows that there are documents that the BERT-based full-ranker finds that were not found by the initial ranker.

pdf bib
Siamese Neural Networks for Detecting Complementary Products
Marina Angelovska | Sina Sheikholeslami | Bas Dunn | Amir H. Payberah

Recommender systems play an important role in e-commerce websites as they improve the customer journey by helping the users find what they want at the right moment. In this paper, we focus on identifying a complementary relationship between the products of an e-commerce company. We propose a content-based recommender system for detecting complementary products, using Siamese Neural Networks (SNN). To this end, we implement and compare two different models: Siamese Convolutional Neural Network (CNN) and Siamese Long Short-Term Memory (LSTM). Moreover, we propose an extension of the SNN approach to handling millions of products in a matter of seconds, and we reduce the training time complexity by half. In the experiments, we show that Siamese LSTM can predict complementary products with an accuracy of ~85% using only the product titles.

pdf bib
Contrasting distinct structured views to learn sentence embeddings
Antoine Simoulin | Benoit Crabbé

We propose a self-supervised method that builds sentence embeddings from the combination of diverse explicit syntactic structures of a sentence. We assume structure is crucial to building consistent representations as we expect sentence meaning to be a function of both syntax and semantic aspects. In this perspective, we hypothesize that some linguistic representations might be better adapted given the considered task or sentence. We, therefore, propose to learn individual representation functions for different syntactic frameworks jointly. Again, by hypothesis, all such functions should encode similar semantic information differently and consequently, be complementary for building better sentential semantic embeddings. To assess such hypothesis, we propose an original contrastive multi-view framework that induces an explicit interaction between models during the training phase. We make experiments combining various structures such as dependency, constituency, or sequential schemes. Our results outperform comparable methods on several tasks from standard sentence embedding benchmarks.

pdf bib
Discrete Reasoning Templates for Natural Language Understanding
Hadeel Al-Negheimish | Pranava Madhyastha | Alessandra Russo

Reasoning about information from multiple parts of a passage to derive an answer is an open challenge for reading-comprehension models. In this paper, we present an approach that reasons about complex questions by decomposing them to simpler subquestions that can take advantage of single-span extraction reading-comprehension models, and derives the final answer according to instructions in a predefined reasoning template. We focus on subtraction based arithmetic questions and evaluate our approach on a subset of the DROP dataset. We show that our approach is competitive with the state of the art while being interpretable and requires little supervision.

pdf bib
Multilingual Email Zoning
Bruno Jardim | Ricardo Rei | Mariana S. C. Almeida

The segmentation of emails into functional zones (also dubbed email zoning) is a relevant preprocessing step for most NLP tasks that deal with emails. However, despite the multilingual character of emails and their applications, previous literature regarding email zoning corpora and systems was developed essentially for English. In this paper, we analyse the existing email zoning corpora and propose a new multilingual benchmark composed of 625 emails in Portuguese, Spanish and French. Moreover, we introduce OKAPI, the first multilingual email segmentation model based on a language agnostic sentence encoder. Besides generalizing well for unseen languages, our model is competitive with current English benchmarks, and reached new state-of-the-art performances for domain adaptation tasks in English.

pdf bib
Familiar words but strange voices: Modelling the influence of speech variability on word recognition
Alexandra Mayn | Badr M. Abdullah | Dietrich Klakow

We present a deep neural model of spoken word recognition which is trained to retrieve the meaning of a word (in the form of a word embedding) given its spoken form, a task which resembles that faced by a human listener. Furthermore, we investigate the influence of variability in speech signals on the model’s performance. To this end, we conduct of set of controlled experiments using word-aligned read speech data in German. Our experiments show that (1) the model is more sensitive to dialectical variation than gender variation, and (2) recognition performance of word cognates from related languages reflect the degree of relatedness between languages in our study. Our work highlights the feasibility of modeling human speech perception using deep neural networks.

pdf bib
Emoji-Based Transfer Learning for Sentiment Tasks
Susann Boy | Dana Ruiter | Dietrich Klakow

Sentiment tasks such as hate speech detection and sentiment analysis, especially when performed on languages other than English, are often low-resource. In this study, we exploit the emotional information encoded in emojis to enhance the performance on a variety of sentiment tasks. This is done using a transfer learning approach, where the parameters learned by an emoji-based source task are transferred to a sentiment target task. We analyse the efficacy of the transfer under three conditions, i.e. i) the emoji content and ii) label distribution of the target task as well as iii) the difference between monolingually and multilingually learned source tasks. We find i.a. that the transfer is most beneficial if the target task is balanced with high emoji content. Monolingually learned source tasks have the benefit of taking into account the culturally specific use of emojis and gain up to F1 +0.280 over the baseline.

pdf bib
A Little Pretraining Goes a Long Way: A Case Study on Dependency Parsing Task for Low-resource Morphologically Rich Languages
Jivnesh Sandhan | Amrith Krishna | Ashim Gupta | Laxmidhar Behera | Pawan Goyal

Neural dependency parsing has achieved remarkable performance for many domains and languages. The bottleneck of massive labelled data limits the effectiveness of these approaches for low resource languages. In this work, we focus on dependency parsing for morphological rich languages (MRLs) in a low-resource setting. Although morphological information is essential for the dependency parsing task, the morphological disambiguation and lack of powerful analyzers pose challenges to get this information for MRLs. To address these challenges, we propose simple auxiliary tasks for pretraining. We perform experiments on 10 MRLs in low-resource settings to measure the efficacy of our proposed pretraining method and observe an average absolute gain of 2 points (UAS) and 3.6 points (LAS).

pdf bib
Development of Conversational AI for Sleep Coaching Programme
Heereen Shim

Almost 30% of the adult population in the world is experiencing or has experience insomnia. Cognitive Behaviour Therapy for insomnia (CBT-I) is one of the most effective treatment, but it has limitations on accessibility and availability. Utilising technology is one of the possible solutions, but existing methods neglect conversational aspects, which plays a critical role in sleep therapy. To address this issue, we propose a PhD project exploring potentials of developing conversational artificial intelligence (AI) for a sleep coaching programme, which is motivated by CBT-I treatment. This PhD project aims to develop natural language processing (NLP) algorithms to allow the system to interact naturally with a user and provide automated analytic system to support human experts. In this paper, we introduce research questions lying under three phases of the sleep coaching programme: triaging, monitoring the progress, and providing coaching. We expect this research project’s outcomes could contribute to the research domains of NLP and AI but also the healthcare field by providing a more accessible and affordable sleep treatment solution and an automated analytic system to lessen the burden of human experts.

pdf bib
Relating Relations: Meta-Relation Extraction from Online Health Forum Posts
Daniel Stickley

Relation extraction is a key task in knowledge extraction, and is commonly defined as the task of identifying relations that hold between entities in text. This thesis proposal addresses the specific task of identifying meta-relations, a higher order family of relations naturally construed as holding between other relations which includes temporal, comparative, and causal relations. More specifically, we aim to develop theoretical underpinnings and practical solutions for the challenges of (1) incorporating meta-relations into conceptualisations and annotation schemes for (lower-order) relations and named entities, (2) obtaining annotations for them with tolerable cognitive load on annotators, (3) creating models capable of reliably extracting meta-relations, and related to that (4) addressing the limited-data problem exacerbated by the introduction of meta-relations into the learning task. We explore recent works in relation extraction and discuss our plans to formally conceptualise meta-relations for the domain of user-generated health texts, and create a new dataset, annotation scheme and models for meta-relation extraction.

pdf bib
Towards Personalised and Document-level Machine Translation of Dialogue
Sebastian Vincent

State-of-the-art (SOTA) neural machine translation (NMT) systems translate texts at sentence level, ignoring context: intra-textual information, like the previous sentence, and extra-textual information, like the gender of the speaker. As a result, some sentences are translated incorrectly. Personalised NMT (PersNMT) and document-level NMT (DocNMT) incorporate this information into the translation process. Both fields are relatively new and previous work within them is limited. Moreover, there are no readily available robust evaluation metrics for them, which makes it difficult to develop better systems, as well as track global progress and compare different methods. This thesis proposal focuses on PersNMT and DocNMT for the domain of dialogue extracted from TV subtitles in five languages: English, Brazilian Portuguese, German, French and Polish. Three main challenges are addressed: (1) incorporating extra-textual information directly into NMT systems; (2) improving the machine translation of cohesion devices; (3) reliable evaluation for PersNMT and DocNMT.

pdf bib
Semantic-aware transformation of short texts using word embeddings: An application in the Food Computing domain
Andrea Morales-Garzón | Juan Gómez-Romero | Maria J. Martin-Bautista

Most works in food computing focus on generating new recipes from scratch. However, there is a large number of new online recipes generated daily with a large number of users reviews, with recommendations to improve the recipe flavor and ideas to modify them. This fact encourages the use of these data for obtaining improved and customized versions. In this thesis, we propose an adaptation engine based on fine-tuning a word embedding model. We will capture, in an unsupervised way, the semantic meaning of the recipe ingredients. We will use their word embedding representations to align them to external databases, thus enriching their data. The adaptation engine will use this food data to modify a recipe into another fitting specific user preferences (e.g., decrease caloric intake or make a recipe). We plan to explore different types of recipe adaptations while preserving recipe essential features such as cuisine style and essence simultaneously. We will also modify the rest of the recipe to the new changes to be reproducible.

pdf bib
TMR: Evaluating NER Recall on Tough Mentions
Jingxuan Tu | Constantine Lignos

We propose the Tough Mentions Recall (TMR) metrics to supplement traditional named entity recognition (NER) evaluation by examining recall on specific subsets of ”tough” mentions: unseen mentions, those whose tokens or token/type combination were not observed in training, and type-confusable mentions, token sequences with multiple entity types in the test data. We demonstrate the usefulness of these metrics by evaluating corpora of English, Spanish, and Dutch using five recent neural architectures. We identify subtle differences between the performance of BERT and Flair on two English NER corpora and identify a weak spot in the performance of current models in Spanish. We conclude that the TMR metrics enable differentiation between otherwise similar-scoring systems and identification of patterns in performance that would go unnoticed from overall precision, recall, and F1.

pdf bib
The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation
Jonne Saleva | Constantine Lignos

This paper evaluates the performance of several modern subword segmentation methods in a low-resource neural machine translation setting. We compare segmentations produced by applying BPE at the token or sentence level with morphologically-based segmentations from LMVR and MORSEL. We evaluate translation tasks between English and each of Nepali, Sinhala, and Kazakh, and predict that using morphologically-based segmentation methods would lead to better performance in this setting. However, comparing to BPE, we find that no consistent and reliable differences emerge between the segmentation methods. While morphologically-based methods outperform BPE in a few cases, what performs best tends to vary across tasks, and the performance of segmentation methods is often statistically indistinguishable.

pdf bib
Making Use of Latent Space in Language GANs for Generating Diverse Text without Pre-training
Takeshi Kojima | Yusuke Iwasawa | Yutaka Matsuo

Generating diverse texts is an important factor for unsupervised text generation. One approach is to produce the diversity of texts conditioned by the sampled latent code. Although several generative adversarial networks (GANs) have been proposed thus far, these models still suffer from mode-collapsing if the models are not pre-trained. In this paper, we propose a GAN model that aims to improve the approach to generating diverse texts conditioned by the latent space. The generator of our model uses Gumbel-Softmax distribution for the word sampling process. To ensure that the text is generated conditioned upon the sampled latent code, reconstruction loss is introduced in our objective function. The discriminator of our model iteratively inspects incomplete partial texts and learns to distinguish whether they are real or fake by using the standard GAN objective function. Experimental results using the COCO Image Captions dataset show that, although our model is not pre-trained, the performance of our model is quite competitive with the existing baseline models, which requires pre-training.

pdf bib
Beyond the English Web: Zero-Shot Cross-Lingual and Lightweight Monolingual Classification of Registers
Liina Repo | Valtteri Skantsi | Samuel Rönnqvist | Saara Hellström | Miika Oinonen | Anna Salmela | Douglas Biber | Jesse Egbert | Sampo Pyysalo | Veronika Laippala

We explore cross-lingual transfer of register classification for web documents. Registers, that is, text varieties such as blogs or news are one of the primary predictors of linguistic variation and thus affect the automatic processing of language. We introduce two new register-annotated corpora, FreCORE and SweCORE, for French and Swedish. We demonstrate that deep pre-trained language models perform strongly in these languages and outperform previous state-of-the-art in English and Finnish. Specifically, we show 1) that zero-shot cross-lingual transfer from the large English CORE corpus can match or surpass previously published monolingual models, and 2) that lightweight monolingual classification requiring very little training data can reach or surpass our zero-shot performance. We further analyse classification results finding that certain registers continue to pose challenges in particular for cross-lingual transfer.

pdf bib
Explaining and Improving BERT Performance on Lexical Semantic Change Detection
Severin Laicher | Sinan Kurtyigit | Dominik Schlechtweg | Jonas Kuhn | Sabine Schulte im Walde

Type- and token-based embedding architectures are still competing in lexical semantic change detection. The recent success of type-based models in SemEval-2020 Task 1 has raised the question why the success of token-based models on a variety of other NLP tasks does not translate to our field. We investigate the influence of a range of variables on clusterings of BERT vectors and show that its low performance is largely due to orthographic information on the target word, which is encoded even in the higher layers of BERT representations. By reducing the influence of orthography we considerably improve BERT’s performance.

pdf bib
Why Find the Right One?
Payal Khullar

The present paper investigates the impact of the anaphoric one words in English on the Neural Machine Translation (NMT) process using English-Hindi as source and target language pair. As expected, the experimental results show that the state-of-the-art Google English-Hindi NMT system achieves significantly poorly on sentences containing anaphoric ones as compared to the sentences containing regular, non-anaphoric ones. But, more importantly, we note that amongst the anaphoric words, the noun class is clearly much harder for NMT than the determinatives. This reaffirms the linguistic disparity of the two phenomenon in recent theoretical syntactic literature, despite the obvious surface similarities.