Thiago Castro Ferreira

Also published as: Thiago Castro Ferreira, Thiago Castro-Ferreira, Thiago Ferreira


2022

Anotação de textos não canônicos: um estudo exploratório de Grande sertão: veredas pelas dependências universais
Andre V. L. Coneglian | Ana Luisa A. R. Guimarães | Thiago Castro Ferreira | Adriana S. Pagano
Proceedings of the Universal Dependencies Brazilian Festival

2021

Enriching the E2E dataset
Thiago Castro Ferreira | Helena Vaz | Brian Davis | Adriana Pagano
Proceedings of the 14th International Conference on Natural Language Generation

This study introduces an enriched version of the E2E dataset, one of the most popular language resources for data-to-text NLG. We extract intermediate representations for popular pipeline tasks such as discourse ordering, text structuring, lexicalization and referring expression generation, enabling researchers to rapidly develop and evaluate their data-to-text pipeline systems. The intermediate representations are extracted by aligning non-linguistic and text representations through a process called delexicalization, which consists in replacing input referring expressions to entities/attributes with placeholders. The enriched dataset is publicly available.
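
The delexicalization step described in this abstract can be illustrated with a minimal sketch. It assumes each referring expression is already known as an exact surface string; the `delexicalize` helper, the `__TAG__` placeholder format, and the example sentence are hypothetical and are not taken from the released dataset or its code.

```python
def delexicalize(text, mentions):
    """Replace each referring expression in `text` with a placeholder.

    `mentions` maps a placeholder name (hypothetical, e.g. "NAME") to the
    surface string that realises that entity/attribute in the text.
    """
    # Replace longer mentions first so a short mention never overwrites
    # part of a longer one.
    for tag, mention in sorted(mentions.items(), key=lambda kv: -len(kv[1])):
        text = text.replace(mention, f"__{tag}__")
    return text


# Illustrative E2E-style sentence (made up for this sketch).
template = delexicalize(
    "The Eagle is a coffee shop near Burger King.",
    {"NAME": "The Eagle", "EATTYPE": "coffee shop", "NEAR": "Burger King"},
)
print(template)
```

The resulting template aligns the text with the non-linguistic input, which is what allows intermediate representations to be read off for each pipeline task.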

Another PASS: A Reproduction Study of the Human Evaluation of a Football Report Generation System
Simon Mille | Thiago Castro Ferreira | Anya Belz | Brian Davis
Proceedings of the 14th International Conference on Natural Language Generation

This paper reports results from a reproduction study in which we repeated the human evaluation of the PASS Dutch-language football report generation system (van der Lee et al., 2017). The work was carried out as part of the ReproGen Shared Task on Reproducibility of Human Evaluations in NLG, in Track A (Paper 1). We aimed to repeat the original study exactly, with the main difference that a different set of evaluators was used. We describe the study design, present the results from the original and the reproduction study, and then compare and analyse the differences between the two sets of results. For the two ‘headline’ results of average Fluency and Clarity, we find that in both studies, the system was rated more highly for Clarity than for Fluency, and Clarity had higher standard deviation. Clarity and Fluency ratings were higher, and their standard deviations lower, in the reproduction study than in the original study by substantial margins. Clarity had a higher degree of reproducibility than Fluency, as measured by the coefficient of variation. Data and code are publicly available.
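
As a rough illustration of the reproducibility measure mentioned in this abstract, the sketch below computes a plain (unadjusted) coefficient of variation over repeated measurements of the same quantity. It is an assumption that this basic form suffices for illustration; the paper may use a corrected variant, and the input scores are invented, not results from either study.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Return the sample standard deviation as a percentage of the mean.

    Lower values mean the repeated measurements agree more closely,
    i.e. a higher degree of reproducibility.
    """
    m = mean(scores)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100 * stdev(scores) / abs(m)


# Hypothetical mean ratings from an original and a reproduction study.
cv = coefficient_of_variation([4.0, 6.0])
print(round(cv, 2))
```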

Comparing Supervised Machine Learning Techniques for Genre Analysis in Software Engineering Research Articles
Felipe Araújo de Britto | Thiago Castro Ferreira | Leonardo Pereira Nunes | Fernando Silva Parreiras
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Written communication is of utmost importance to the progress of scientific research. The speed of such development, however, may be affected by the scarcity of reviewers to referee the quality of research articles. In this context, automatic approaches that are able to query linguistic segments in written contributions by detecting the presence or absence of common rhetorical patterns have become a necessity. This paper compares supervised machine learning techniques for genre analysis in Introduction sections of software engineering articles. A semi-supervised approach was carried out to augment the number of annotated sentences in SciSents (Available at: ANONYMOUS). Two supervised approaches using SVM and logistic regression were undertaken to assess the F-score for genre analysis in the corpus. A technique based on logistic regression and BERT was found to perform genre analysis highly satisfactorily, with an average F-score of 88.25 when retrieving patterns at an overall level.

Evaluating Recognizing Question Entailment Methods for a Portuguese Community Question-Answering System about Diabetes Mellitus
Thiago Castro Ferreira | João Victor de Pinho Costa | Isabela Rigotto | Vitoria Portella | Gabriel Frota | Ana Luisa A. R. Guimarães | Adalberto Penna | Isabela Lee | Tayane A. Soares | Sophia Rolim | Rossana Cunha | Celso França | Ariel Santos | Rivaney F. Oliveira | Abisague Langbehn | Daniel Hasan Dalip | Marcos André Gonçalves | Rodrigo Bastos Fóscolo | Adriana Pagano
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

This study describes the development of a Portuguese Community Question Answering benchmark in the domain of Diabetes Mellitus using a Recognizing Question Entailment (RQE) approach. Given a premise question, RQE aims to retrieve semantically similar, already answered, archived questions. We build a new Portuguese benchmark corpus with 785 pairs between premise questions and archived answered questions marked with relevance judgments by medical experts. Based on the benchmark corpus, we leveraged and evaluated several RQE approaches ranging from traditional information retrieval methods to novel large pre-trained language models and ensemble techniques using learn-to-rank approaches. Our experimental results show that a supervised transformer-based method trained with multiple languages and for multiple tasks (MUSE) outperforms the alternatives. Our results also show that ensembles of methods (stacking) as well as a traditional (light) information retrieval method (BM25) can produce competitive results. Finally, among the tested strategies, those that exploit only the question (not the answer) provide the best effectiveness-efficiency trade-off. Code is publicly available.
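
To make the BM25 baseline mentioned in this abstract concrete, here is a minimal sketch of Okapi BM25 scoring that ranks archived questions against a premise question using question text only. The whitespace tokenisation, the parameter values `k1=1.5` and `b=0.75`, and the English example questions are assumptions for illustration, not the benchmark data or the authors' implementation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in `docs` against `query` with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical archived questions (the benchmark itself is in Portuguese).
archived = [
    "what foods should a person with diabetes avoid",
    "how often should insulin be injected",
    "is exercise safe during pregnancy",
]
scores = bm25_scores("which foods to avoid with diabetes", archived)
best = archived[scores.index(max(scores))]
print(best)
```

Because the score needs only term counts, a method of this kind is cheap to run, which fits the effectiveness-efficiency trade-off the abstract reports for question-only strategies.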

2020

Evaluation rules! On the use of grammars and rule-based systems for NLG evaluation
Emiel van Miltenburg | Chris van der Lee | Thiago Castro-Ferreira | Emiel Krahmer
Proceedings of the 1st Workshop on Evaluating NLG Evaluation

NLG researchers often use uncontrolled corpora to train and evaluate their systems, using textual similarity metrics, such as BLEU. This position paper argues in favour of two alternative evaluation strategies, using grammars or rule-based systems. These strategies are particularly useful to identify the strengths and weaknesses of different systems. We contrast our proposals with the (extended) WebNLG dataset, which is revealed to have a skewed distribution of predicates. We predict that this distribution affects the quality of the predictions for systems trained on this data. However, this hypothesis can only be thoroughly tested (without any confounds) once we are able to systematically manipulate the skewness of the data, using a rule-based approach.

Referring to what you know and do not know: Making Referring Expression Generation Models Generalize To Unseen Entities
Rossana Cunha | Thiago Castro Ferreira | Adriana Pagano | Fabio Alves
Proceedings of the 28th International Conference on Computational Linguistics

Data-to-text Natural Language Generation (NLG) is the computational process of generating natural language in the form of text or voice from non-linguistic data. A core micro-planning task within NLG is referring expression generation (REG), which aims to automatically generate noun phrases to refer to entities mentioned as discourse unfolds. A limitation of novel REG models is not being able to generate referring expressions to entities not encountered during the training process. To solve this problem, we propose two extensions to NeuralREG, a state-of-the-art encoder-decoder REG model. The first is a copy mechanism, whereas the second consists of representing the gender and type of the referent as inputs to the model. Drawing on the results of automatic and human evaluation as well as an ablation study using the WebNLG corpus, we contend that our proposal contributes to the generation of more meaningful referring expressions to unseen entities than the original system and related work. Code and all produced data are publicly available.

Building The First English-Brazilian Portuguese Corpus for Automatic Post-Editing
Felipe Almeida Costa | Thiago Castro Ferreira | Adriana Pagano | Wagner Meira
Proceedings of the 28th International Conference on Computational Linguistics

This paper introduces the first corpus for Automatic Post-Editing of English and a low-resource language, Brazilian Portuguese. The source English texts were extracted from the WebNLG corpus and automatically translated into Portuguese using a state-of-the-art industrial neural machine translator. Post-edits were then obtained in an experiment with native speakers of Brazilian Portuguese. To assess the quality of the corpus, we performed error analysis and computed complexity indicators measuring how difficult the APE task would be. We report preliminary results of Phrase-Based and Neural Machine Translation Models on this new corpus. Data and code are publicly available in our repository.

Proceedings of the Third Workshop on Multilingual Surface Realisation
Anya Belz | Bernd Bohnet | Thiago Castro Ferreira | Yvette Graham | Simon Mille | Leo Wanner
Proceedings of the Third Workshop on Multilingual Surface Realisation

The Third Multilingual Surface Realisation Shared Task (SR’20): Overview and Evaluation Results
Simon Mille | Anya Belz | Bernd Bohnet | Thiago Castro Ferreira | Yvette Graham | Leo Wanner
Proceedings of the Third Workshop on Multilingual Surface Realisation

This paper presents results from the Third Shared Task on Multilingual Surface Realisation (SR’20) which was organised as part of the COLING’20 Workshop on Multilingual Surface Realisation. As in SR’18 and SR’19, the shared task comprised two tracks: (1) a Shallow Track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (2) a Deep Track where additionally, functional words and morphological information were removed. Moreover, each track had two subtracks: (a) restricted-resource, where only the data provided or approved as part of a track could be used for training models, and (b) open-resource, where any data could be used. The Shallow Track was offered in 11 languages, whereas the Deep Track was offered in 3. Systems were evaluated using both automatic metrics and direct assessment by human evaluators in terms of Readability and Meaning Similarity to reference outputs. We present the evaluation results, along with descriptions of the SR’19 tracks, data and evaluation methods, as well as brief summaries of the participating systems. For full descriptions of the participating systems, please see the separate system reports elsewhere in this volume.

DaMata: A Robot-Journalist Covering the Brazilian Amazon Deforestation
André Luiz Rosa Teixeira | João Campos | Rossana Cunha | Thiago Castro Ferreira | Adriana Pagano | Fabio Cozman
Proceedings of the 13th International Conference on Natural Language Generation

This demo paper introduces DaMata, a robot-journalist covering deforestation in the Brazilian Amazon. The robot-journalist is based on a pipeline architecture of Natural Language Generation, which yields multilingual daily and monthly reports based on the public data provided by DETER, a real-time deforestation satellite monitor developed and maintained by the Brazilian National Institute for Space Research (INPE). DaMata automatically generates reports in Brazilian Portuguese and English and publishes them on the Twitter platform. Corpus and code are publicly available.

Proceedings of the 3rd International Workshop on Natural Language Generation from the Semantic Web (WebNLG+)
Thiago Castro Ferreira | Claire Gardent | Nikolai Ilinykh | Chris van der Lee | Simon Mille | Diego Moussallem | Anastasia Shimorina
Proceedings of the 3rd International Workshop on Natural Language Generation from the Semantic Web (WebNLG+)

A General Benchmarking Framework for Text Generation
Diego Moussallem | Paramjot Kaur | Thiago Ferreira | Chris van der Lee | Anastasia Shimorina | Felix Conrads | Michael Röder | René Speck | Claire Gardent | Simon Mille | Nikolai Ilinykh | Axel-Cyrille Ngonga Ngomo
Proceedings of the 3rd International Workshop on Natural Language Generation from the Semantic Web (WebNLG+)

The RDF-to-text task has recently gained substantial attention due to the continuous growth of RDF knowledge graphs in number and size. Recent studies have focused on systematically comparing RDF-to-text approaches on benchmarking datasets such as WebNLG. Although some evaluation tools have already been proposed for text generation, none of the existing solutions abides by the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles and involves RDF data for the knowledge extraction task. In this paper, we present BENG, a FAIR benchmarking platform for Natural Language Generation (NLG) and Knowledge Extraction systems with a focus on RDF data. BENG builds upon the successful benchmarking platform GERBIL, is open source, and is publicly available along with the data it contains.

The 2020 Bilingual, Bi-Directional WebNLG+ Shared Task: Overview and Evaluation Results (WebNLG+ 2020)
Thiago Castro Ferreira | Claire Gardent | Nikolai Ilinykh | Chris van der Lee | Simon Mille | Diego Moussallem | Anastasia Shimorina
Proceedings of the 3rd International Workshop on Natural Language Generation from the Semantic Web (WebNLG+)

WebNLG+ offers two challenges: (i) mapping sets of RDF triples to English or Russian text (generation) and (ii) converting English or Russian text to sets of RDF triples (semantic parsing). Compared to the eponymous WebNLG challenge, WebNLG+ provides an extended dataset that enables the training, evaluation, and comparison of microplanners and semantic parsers. In this paper, we present the results of the generation and semantic parsing tasks for both English and Russian and provide a brief description of the participating systems.

2019

Neural data-to-text generation: A comparison between pipeline and end-to-end architectures
Thiago Castro Ferreira | Chris van der Lee | Emiel van Miltenburg | Emiel Krahmer
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Traditionally, most data-to-text applications have been designed using a modular pipeline architecture, in which non-linguistic input data is converted into natural language through several intermediate transformations. By contrast, recent neural models for data-to-text generation have been proposed as end-to-end approaches, where the non-linguistic input is rendered in natural language with far fewer explicit intermediate representations in between. This study introduces a systematic comparison between neural pipeline and end-to-end data-to-text approaches for the generation of text from RDF triples. Both architectures were implemented making use of the encoder-decoder Gated-Recurrent Units (GRU) and Transformer, two state-of-the-art deep learning methods. Automatic and human evaluations together with a qualitative analysis suggest that having explicit intermediate steps in the generation process results in better texts than the ones generated by end-to-end approaches. Moreover, the pipeline models generalize better to unseen inputs. Data and code are publicly available.

Surface Realization Shared Task 2019 (MSR19): The Team 6 Approach
Thiago Castro Ferreira | Emiel Krahmer
Proceedings of the 2nd Workshop on Multilingual Surface Realisation (MSR 2019)

This study describes the approach developed by the Tilburg University team to the shallow track of the Multilingual Surface Realization Shared Task 2019 (SR’19) (Mille et al., 2019). Based on Ferreira et al. (2017) and on our 2018 submission (Ferreira et al., 2018), the approach generates texts by first preprocessing an input dependency tree into an ordered linearized string, which is then realized using a rule-based and a statistical machine translation (SMT) model. This year our submission is able to realize texts in the 11 languages proposed for the task, unlike last year’s submission, which covered only 6 Indo-European languages. The model is publicly available.

Question Similarity in Community Question Answering: A Systematic Exploration of Preprocessing Methods and Models
Florian Kunneman | Thiago Castro Ferreira | Emiel Krahmer | Antal van den Bosch
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Community Question Answering forums are popular among Internet users, and a basic problem they encounter is trying to find out if their question has already been posed before. To address this issue, NLP researchers have developed methods to automatically detect question-similarity, which was one of the shared tasks in SemEval. The best performing systems for this task made use of Syntactic Tree Kernels or the SoftCosine metric. However, it remains unclear why these methods seem to work, whether their performance can be improved by better preprocessing methods and what kinds of errors they (and other methods) make. In this paper, we therefore systematically combine and compare these two approaches with the more traditional BM25 and translation-based models. Moreover, we analyze the impact of preprocessing steps (lowercasing, suppression of punctuation and stop words removal) and word meaning similarity based on different distributions (word translation probability, Word2Vec, fastText and ELMo) on the performance of the task. We conduct an error analysis to gain insight into the differences in performance between the system set-ups. The implementation is made publicly available from https://github.com/fkunneman/DiscoSumo/tree/master/ranlp.

2018

Surface Realization Shared Task 2018 (SR18): The Tilburg University Approach
Thiago Castro Ferreira | Sander Wubben | Emiel Krahmer
Proceedings of the First Workshop on Multilingual Surface Realisation

This study describes the approach developed by the Tilburg University team to the shallow task of the Multilingual Surface Realization Shared Task 2018 (SR18). Based on (Castro Ferreira et al., 2017), the approach works by first preprocessing an input dependency tree into an ordered linearized string, which is then realized using a statistical machine translation model. Our approach shows promising results, with BLEU scores above 50 for 5 different languages (English, French, Italian, Portuguese and Spanish) and above 35 for the Dutch language.

Enriching the WebNLG corpus
Thiago Castro Ferreira | Diego Moussallem | Emiel Krahmer | Sander Wubben
Proceedings of the 11th International Conference on Natural Language Generation

This paper describes the enrichment of the WebNLG corpus (Gardent et al., 2017a,b), with the aim of further extending its usefulness as a resource for evaluating common NLG tasks, including Discourse Ordering, Lexicalization and Referring Expression Generation. We also produce a silver-standard German translation of the corpus to enable the exploitation of NLG approaches to languages other than English. The enriched corpus is publicly available.

RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Diego Moussallem | Thiago Ferreira | Marcos Zampieri | Maria Claudia Cavalcanti | Geraldo Xexéo | Mariana Neves | Axel-Cyrille Ngonga Ngomo
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

NeuralREG: An end-to-end approach to referring expression generation
Thiago Castro Ferreira | Diego Moussallem | Ákos Kádár | Sander Wubben | Emiel Krahmer
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Traditionally, Referring Expression Generation (REG) models first decide on the form and then on the content of references to discourse entities in text, typically relying on features such as salience and grammatical function. In this paper, we present a new approach (NeuralREG), relying on deep neural networks, which makes decisions about form and content in one go without explicit feature extraction. Using a delexicalized version of the WebNLG corpus, we show that the neural model substantially improves over two strong baselines.

2017

Generating flexible proper name references in text: Data, models and evaluation
Thiago Castro Ferreira | Emiel Krahmer | Sander Wubben
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

This study introduces a statistical model able to generate variations of a proper name by taking into account the person to be mentioned, the discourse context and variation. The model relies on the REGnames corpus, a dataset with 53,102 proper name references to 1,000 people in different discourse contexts. We evaluate the versions of our model from the perspective of how human writers produce proper names, and also how human readers process them. The corpus and the model are publicly available.

Linguistic realisation as machine translation: Comparing different MT models for AMR-to-text generation
Thiago Castro Ferreira | Iacer Calixto | Sander Wubben | Emiel Krahmer
Proceedings of the 10th International Conference on Natural Language Generation

In this paper, we study AMR-to-text generation, framing it as a translation task and comparing two different MT approaches (Phrase-based and Neural MT). We systematically study the effects of 3 AMR preprocessing steps (Delexicalisation, Compression, and Linearisation) applied before the MT phase. Our results show that preprocessing indeed helps, although the benefits differ for the two MT models.

Improving the generation of personalised descriptions
Thiago Castro Ferreira | Ivandré Paraboni
Proceedings of the 10th International Conference on Natural Language Generation

Referring expression generation (REG) models that use speaker-dependent information require a considerable amount of training data produced by every individual speaker, or may otherwise perform poorly. In this work we propose a simple personalised method for this task, in which speakers are grouped into profiles according to their referential behaviour. Intrinsic evaluation shows that the use of speaker profiles generally outperforms the personalised method found in previous work.

2016

Task demands and individual variation in referring expressions
Adriana Baltaretu | Thiago Castro Ferreira
Proceedings of the 9th International Natural Language Generation conference

Towards proper name generation: a corpus analysis
Thiago Castro Ferreira | Sander Wubben | Emiel Krahmer
Proceedings of the 9th International Natural Language Generation conference

Individual Variation in the Choice of Referential Form
Thiago Castro Ferreira | Emiel Krahmer | Sander Wubben
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Towards more variation in text generation: Developing and evaluating variation models for choice of referential form
Thiago Castro Ferreira | Emiel Krahmer | Sander Wubben
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

Zoom: a corpus of natural language descriptions of map locations
Romina Altamirano | Thiago Ferreira | Ivandré Paraboni | Luciana Benotti
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)