João Campos


2024

Imaginary Numbers! Evaluating Numerical Referring Expressions by Neural End-to-End Surface Realization Systems
Rossana Cunha | Osuji Chinonso | João Campos | Brian Timoney | Brian Davis | Fabio Cozman | Adriana Pagano | Thiago Castro Ferreira
Proceedings of the Fifth Workshop on Insights from Negative Results in NLP

Neural end-to-end surface realizers output more fluent texts than classical architectures. However, they tend to suffer from adequacy problems, in particular hallucinations in numerical referring expression generation. This poses a problem for language generation in sensitive domains, such as robot journalism covering COVID-19 and Amazon deforestation. We propose an approach whereby numerical referring expressions are converted from digits to plain word-form descriptions before being fed to state-of-the-art Large Language Models. We conduct automatic and human evaluations to report the best strategy for numerical surface realization. Code and data are publicly available.
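
The digit-to-word conversion mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' released code: the function names `int_to_words` and `verbalize_numbers` are hypothetical, the sketch assumes English output and non-negative integers only, and it deliberately leaves decimals and separator-formatted numbers untouched.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
# Largest scale first, so the first match is the leading group.
SCALES = [(1_000_000_000, "billion"), (1_000_000, "million"), (1_000, "thousand")]


def int_to_words(n: int) -> str:
    """Spell out a non-negative integer in English (hypothetical helper)."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        head = ONES[hundreds] + " hundred"
        return head + (" " + int_to_words(rest) if rest else "")
    for value, name in SCALES:
        if n >= value:
            count, rest = divmod(n, value)
            head = int_to_words(count) + " " + name
            return head + (" " + int_to_words(rest) if rest else "")


# Match standalone integer digit runs only; numbers such as "3.5" or "1,234"
# are skipped because this sketch does not model decimals or separators.
PLAIN_INT = re.compile(r"(?<![\d.,])\d+(?![.,]?\d)")


def verbalize_numbers(text: str) -> str:
    """Replace plain integers in `text` with their word form before the
    text is passed on to a language model."""
    return PLAIN_INT.sub(lambda m: int_to_words(int(m.group())), text)


# Made-up example sentence, for illustration only.
print(verbalize_numbers("There were 42 alerts in 3 regions."))
```

A real system would also need language-specific rules (for example, Brazilian Portuguese number agreement) and handling for decimals, percentages and units, none of which are covered here.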

2022

BLAB Reporter: Automated journalism covering the Blue Amazon
Yan Sym | João Campos | Fabio Cozman
Proceedings of the 15th International Conference on Natural Language Generation: System Demonstrations

This demo paper introduces BLAB Reporter, a robot-journalist system covering the Brazilian Blue Amazon. The application is based on a pipeline architecture for Natural Language Generation and offers daily reports, news summaries and curious facts in Brazilian Portuguese. By collecting, storing and analysing structured data from publicly available sources, the robot-journalist uses domain knowledge to generate, validate and publish texts on Twitter. Code and corpus are publicly available.

2020

DaMata: A Robot-Journalist Covering the Brazilian Amazon Deforestation
André Luiz Rosa Teixeira | João Campos | Rossana Cunha | Thiago Castro Ferreira | Adriana Pagano | Fabio Cozman
Proceedings of the 13th International Conference on Natural Language Generation

This demo paper introduces DaMata, a robot-journalist covering deforestation in the Brazilian Amazon. The robot-journalist is based on a pipeline architecture for Natural Language Generation, which yields multilingual daily and monthly reports based on the public data provided by DETER, a real-time deforestation satellite monitor developed and maintained by the Brazilian National Institute for Space Research (INPE). DaMata automatically generates reports in Brazilian Portuguese and English and publishes them on Twitter. Corpus and code are publicly available.