Rui Pereira


Compiling and Exploring a Portuguese Parliamentary Corpus: ParlaMint-PT
José Aires | Aida Cardoso | Rui Pereira | Amalia Mendes
Proceedings of the IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora (ParlaCLARIN) @ LREC-COLING 2024

As part of the project ParlaMint II, a new corpus of the sessions of the Portuguese Parliament from 2015 to 2022 has been compiled, encoded and annotated following the ParlaMint guidelines. We report on the contents of the corpus and on the specific nature of the political settings in Portugal during the time period covered. Two subcorpora were designed to enable comparisons of political speeches before and after the COVID-19 pandemic. We discuss the pipeline applied to download the original texts, preprocess and encode them in XML, and annotate them. This new resource covers a period of changes in the political system in Portugal and will be an important source of data for political and social studies. Finally, we have explored the political stance on immigration in the ParlaMint-PT corpus.