PTPARL-V: Portuguese Parliamentary Debates for Voting Behaviour Study

Afonso Sousa, Henrique Lopes Cardoso


Abstract
We present a new dataset, PTPARL-V, that provides valuable insight for advancing discourse analysis of parliamentary debates in Portuguese. This is achieved by processing the open-access information available at the official Portuguese Parliament website and scraping the information from the debate minutes’ PDFs contained therein. Our dataset includes interventions from 547 different deputies of all major Portuguese parties, from 736 legislative initiatives spanning five legislatures from 2005 to 2021. We present a statistical analysis of the dataset compared to other publicly available Portuguese parliamentary debate corpora. Finally, we provide baseline performance analysis for voting behaviour classification.
Anthology ID:
2024.parlaclarin-1.6
Volume:
Proceedings of the IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora (ParlaCLARIN) @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Darja Fiser, Maria Eskevich, David Bordon
Venues:
ParlaCLARIN | WS
Publisher:
ELRA and ICCL
Pages:
38–42
URL:
https://aclanthology.org/2024.parlaclarin-1.6
Cite (ACL):
Afonso Sousa and Henrique Lopes Cardoso. 2024. PTPARL-V: Portuguese Parliamentary Debates for Voting Behaviour Study. In Proceedings of the IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora (ParlaCLARIN) @ LREC-COLING 2024, pages 38–42, Torino, Italia. ELRA and ICCL.
Cite (Informal):
PTPARL-V: Portuguese Parliamentary Debates for Voting Behaviour Study (Sousa & Lopes Cardoso, ParlaCLARIN-WS 2024)
PDF:
https://aclanthology.org/2024.parlaclarin-1.6.pdf