A French Corpus of Québec’s Parliamentary Debates

Pierre André Ménard, Desislava Aleksandrova


Abstract
Parliamentary debates offer a window on political stances as well as a repository of linguistic and semantic knowledge. They provide insights into, and reasons for, the laws and regulations that affect electors in their everyday lives. One such resource is the set of transcribed debates available online from the Assemblée Nationale du Québec (ANQ). This paper describes the effort to convert the online ANQ debates from various HTML formats into a standardized ParlaMint TEI annotated corpus and to enrich it with annotations extracted from related unstructured lists of members and political parties. The resulting resource includes 88 years of debates over a span of 114 years with more than 33.3 billion words. The paper details the addition of linguistic annotations and presents a quantitative analysis of part-of-speech tags and of the distribution of utterances across the corpus.
Anthology ID:
2022.parlaclarin-1.4
Volume:
Proceedings of the Workshop ParlaCLARIN III within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
ParlaCLARIN
Publisher:
European Language Resources Association
Pages:
25–32
URL:
https://aclanthology.org/2022.parlaclarin-1.4
Cite (ACL):
Pierre André Ménard and Desislava Aleksandrova. 2022. A French Corpus of Québec’s Parliamentary Debates. In Proceedings of the Workshop ParlaCLARIN III within the 13th Language Resources and Evaluation Conference, pages 25–32, Marseille, France. European Language Resources Association.
Cite (Informal):
A French Corpus of Québec’s Parliamentary Debates (Ménard & Aleksandrova, ParlaCLARIN 2022)
PDF:
https://aclanthology.org/2022.parlaclarin-1.4.pdf
Data
Universal Dependencies