2020
“Voices of the Great War”: A Richly Annotated Corpus of Italian Texts on the First World War
Federico Boschetti | Irene De Felice | Stefano Dei Rossi | Felice Dell’Orletta | Michele Di Giorgio | Martina Miliani | Lucia C. Passaro | Angelica Puddu | Giulia Venturi | Nicola Labanca | Alessandro Lenci | Simonetta Montemagni
Proceedings of the Twelfth Language Resources and Evaluation Conference
“Voices of the Great War” is the first large corpus of Italian historical texts dating back to the period of the First World War. This corpus differs from other existing resources in several respects. First, from the linguistic point of view, it accounts for the wide range of varieties in which Italian was articulated in that period, along the diastratic (educated vs. uneducated writers), diaphasic (low/informal vs. high/formal registers) and diatopic (regional varieties, dialects) dimensions. Second, from the historical perspective, through a collection of texts belonging to different genres, it represents different views on the war and the various styles of narrating war events and experiences. The final corpus is balanced along various dimensions, corresponding to the textual genre, the language variety used, the author type and the typology of conveyed contents. The corpus is fully annotated with lemmas, parts of speech, terminology, and named entities. Significant corpus samples representative of the different “voices” have also been enriched with meta-linguistic and syntactic information. The layer of syntactic annotation forms the first nucleus of an Italian historical treebank complying with the Universal Dependencies standard. The paper illustrates the final resource, the methodology and tools used to build it, and the Web interface for navigating it.
2010
TANL-1: Coreference Resolution by Parse Analysis and Similarity Clustering
Giuseppe Attardi | Maria Simi | Stefano Dei Rossi
Proceedings of the 5th International Workshop on Semantic Evaluation
A Resource and Tool for Super-sense Tagging of Italian Texts
Giuseppe Attardi | Stefano Dei Rossi | Giulia Di Pietro | Alessandro Lenci | Simonetta Montemagni | Maria Simi
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
A SuperSense Tagger is a tool for the automatic analysis of texts that assigns each noun, verb, adjective and adverb a semantic category within a general taxonomy. The developed tagger, based on a statistical model (Maximum Entropy), required the creation of an annotated Italian corpus, to be used as a training set, and the improvement of various existing tools. The results obtained significantly improve on the state of the art for this task.