Proceedings of the 9th Web as Corpus Workshop (WaC-9)

Felix Bildhauer, Roland Schäfer (Editors)


Anthology ID: W14-04
Month: April
Year: 2014
Address: Gothenburg, Sweden
Venues: WAC | WS
SIG: SIGWAC
Publisher: Association for Computational Linguistics
URL: https://aclanthology.org/W14-04
DOI: 10.3115/v1/W14-04
PDF: https://aclanthology.org/W14-04.pdf

Finding Viable Seed URLs for Web Corpora: A Scouting Approach and Comparative Study of Available Sources
Adrien Barbaresi

Focused Web Corpus Crawling
Roland Schäfer | Adrien Barbaresi | Felix Bildhauer

Less Destructive Cleaning of Web Documents by Using Standoff Annotation
Maik Stührenberg

Some Issues on the Normalization of a Corpus of Products Reviews in Portuguese
Magali Sanches Duran | Lucas Avanço | Sandra Aluísio | Thiago Pardo | Maria da Graça Volpe Nunes

{bs,hr,sr}WaC - Web Corpora of Bosnian, Croatian and Serbian
Nikola Ljubešić | Filip Klubička

The PAISÀ Corpus of Italian Web Texts
Verena Lyding | Egon Stemle | Claudia Borghetti | Marco Brunello | Sara Castagnoli | Felice Dell’Orletta | Henrik Dittmann | Alessandro Lenci | Vito Pirrelli