A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions

Amalia Todiraşcu, Dan Tufiş, Ulrich Heid, Christopher Gledhill, Dan Ştefanescu, Marion Weller, François Rousselot


Abstract
We present the main findings and preliminary results of an ongoing project aimed at developing a collocation extraction system based on contextual morpho-syntactic properties. We explored two hybrid extraction methods: the first applies language-independent statistical techniques followed by linguistic filtering, while the second, available only for German, uses a set of lexico-syntactic patterns to extract collocation candidates. To define the extraction and filtering patterns, we studied a specific collocation category, Verb-Noun (VN) constructions, using a model inspired by systemic functional grammar that proposes a three-level analysis based on lexical, functional and semantic criteria. From a tagged and lemmatized corpus, we identify contextual morpho-syntactic properties that help to filter the output of the statistical methods and to extract potentially interesting VN constructions (complex predicates vs. complex predicators). The extracted candidates are validated and classified manually.
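
The first hybrid method described in the abstract (statistical scoring followed by linguistic filtering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the abstract does not name the association measure, so pointwise mutual information (PMI) is swapped in, and the filter (share of occurrences where the noun appears without a determiner) is a hypothetical stand-in for the contextual morpho-syntactic properties the authors mention. The toy corpus, the tag names VERB/NOUN/DET, and all function names are likewise invented.

```python
import math
from collections import Counter

# Toy corpus (assumption): each sentence is a list of (lemma, POS) pairs.
# The tag names are placeholders, not the project's actual tagset.
CORPUS = [
    [("she", "PRON"), ("take", "VERB"), ("place", "NOUN")],
    [("it", "PRON"), ("take", "VERB"), ("place", "NOUN")],
    [("he", "PRON"), ("take", "VERB"), ("the", "DET"), ("book", "NOUN")],
    [("he", "PRON"), ("read", "VERB"), ("the", "DET"), ("book", "NOUN")],
    [("she", "PRON"), ("read", "VERB"), ("a", "DET"), ("book", "NOUN")],
]


def collect_pairs(corpus, window=2):
    """For each verb, record the first noun within `window` following tokens,
    together with whether a determiner stands between verb and noun."""
    observations = []
    for sentence in corpus:
        for i, (lemma, pos) in enumerate(sentence):
            if pos != "VERB":
                continue
            context = sentence[i + 1 : i + 1 + window]
            for j, (other, other_pos) in enumerate(context):
                if other_pos == "NOUN":
                    has_det = any(p == "DET" for _, p in context[:j])
                    observations.append((lemma, other, has_det))
                    break
    return observations


def pmi(observations):
    """Statistical step: PMI of each (verb, noun) pair over the observed pairs."""
    pair_counts = Counter((v, n) for v, n, _ in observations)
    verb_counts = Counter(v for v, _, _ in observations)
    noun_counts = Counter(n for _, n, _ in observations)
    total = len(observations)
    return {
        (v, n): math.log2(count * total / (verb_counts[v] * noun_counts[n]))
        for (v, n), count in pair_counts.items()
    }


def bare_noun_ratio(observations):
    """Contextual property: share of occurrences with no determiner before the noun."""
    seen, bare = Counter(), Counter()
    for v, n, has_det in observations:
        seen[(v, n)] += 1
        if not has_det:
            bare[(v, n)] += 1
    return {pair: bare[pair] / seen[pair] for pair in seen}


def extract(corpus, min_pmi=0.0, min_bare=0.5):
    """Score candidates statistically, then keep those passing the linguistic filter."""
    observations = collect_pairs(corpus)
    scores = pmi(observations)
    ratios = bare_noun_ratio(observations)
    return sorted(
        pair for pair in scores
        if scores[pair] > min_pmi and ratios[pair] >= min_bare
    )
```

On this toy corpus, `extract(CORPUS)` keeps only `("take", "place")`: the pair `("read", "book")` scores equally well on PMI but always occurs with a determiner, so the illustrative filter discards it. The manual validation and classification step mentioned in the abstract is not modeled.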
Anthology ID:
L08-1448
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/500_paper.pdf
Cite (ACL):
Amalia Todiraşcu, Dan Tufiş, Ulrich Heid, Christopher Gledhill, Dan Ştefanescu, Marion Weller, and François Rousselot. 2008. A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions (Todiraşcu et al., LREC 2008)