2024
EuReCo: Not Building and Yet Using Federated Comparable Corpora for Cross-Linguistic Research
Marc Kupietz | Piotr Banski | Nils Diewald | Beata Trawinski | Andreas Witt
Proceedings of the 17th Workshop on Building and Using Comparable Corpora (BUCC) @ LREC-COLING 2024
2008
A Multilingual Database of Polarity Items
Beata Trawiński | Jan-Philipp Soehn
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
This paper presents three electronic collections of polarity items: (i) negative polarity items in Romanian, (ii) negative polarity items in German, and (iii) positive polarity items in German. The presented collections are a part of a linguistic resource on lexical units with highly idiosyncratic occurrence patterns. The motivation for collecting and documenting polarity items was to provide a solid empirical basis for linguistic investigations of these expressions. Our database provides general information about the collected items, specifies their syntactic properties, and describes the environment that licenses a given item. For each licensing context, examples from various corpora and the Internet are introduced. Finally, the type of polarity (negative or positive) and the class (superstrong, strong, weak or open) associated with a given item are specified. Our database is encoded in XML and is available via the Internet, offering dynamic and flexible access.
2006
The Collection of Distributionally Idiosyncratic Items: A Multilingual Resource for Linguistic Research
Manfred Sailer | Beata Trawiński
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
We present two collections of lexical items with idiosyncratic distribution. The collections document the behavior of German and English bound words (BW, such as English headway), i.e., words which can only occur in one expression (make headway). BWs are a problem for both general and idiomatic dictionaries since it is unclear whether they have an independent lexical status and to what extent the expressions in which they occur are typical idiomatic expressions. We propose a system which allows us to document the information about BWs from dictionaries and linguistic literature, together with corpus data and example queries for major text corpora. We present our data structure and point to other phraseologically oriented collections. We will also show differences between the German and the English collection.
A Quantitative Approach to Preposition-Pronoun Contraction in Polish
Beata Trawiński
Proceedings of the Third ACL-SIGSEM Workshop on Prepositions
2003
Licensing Complex Prepositions via Lexical Constraints
Beata Trawinski
Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment