Cococorpus: a corpus of copredication

Long Chen, Deniz Ekin Yavaş, Laura Kallmeyer, Rainer Osswald


Abstract
While copredication has been widely investigated as a linguistic phenomenon, there is a notable lack of systematically annotated data to support empirical and quantitative research. This paper gives an overview of the ongoing construction of Cococorpus, a corpus of copredication, describes the annotation methodology and guidelines, and presents preliminary findings from the annotated data. Currently, the corpus contains 1500 gold-standard manual annotations, including about 200 sentences with copredications. The annotated data not only supports the empirical validation of existing theories of copredication but also reveals regularities that may inform theoretical development.
Anthology ID:
2025.isa-1.4
Volume:
Proceedings of the 21st Joint ACL - ISO Workshop on Interoperable Semantic Annotation (ISA-21)
Month:
September
Year:
2025
Address:
Düsseldorf, Germany
Editor:
Harry Bunt
Venues:
ISA | WS
SIG:
SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
31–40
URL:
https://aclanthology.org/2025.isa-1.4/
Cite (ACL):
Long Chen, Deniz Ekin Yavaş, Laura Kallmeyer, and Rainer Osswald. 2025. Cococorpus: a corpus of copredication. In Proceedings of the 21st Joint ACL - ISO Workshop on Interoperable Semantic Annotation (ISA-21), pages 31–40, Düsseldorf, Germany. Association for Computational Linguistics.
Cite (Informal):
Cococorpus: a corpus of copredication (Chen et al., ISA 2025)
PDF:
https://aclanthology.org/2025.isa-1.4.pdf