Evelina Fedorenko


2022

SentSpace: Large-Scale Benchmarking and Evaluation of Text using Cognitively Motivated Lexical, Syntactic, and Semantic Features
Greta Tuckute | Aalok Sathe | Mingye Wang | Harley Yoder | Cory Shain | Evelina Fedorenko
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations

SentSpace is a modular framework for streamlined evaluation of text. SentSpace characterizes textual input using diverse lexical, syntactic, and semantic features derived from corpora and psycholinguistic experiments. Core sentence features fall into three primary feature spaces: 1) Lexical, 2) Contextual, and 3) Embeddings. To aid in the analysis of computed features, SentSpace provides a web interface for interactive visualization and comparison with text from large corpora. The modular design of SentSpace allows researchers to easily integrate their own feature computation into the pipeline while benefiting from a common framework for evaluation and visualization. In this manuscript we will describe the design of SentSpace, its core feature spaces, and demonstrate an example use case by comparing human-written and machine-generated (GPT2-XL) sentences to each other. We find that while GPT2-XL-generated text appears fluent at the surface level, psycholinguistic norms and measures of syntactic processing reveal key differences between text produced by humans and machines. Thus, SentSpace provides a broad set of cognitively motivated linguistic features for evaluation of text within natural language processing, cognitive science, as well as the social sciences.

2019

Syntactic dependencies correspond to word pairs with high mutual information
Richard Futrell | Peng Qian | Edward Gibson | Evelina Fedorenko | Idan Blank
Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019)

2018

The Natural Stories Corpus
Richard Futrell | Edward Gibson | Harry J. Tily | Idan Blank | Anastasia Vishnevetsky | Steven Piantadosi | Evelina Fedorenko
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)