============================================================================ 
LaTeCH-CLfL 2026 Reviews for Submission #40
============================================================================ 

Title: Narrative in Short German Prose: A Multi-Phenomenon Dataset for Computational Literary Analysis
Authors: Hans Ole Hatzel, Haimo Stiemer, Evelyn Gius and Chris Biemann


============================================================================
                            REVIEWER #1
============================================================================

---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
                   Appropriateness (1-5): 5
                           Clarity (1-5): 4
      Originality / Innovativeness (1-5): 4
           Soundness / Correctness (1-5): 5
             Meaningful Comparison (1-5): 4
                      Thoroughness (1-5): 5
        Impact of Ideas or Results (1-5): 3
                    Recommendation (1-5): 5
               Reviewer Confidence (1-5): 3

Detailed Comments
---------------------------------------------------------------------------
The paper describes a corpus of short German prose texts fully annotated with narrative phenomena and information about reading pace obtained from the audiobook versions of the texts. The corpus builds on previous annotations of the texts, existing annotation schemes, and newly developed annotation schemes. In addition to describing the corpus and annotation process, the paper shows how the annotations can be used to provide insights into the characteristics of the texts. Overall, the paper is very clear and well-written. It presents a useful resource to the community working on Computational Literary Studies.

I could not identify any reasons to reject the paper. Below, I provide some suggestions for improving the final version of the paper.

- Overall, it would be great to give some more examples. The Figures (Figure 3, Figure 1) are very helpful, but sometimes it would be nice to have short examples in the text already (e.g. in Section 3.1.2 on Semantic Verb Classes and what verb class distributions can show about a text). -> I would say this is not in scope

- The method to establish plot keyness sounds very interesting! The intuition is clear, but it would be great to have a bit more information about how it works (in particular, how is it established that summaries reference a given sentence?). -> again related work, right?
 
- If I understand correctly, Figure 4 and Figure 5 are not referenced in the text. It would be great to understand how they fit into the analysis provided in Section 4. -> should be fixed!
---------------------------------------------------------------------------



============================================================================
                            REVIEWER #2
============================================================================

---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
                   Appropriateness (1-5): 4
                           Clarity (1-5): 3
      Originality / Innovativeness (1-5): 3
           Soundness / Correctness (1-5): 4
             Meaningful Comparison (1-5): 3
                      Thoroughness (1-5): 3
        Impact of Ideas or Results (1-5): 3
                    Recommendation (1-5): 3
               Reviewer Confidence (1-5): 3

Detailed Comments
---------------------------------------------------------------------------
The paper presents GeAnProse, a corpus of four out-of-copyright German short stories annotated with 18k+ standoff labels covering multiple narrative phenomena. The authors merge gold annotations for narrativity, semantic verb classes, and plot keyness with three novel layers: (i) scene segmentation, (ii) Characters-in-Action (ChiA) annotations capturing character mentions, agency (agentive / low-agency / passive) and direct speech, and (iii) sentence-level audiobook timing (reading speed and pause length) created via forced alignment of professional and amateur readings. The resource is released in JSON, accompanied by agreement statistics and exploratory analyses relating the layers (e.g., correlation between narrativity and keyness, Figure 4 heat-map of verb classes vs. event tags). While the exploration could be deeper and the scale is limited, these don't undermine the core contribution.
---------------------------------------------------------------------------


Questions for Authors
---------------------------------------------------------------------------
Inter-annotator agreement for the low-agency and passive roles is relatively low (α ≈ 0.35–0.51). Could the authors provide additional analysis (e.g., common confusion patterns or concrete examples) or discuss whether guideline refinements were considered to mitigate this? -> this would make a lot of sense actually!
---------------------------------------------------------------------------



============================================================================
                            REVIEWER #3
============================================================================

---------------------------------------------------------------------------
Reviewer's Scores
---------------------------------------------------------------------------
                   Appropriateness (1-5): 5
                           Clarity (1-5): 4
      Originality / Innovativeness (1-5): 4
           Soundness / Correctness (1-5): 4
             Meaningful Comparison (1-5): 4
                      Thoroughness (1-5): 4
        Impact of Ideas or Results (1-5): 4
                    Recommendation (1-5): 4
               Reviewer Confidence (1-5): 4

Detailed Comments
---------------------------------------------------------------------------
This paper introduces the novel GeAnProse dataset which contains extensive narrative annotations for four German short stories. I have a few suggestions for additions to or slight restructuring of the paper, but it is overall well-written, detailed, and introduces an interesting and complex new dataset.

I think the Related Work section is quite thorough, but might be strengthened by the addition of some examples of how similar literary datasets have been used in further research. In addition, I think it would be helpful to see explicitly why "Effi Briest" was replaced with "Der blonde Eckbert" in the dataset; as is, it feels mysterious!

I appreciated that in sections 3.1.1 + 3.1.3 it was specifically stated what the annotations would be useful for; I would have appreciated a similar explanation for 3.1.2. I was somewhat confused by the statement "we use text passages instead of atomic event units" in 3.1.3 since I believe work from a different paper is being described; perhaps this could be rephrased.

Overall, although the detail included in sections 3.1 and 3.3 was useful, it did feel like the paper spent a considerable amount of space describing work from previous papers. Perhaps some of this detail could be condensed and replaced with a slight expansion of Section 3.5.1 (Agreement). I think it would've been useful to see more detail on how the problems were converted to binary classification problems, what the entire range of Krippendorff's Alpha values was, and more discussion of what made the tasks so difficult. It may also be beneficial to swap the order of this section and 3.6 (Annotation Workflow), since that provided some useful framing on how the evaluated annotations were performed. As a minor detail, it would've also been great to see Table 3 explicitly referenced in 3.5.1, since I didn't realize it existed till I'd gone on to the next page.

As a final minor note, in line 301 it says that three special character types are introduced, but only two are named – this seems like a quick typo fix.
---------------------------------------------------------------------------
