2024
pdf
bib
abs
Automated Essay Scoring: A Reflection on the State of the Art
Shengjie Li
|
Vincent Ng
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
While steady progress has been made on the task of automated essay scoring (AES) in the past decade, much of the recent work in this area has focused on developing models that beat existing models on a standard evaluation dataset. While improving performance numbers remains an important goal in the short term, such a focus is not necessarily beneficial for the long-term development of the field. We reflect on the state of the art in AES research, discussing issues that we believe can encourage researchers to think bigger than improving performance numbers with the ultimate goal of triggering discussion among AES researchers on how we should move forward.
pdf
bib
abs
ICLE++: Modeling Fine-Grained Traits for Holistic Essay Scoring
Shengjie Li
|
Vincent Ng
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
The majority of the recently developed models for automated essay scoring (AES) are evaluated solely on the ASAP corpus. However, ASAP is not without its limitations. For instance, it is not clear whether models trained on ASAP can generalize well when evaluated on other corpora. In light of these limitations, we introduce ICLE++, a corpus of persuasive student essays annotated with both holistic scores and trait-specific scores. Not only can ICLE++ be used to test the generalizability of AES models trained on ASAP, but it can also facilitate the evaluation of models developed for newer AES problems such as multi-trait scoring and cross-prompt scoring. We believe that ICLE++, which represents a culmination of our long-term effort in annotating the essays in the ICLE corpus, contributes to the set of much-needed annotated corpora for AES research.
pdf
bib
abs
Conundrums in Cross-Prompt Automated Essay Scoring: Making Sense of the State of the Art
Shengjie Li
|
Vincent Ng
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Cross-prompt automated essay scoring (AES), an under-investigated but challenging task that has gained increasing popularity in the AES community, aims to train an AES system that can generalize well to prompts that are unseen during model training. While recently-developed cross-prompt AES models have combined essay representations that are learned via sophisticated neural architectures with so-called prompt-independent features, an intriguing question is: are complex neural models needed to achieve state-of-the-art results? We answer this question by abandoning sophisticated neural architectures and developing a purely feature-based approach to cross-prompt AES that adopts a simple neural architecture. Experiments on the ASAP dataset demonstrate that our simple approach to cross-prompt AES can achieve state-of-the-art results.
2022
pdf
bib
abs
End-to-End Neural Discourse Deixis Resolution in Dialogue
Shengjie Li
|
Vincent Ng
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
We adapt Lee et al.’s (2018) span-based entity coreference model to the task of end-to-end discourse deixis resolution in dialogue, specifically by proposing extensions to their model that exploit task-specific characteristics. The resulting model, dd-utt, achieves state-of-the-art results on the four datasets in the CODI-CRAC 2021 shared task.
pdf
bib
abs
Neural Anaphora Resolution in Dialogue Revisited
Shengjie Li
|
Hideo Kobayashi
|
Vincent Ng
Proceedings of the CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
We present the systems that we developed for all three tracks of the CODI-CRAC 2022 shared task, namely the anaphora resolution track, the bridging resolution track, and the discourse deixis resolution track. Combining an effective encoding of the input using the SpanBERTLarge encoder with an extensive hyperparameter search process, our systems achieved the highest scores in all phases of all three tracks.
pdf
bib
abs
Summarizing Dialogues with Negative Cues
Junpeng Liu
|
Yanyan Zou
|
Yuxuan Xi
|
Shengjie Li
|
Mian Ma
|
Zhuoye Ding
Proceedings of the 29th International Conference on Computational Linguistics
Abstractive dialogue summarization aims to convert a long dialogue content into its short form where the salient information is preserved while the redundant pieces are ignored. Different from the well-structured text, such as news and scientific articles, dialogues often consist of utterances coming from two or more interlocutors, where the conversations are often informal, verbose, and repetitive, sprinkled with false-starts, backchanneling, reconfirmations, hesitations, speaker interruptions and the salient information is often scattered across the whole chat. The above properties of conversations make it difficult to directly concentrate on scattered outstanding utterances and thus present new challenges of summarizing dialogues. In this work, rather than directly forcing a summarization system to merely pay more attention to the salient pieces, we propose to explicitly have the model perceive the redundant parts of an input dialogue history during the training phase. To be specific, we design two strategies to construct examples without salient pieces as negative cues. Then, the sequence-to-sequence likelihood loss is cooperated with the unlikelihood objective to drive the model to focus less on the unimportant information and also pay more attention to the salient pieces. Extensive experiments on the benchmark dataset demonstrate that our simple method significantly outperforms the baselines with regard to both semantic matching and factual consistent based metrics. The human evaluation also proves the performance gains.
2021
pdf
bib
abs
Neural Anaphora Resolution in Dialogue
Hideo Kobayashi
|
Shengjie Li
|
Vincent Ng
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
We describe the systems that we developed for the three tracks of the CODI-CRAC 2021 shared task, namely entity coreference resolution, bridging resolution, and discourse deixis resolution. Our team ranked second for entity coreference resolution, first for bridging resolution, and first for discourse deixis resolution.
pdf
bib
abs
The CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis Resolution in Dialogue: A Cross-Team Analysis
Shengjie Li
|
Hideo Kobayashi
|
Vincent Ng
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
The CODI-CRAC 2021 shared task is the first shared task that focuses exclusively on anaphora resolution in dialogue and provides three tracks, namely entity coreference resolution, bridging resolution, and discourse deixis resolution. We perform a cross-task analysis of the systems that participated in the shared task in each of these tracks.
2020
pdf
bib
abs
Cross-modal Coherence Modeling for Caption Generation
Malihe Alikhani
|
Piyush Sharma
|
Shengjie Li
|
Radu Soricut
|
Matthew Stone
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We use coherence relations inspired by computational models of discourse to study the information needs and goals of image captioning. Using an annotation protocol specifically devised for capturing image–caption coherence relations, we annotate 10,000 instances from publicly-available image–caption pairs. We introduce a new task for learning inferences in imagery and text, coherence relation prediction, and show that these coherence annotations can be exploited to learn relation classifiers as an intermediary step, and also train coherence-aware, controllable image captioning models. The results show a dramatic improvement in the consistency and quality of the generated captions with respect to information needs specified via coherence relations.