Open-ended Knowledge Tracing for Computer Science Education
Naiming Liu | Zichao Wang | Richard Baraniuk | Andrew Lan
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
In educational applications, knowledge tracing refers to the problem of estimating students’ time-varying concept/skill mastery level from their past responses to questions and predicting their future performance.One key limitation of most existing knowledge tracing methods is that they treat student responses to questions as binary-valued, i.e., whether they are correct or incorrect. Response correctness analysis/prediction is straightforward, but it ignores important information regarding mastery, especially for open-ended questions.In contrast, exact student responses can provide much more information.In this paper, we conduct the first exploration int open-ended knowledge tracing (OKT) by studying the new task of predicting students’ exact open-ended responses to questions.Our work is grounded in the domain of computer science education with programming questions. We develop an initial solution to the OKT problem, a student knowledge-guided code generation approach, that combines program synthesis methods using language models with student knowledge tracing methods. We also conduct a series of quantitative and qualitative experiments on a real-world student code dataset to validate and demonstrate the promise of OKT.
Math Word Problem Generation with Mathematical Consistency and Problem Context Constraints
Zichao Wang | Andrew Lan | Richard Baraniuk
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
We study the problem of generating arithmetic math word problems (MWPs) given a math equation that specifies the mathematical computation and a context that specifies the problem scenario. Existing approaches are prone to generating MWPs that are either mathematically invalid or have unsatisfactory language quality. They also either ignore the context or require manual specification of a problem template, which compromises the diversity of the generated MWPs. In this paper, we develop a novel MWP generation approach that leverages i) pre-trained language models and a context keyword selection model to improve the language quality of generated MWPs and ii) an equation consistency constraint for math equations to improve the mathematical validity of the generated MWPs. Extensive quantitative and qualitative experiments on three real-world MWP datasets demonstrate the superior performance of our approach compared to various baselines.