Zichao Wang

2025

Active Learning (AL) has been a powerful paradigm for improving model efficiency and performance by selecting the most informative data points for labeling and training. In recent active learning frameworks, Large Language Models (LLMs) have been employed not only for selection but also for generating entirely new data instances and providing more cost-effective annotations. Motivated by the increasing importance of high-quality data and efficient model training in the era of LLMs, we present a comprehensive survey on LLM-based Active Learning. We introduce an intuitive taxonomy that categorizes these techniques and discuss the transformative roles LLMs can play in the active learning loop. We further examine the impact of AL on LLM learning paradigms and its applications across various domains. Finally, we identify open challenges and propose future research directions. This survey aims to serve as an up-to-date resource for researchers and practitioners seeking to gain an intuitive understanding of LLM-based AL techniques and deploy them to new applications.

pdf bib abs
Principled Content Selection to Generate Diverse and Personalized Multi-Document Summaries
Vishakh Padmakumar | Zichao Wang | David Arbour | Jennifer Healey
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

While large language models (LLMs) are increasingly capable of handling longer contexts, recent work has demonstrated that they exhibit the _”lost in the middle”_ phenomenon (Liu et al., 2024) of unevenly attending to different parts of the provided context. This hinders their ability to cover diverse source material in multi-document summarization, as noted in the DiverseSumm benchmark (Huang et al., 2024). In this work, we contend that principled content selection is a simple way to increase source coverage on this task. As opposed to prompting an LLM to perform the summarization in a single step, we explicitly divide the task into three steps—(1) reducing document collections to atomic key points, (2) using determinantal point processes (DPP) to perform select key points that prioritize diverse content, and (3) rewriting to the final summary. By combining prompting steps, for extraction and rewriting, with principled techniques, for content selection, we consistently improve source coverage on the DiverseSumm benchmark across various LLMs. Finally, we also show that by incorporating relevance to a provided user intent into the DPP kernel, we can generate _personalized_ summaries that cover _relevant_ source information while retaining coverage.

pdf bib abs
Persona-SQ: A Personalized Suggested Question Generation Framework For Real-world Documents
Zihao Lin | Zichao Wang | Yuanting Pan | Varun Manjunatha | Ryan A. Rossi | Angela Lau | Lifu Huang | Tong Sun
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)

Suggested questions (SQs) provide an effective initial interface for users to engage with their documents in AI-powered reading applications. In practical reading sessions, users have diverse backgrounds and reading goals, yet current SQ features typically ignore such user information, resulting in homogeneous or ineffective questions. We introduce a pipeline that generates personalized SQs by incorporating reader profiles (professions and reading goals) and demonstrate its utility in two ways: 1) as an improved SQ generation pipeline that produces higher quality and more diverse questions compared to current baselines, and 2) as a data generator to fine-tune extremely small models that perform competitively with much larger models on SQ generation. Our approach can not only serve as a drop-in replacement in current SQ systems to immediately improve their performance but also help develop on-device SQ models that can run locally to deliver fast and private SQ experience.

2024

pdf bib abs
ATLAS: A System for PDF-centric Human Interaction Data Collection
Alexa Siu | Zichao Wang | Joshua Hoeflich | Naman Kapasi | Ani Nenkova | Tong Sun
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: System Demonstrations)

The Portable Document Format (PDF) is a popular format for distributing digital documents. Datasets on PDF reading behaviors and interactions remain limited due to the challenges of instrumenting PDF readers for these data collection tasks. We present ATLAS, a data collection tool designed to better support researchers in collecting rich PDF-centric datasets from users. ATLAS supports researchers in programmatically creating a user interface for data collection that is ready to share with annotators. It includes a toolkit and an extensible schema to easily customize the data collection tasks for a variety of purposes, allowing collection of PDF annotations (e.g., highlights, drawings) as well as reading behavior analytics (e.g., page scroll, text selections). We open-source ATLAS1 to support future research efforts and review use cases of ATLAS that showcase our system’s broad applicability.

2023

pdf bib abs
Interpretable Math Word Problem Solution Generation via Step-by-step Planning
Mengxue Zhang | Zichao Wang | Zhichao Yang | Weiqi Feng | Andrew Lan
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Solutions to math word problems (MWPs) with step-by-step explanations are valuable, especially in education, to help students better comprehend problem-solving strategies. Most existing approaches only focus on obtaining the final correct answer. A few recent approaches leverage intermediate solution steps to improve final answer correctness but often cannot generate coherent steps with a clear solution strategy. Contrary to existing work, we focus on improving the correctness and coherence of the intermediate solutions steps. We propose a step-by-step planning approach for intermediate solution generation, which strategically plans the generation of the next solution step based on the MWP and the previous solution steps. Our approach first plans the next step by predicting the necessary math operation needed to proceed, given history steps, then generates the next step, token-by-token, by prompting a language model with the predicted math operation. Experiments on the GSM8K dataset demonstrate that our approach improves the accuracy and interpretability of the solution on both automatic metrics and human evaluation.

pdf bib abs
Improving Reading Comprehension Question Generation with Data Augmentation and Overgenerate-and-rank
Nischal Ashok Kumar | Nigel Fernandez | Zichao Wang | Andrew Lan
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

Reading comprehension is a crucial skill in many aspects of education, including language learning, cognitive development, and fostering early literacy skills in children. Automated answer-aware reading comprehension question generation has significant potential to scale up learner support in educational activities. One key technical challenge in this setting is that there can be multiple questions, sometimes very different from each other, with the same answer; a trained question generation method may not necessarily know which question human educators would prefer. To address this challenge, we propose 1) a data augmentation method that enriches the training dataset with diverse questions given the same context and answer and 2) an overgenerate-and-rank method to select the best question from a pool of candidates. We evaluate our method on the FairytaleQA dataset, showing a 5% absolute improvement in ROUGE-L over the best existing method. We also demonstrate the effectiveness of our method in generating harder, “implicit” questions, where the answers are not contained in the context as text spans.

pdf bib abs
MultiQG-TI: Towards Question Generation from Multi-modal Sources
Zichao Wang | Richard Baraniuk
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

We study the new problem of automatic question generation (QG) from multi-modal sources containing images and texts, significantly expanding the scope of most of the existing work that focuses exclusively on QG from only textual sources. We propose a simple solution for our new problem, called MultiQG-TI, which enables a text-only question generator to process visual input in addition to textual input. Specifically, we leverage an image-to-text model and an optical character recognition model to obtain the textual description of the image and extract any texts in the image, respectively, and then feed them together with the input texts to the question generator. We only fine-tune the question generator while keeping the other components fixed. On the challenging ScienceQA dataset, we demonstrate that MultiQG-TI significantly outperforms ChatGPT with few-shot prompting, despite having hundred-times less trainable parameters. Additional analyses empirically confirm the necessity of both visual and textual signals for QG and show the impact of various modeling choices. Code is available at https://anonymous.4open.science/r/multimodal-QG-47F2/

pdf bib
MANER: Mask Augmented Named Entity Recognition for Extreme Low-Resource Languages
Shashank Sonkar | Zichao Wang | Richard Baraniuk
Proceedings of the Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP)

2022

pdf bib abs
Open-ended Knowledge Tracing for Computer Science Education
Naiming Liu | Zichao Wang | Richard Baraniuk | Andrew Lan
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

In educational applications, knowledge tracing refers to the problem of estimating students’ time-varying concept/skill mastery level from their past responses to questions and predicting their future performance.One key limitation of most existing knowledge tracing methods is that they treat student responses to questions as binary-valued, i.e., whether they are correct or incorrect. Response correctness analysis/prediction is straightforward, but it ignores important information regarding mastery, especially for open-ended questions.In contrast, exact student responses can provide much more information.In this paper, we conduct the first exploration int open-ended knowledge tracing (OKT) by studying the new task of predicting students’ exact open-ended responses to questions.Our work is grounded in the domain of computer science education with programming questions. We develop an initial solution to the OKT problem, a student knowledge-guided code generation approach, that combines program synthesis methods using language models with student knowledge tracing methods. We also conduct a series of quantitative and qualitative experiments on a real-world student code dataset to validate and demonstrate the promise of OKT.

2021

pdf bib abs
Math Word Problem Generation with Mathematical Consistency and Problem Context Constraints
Zichao Wang | Andrew Lan | Richard Baraniuk
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

We study the problem of generating arithmetic math word problems (MWPs) given a math equation that specifies the mathematical computation and a context that specifies the problem scenario. Existing approaches are prone to generating MWPs that are either mathematically invalid or have unsatisfactory language quality. They also either ignore the context or require manual specification of a problem template, which compromises the diversity of the generated MWPs. In this paper, we develop a novel MWP generation approach that leverages i) pre-trained language models and a context keyword selection model to improve the language quality of generated MWPs and ii) an equation consistency constraint for math equations to improve the mathematical validity of the generated MWPs. Extensive quantitative and qualitative experiments on three real-world MWP datasets demonstrate the superior performance of our approach compared to various baselines.

Co-authors

Venues

acl3
bea2
emnlp2
naacl2
sustainlp1
show all...

ws1

Fix author