Hanlin Wang
2024
E2CL: Exploration-based Error Correction Learning for Embodied Agents
Hanlin Wang
|
Chak Tou Leong
|
Jian Wang
|
Wenjie Li
Findings of the Association for Computational Linguistics: EMNLP 2024
Language models are exhibiting increasing capability in knowledge utilization and reasoning. However, when applied as agents in embodied environments, they often suffer from misalignment between their intrinsic knowledge and environmental knowledge, leading to infeasible actions. Traditional environment alignment methods, such as supervised learning on expert trajectories and reinforcement learning, encounter limitations in covering environmental knowledge and achieving efficient convergence, respectively. Inspired by human learning, we propose Exploration-based Error Correction Learning (E2CL), a novel framework that leverages exploration-induced errors and environmental feedback to enhance environment alignment for embodied agents. E2CL incorporates teacher-guided and teacher-free explorations to gather environmental feedback and correct erroneous actions. The agent learns to provide feedback and self-correct, thereby enhancing its adaptability to target environments. Extensive experiments in the VirtualHome environment demonstrate that E2CL-trained agents outperform those trained by baseline methods and exhibit superior self-correction capabilities.
Zero-shot Event Detection Using a Textual Entailment Model as an Enhanced Annotator
Ziqian Zeng
|
Runyu Wu
|
Yuxiang Xiao
|
Xiaoda Zhong
|
Hanlin Wang
|
Zhengdong Lu
|
Huiping Zhuang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Zero-shot event detection is a challenging task. Recent research work proposed to use a pre-trained textual entailment (TE) model on this task. However, those methods treated the TE model as a frozen annotator. We treat the TE model as an annotator that can be enhanced. We propose to use TE models to annotate large-scale unlabeled text and use annotated data to finetune the TE model, yielding an improved TE model. Finally, the improved TE model is used for inference on the test set. To improve the efficiency, we propose to use keywords to filter out sentences with a low probability of expressing event(s). To improve the coverage of keywords, we expand limited number of seed keywords using WordNet, so that we can use the TE model to annotate unlabeled text efficiently. The experimental results show that our method can outperform other baselines by 15% on the ACE05 dataset.
Search
Fix data
Co-authors
- Chak Tou Leong 1
- Wenjie Li 1
- Zhengdong Lu 1
- Jian Wang (王剑) 1
- Runyu Wu 1
- show all...