2023
TabPrompt: Graph-based Pre-training and Prompting for Few-shot Table Understanding
Rihui Jin | Jianan Wang | Wei Tan | Yongrui Chen | Guilin Qi | Wang Hao
Findings of the Association for Computational Linguistics: EMNLP 2023
Table Understanding (TU) is a crucial aspect of information extraction that enables machines to comprehend the semantics behind tabular data. However, existing TU methods cannot deal with the scarcity of labeled tabular data. In addition, these methods primarily focus on the textual content within the table, disregarding the table's inherent topological information. This can lead to a misunderstanding of the tabular semantics. In this paper, we propose TabPrompt, a new framework to tackle the above challenges. Prompt-based learning has gained popularity due to its exceptional performance in few-shot learning. Thus, we introduce prompt-based learning to handle few-shot TU. Furthermore, Graph Contrastive Learning (Graph CL) demonstrates remarkable capabilities in capturing topological information, making Graph Neural Networks an ideal method for encoding tables. Hence, we develop a novel Graph CL method tailored to tabular data. This method serves as the pretext task during the pre-training phase, allowing the generation of vector representations that incorporate the table’s topological information. Experimental results show that our method outperforms all strong baselines, demonstrating its strength in few-shot table understanding tasks.
Re-weighting Tokens: A Simple and Effective Active Learning Strategy for Named Entity Recognition
Haocheng Luo | Wei Tan | Ngoc Nguyen | Lan Du
Findings of the Association for Computational Linguistics: EMNLP 2023
Active learning, a widely adopted technique for enhancing machine learning models in text and image classification tasks with limited annotation resources, has received relatively little attention in the domain of Named Entity Recognition (NER). The challenge of data imbalance in NER has hindered the effectiveness of active learning, as sequence labellers lack sufficient learning signals. To address these challenges, this paper presents a novel re-weighting-based active learning strategy that assigns dynamic smoothing weights to individual tokens. This adaptable strategy is compatible with various token-level acquisition functions and contributes to the development of robust active learners. Experimental results on multiple corpora demonstrate the substantial performance improvement achieved by incorporating our re-weighting strategy into existing acquisition functions, validating its practical efficacy. We will release our implementation upon the publication of this paper.
2022
ECO v1: Towards Event-Centric Opinion Mining
Ruoxi Xu | Hongyu Lin | Meng Liao | Xianpei Han | Jin Xu | Wei Tan | Yingfei Sun | Le Sun
Findings of the Association for Computational Linguistics: ACL 2022
Events are considered the fundamental building blocks of the world. Mining event-centric opinions can benefit decision making, communication between people, and social good. Unfortunately, there is little literature addressing event-centric opinion mining, even though it diverges significantly from the well-studied entity-centric opinion mining in connotation, structure, and expression. In this paper, we propose and formulate the task of event-centric opinion mining based on event-argument structure and expression categorizing theory. We also benchmark this task by constructing a pioneering corpus and designing a two-step benchmark framework. Experimental results show that event-centric opinion mining is feasible and challenging, and that the proposed task, dataset, and baselines are beneficial for future studies.
2019
A Real-World Human-Machine Interaction Platform in Insurance Industry
Wei Tan | Chia-Hao Chang | Yang Mo | Lian-Xin Jiang | Gen Li | Xiao-Long Hou | Chu Chen | Yu-Sheng Huang | Meng-Yuan Huang | Jian-Ping Shen
Proceedings of the 31st Conference on Computational Linguistics and Speech Processing (ROCLING 2019)