2024
Multimodal Table Understanding
Mingyu Zheng
|
Xinwei Feng
|
Qingyi Si
|
Qiaoqiao She
|
Zheng Lin
|
Wenbin Jiang
|
Weiping Wang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Although great progress has been made by previous table understanding methods including recent approaches based on large language models (LLMs), they rely heavily on the premise that given tables must be converted into a certain text sequence (such as Markdown or HTML) to serve as model input. However, it is difficult to access such high-quality textual table representations in some real-world scenarios, and table images are much more accessible. Therefore, how to directly understand tables using intuitive visual information is a crucial and urgent challenge for developing more practical applications. In this paper, we propose a new problem, multimodal table understanding, where the model needs to generate correct responses to various table-related requests based on the given table image. To facilitate both the model training and evaluation, we construct a large-scale dataset named MMTab, which covers a wide spectrum of table images, instructions and tasks. On this basis, we develop Table-LLaVA, a generalist tabular multimodal large language model (MLLM), which significantly outperforms recent open-source MLLM baselines on 23 benchmarks under held-in and held-out settings.
QDMR-based Planning-and-Solving Prompting for Complex Reasoning Tasks
Jinfeng Huang
|
Qiaoqiao She
|
Wenbin Jiang
|
Hua Wu
|
Yang Hao
|
Tong Xu
|
Feng Wu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Chain-of-Thought prompting has improved the reasoning capability of large language models (LLMs). However, it is still challenging to guarantee effectiveness and stability for questions requiring complicated reasoning. Recently, Plan-and-Solve prompting has enhanced the reasoning capability for complex questions by first planning the solution steps and then solving them step by step, but it struggles to represent and execute the problem-solving logic of complex questions. To deal with these challenges, we propose a novel Plan-and-Solve prompting method based on Question Decomposition Meaning Representation (QDMR). Specifically, this method first has the LLM generate a QDMR graph, a directed acyclic graph composed of sub-questions, to represent the problem-solving logic. Then, the LLM generates a specific solving process based on the QDMR graph. When solving each sub-question, it can locate the preceding sub-questions and their answers according to the QDMR graph, and then use this information to solve it. Compared with existing Plan-and-Solve prompting techniques, our method can not only represent the problem-solving logic of complicated questions more accurately with the aid of the QDMR graph, but also pass dependency information accurately between solution steps according to the QDMR graph. In addition, with supervised fine-tuning on the Allen Institute dataset, the ability of the LLM to decompose complicated questions can be considerably enhanced. Extensive experiments show that our method achieves significant improvements on arithmetic reasoning and commonsense reasoning tasks compared with classical Chain-of-Thought prompting and Plan-and-Solve prompting techniques, and that the improvements are even greater for problems with more reasoning steps.
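The dependency-ordered solving step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the graph encoding, the function `solve_in_dependency_order`, and the stand-in solver `toy_answer` (which replaces the LLM call) are all hypothetical names introduced here for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def solve_in_dependency_order(sub_questions, depends_on, answer_fn):
    """Answer every sub-question after all sub-questions it depends on.

    sub_questions: dict mapping node id -> question text
    depends_on:    dict mapping node id -> ids of preceding sub-questions
    answer_fn:     callable(question_text, context) -> answer, where context
                   maps each preceding node id to its already computed answer
    """
    # Make sure every node is present, including those with no predecessors.
    graph = {node: tuple(depends_on.get(node, ())) for node in sub_questions}
    answers = {}
    # static_order() yields predecessors before the nodes that depend on them.
    for node in TopologicalSorter(graph).static_order():
        context = {dep: answers[dep] for dep in graph[node]}
        answers[node] = answer_fn(sub_questions[node], context)
    return answers


def toy_answer(question, context):
    """Hypothetical stand-in for an LLM call on one sub-question."""
    facts = {
        "How many boxes are there?": 4,
        "How many apples are in one box?": 6,
    }
    if question in facts:
        return facts[question]
    # Final sub-question: multiply the answers of the preceding sub-questions.
    product = 1
    for value in context.values():
        product *= value
    return product


sub_questions = {
    "q1": "How many boxes are there?",
    "q2": "How many apples are in one box?",
    "q3": "How many apples are there in total?",
}
depends_on = {"q3": ["q1", "q2"]}
answers = solve_in_dependency_order(sub_questions, depends_on, toy_answer)
```

In this toy graph, `q3` is only solved once the answers to `q1` and `q2` are available, which mirrors the idea of locating preceding sub-questions and their answers before solving the current one.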
2023
IM-TQA: A Chinese Table Question Answering Dataset with Implicit and Multi-type Table Structures
Mingyu Zheng
|
Yang Hao
|
Wenbin Jiang
|
Zheng Lin
|
Yajuan Lyu
|
QiaoQiao She
|
Weiping Wang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Various datasets have been proposed to promote the development of Table Question Answering (TQA) techniques. However, the problem setting of existing TQA benchmarks suffers from two limitations. First, they directly provide models with explicit table structures where row headers and column headers of the table are explicitly annotated and treated as model input during inference. Second, they only consider tables of limited types and ignore other tables, especially complex tables with flexible header locations. Such a simplified problem setting cannot cover practical scenarios where models need to process tables without header annotations in the inference phase or tables of different types. To address the above issues, we construct a new TQA dataset with implicit and multi-type table structures, named IM-TQA, which requires the model not only to understand tables without directly available header annotations but also to handle multi-type tables, including previously neglected complex tables. We investigate the performance of recent methods on our dataset and find that existing methods struggle to process implicit and multi-type table structures. Correspondingly, we propose an RGCN-RCI framework that outperforms recent baselines. We will release our dataset to facilitate future research.
Chain-of-Thought Reasoning in Tabular Language Models
Mingyu Zheng
|
Hao Yang
|
Wenbin Jiang
|
Zheng Lin
|
Yajuan Lyu
|
Qiaoqiao She
|
Weiping Wang
Findings of the Association for Computational Linguistics: EMNLP 2023
The tabular mathematical reasoning task requires models to perform multi-step operations, including information look-up and numerical calculation, based on heterogeneous data from tables and questions. Existing solutions tend to extend chain-of-thought (CoT) reasoning into powerful large language models (LLMs) to promote multi-hop mathematical reasoning. However, such LLM-based approaches are not a viable solution in scenarios with private deployment or limited resources. To address this problem, we revisit small-scale tabular language models (TaLMs) and extend chain-of-thought reasoning into TaLMs for the first time. Specifically, we propose a novel framework, TaCo, which coordinates two TaLMs responsible for CoT generation and answer inference, respectively. In addition, our framework can be combined with an external calculator to improve the accuracy of numerical calculation. On the TABMWP dataset, TaCo outperforms the state-of-the-art ChatGPT by 9.55% (82.60%→92.15% in accuracy) with far fewer parameters (0.8B). The code will be released along with the paper.
Retrieval-Augmented Domain Adaptation of Language Models
Benfeng Xu
|
Chunxu Zhao
|
Wenbin Jiang
|
PengFei Zhu
|
Songtai Dai
|
Chao Pang
|
Zhuo Sun
|
Shuohuan Wang
|
Yu Sun
Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP 2023)
Language models pretrained on general domain corpora usually exhibit considerable degradation when generalizing to downstream tasks of specialized domains. Existing approaches try to construct PLMs for each specific domain either from scratch or through further pretraining, which not only costs substantial resources, but also fails to cover all target domains at various granularities. In this work, we propose RADA, a novel Retrieval-Augmented framework for Domain Adaptation. We first construct a textual corpus that covers the downstream task at flexible domain granularity and resource availability. We employ it as a pluggable datastore to retrieve informative background knowledge, and integrate that knowledge into the standard language model framework to augment representations. We then propose a two-level selection scheme to integrate the most relevant information while reducing irrelevant noise. Specifically, we introduce a differentiable sampling module as well as an attention mechanism to achieve both passage-level and word-level selection. Such a retrieval-augmented framework enables domain adaptation of language models with flexible domain coverage and fine-grained domain knowledge integration. We conduct comprehensive experiments across biomedical, science and legal domains to demonstrate the effectiveness of the overall framework, and its advantage over existing solutions.
2022
Hierarchical Representation-based Dynamic Reasoning Network for Biomedical Question Answering
Jianguo Mao
|
Jiyuan Zhang
|
Zengfeng Zeng
|
Weihua Peng
|
Wenbin Jiang
|
Xiangdong Wang
|
Hong Liu
|
Yajuan Lyu
Proceedings of the 29th International Conference on Computational Linguistics
Recently, Biomedical Question Answering (BQA) has attracted growing attention due to its application value and technical challenges. Most existing works treat it as a semantic matching task that predicts answers by computing confidence among questions, options and evidence sentences, which is insufficient for scenarios that require complex reasoning based on a deep understanding of biomedical evidence. We propose a novel model termed Hierarchical Representation-based Dynamic Reasoning Network (HDRN) to tackle this problem. It first constructs hierarchical representations of the biomedical evidence to learn semantics within and among pieces of evidence. It then performs dynamic reasoning based on these hierarchical representations to solve complex biomedical problems. Compared with the existing state-of-the-art model, the proposed model improves by more than 4.5%, 3% and 1.3% on three mainstream BQA datasets, PubMedQA, MedQA-USMLE and NLPEC, respectively. The ablation study demonstrates the contribution of each improvement in our model. The code will be released after the paper is published.
A Transition-based Method for Complex Question Understanding
Yu Xia
|
Wenbin Jiang
|
Yajuan Lyu
|
Sujian Li
Proceedings of the 29th International Conference on Computational Linguistics
Complex Question Understanding (CQU) parses complex questions into Question Decomposition Meaning Representation (QDMR), which is a sequence of atomic operators. Existing works are based on end-to-end neural models, which do not explicitly model the intermediate states and lack interpretability for the parsing process. In addition, they predict QDMR at a mismatched granularity and do not model the step-wise information, which is an essential characteristic of QDMR. To alleviate these issues, we treat QDMR as a computational graph and propose a transition-based method where a decider predicts a sequence of actions to build the graph node by node. In this way, the partial graph at each step enables better representation of the intermediate states and better interpretability. At each step, the decider encodes the intermediate state with specially designed encoders and predicts several candidates for the next action along with their confidence. For inference, a searcher seeks the optimal graph based on the predictions of the decider to alleviate error propagation. Experimental results demonstrate the parsing accuracy of our method against several strong baselines. Moreover, our method has transparent and human-readable intermediate results, showing improved interpretability.
Explainable Question Answering based on Semantic Graph by Global Differentiable Learning and Dynamic Adaptive Reasoning
Jianguo Mao
|
Wenbin Jiang
|
Xiangdong Wang
|
Hong Liu
|
Yu Xia
|
Yajuan Lyu
|
QiaoQiao She
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Multi-hop Question Answering is an agent task for testing reasoning ability. With the development of pre-trained models, implicit reasoning ability has improved surprisingly and can even surpass human performance. However, the black-box nature of these models hinders the construction of explainable intelligent systems. Several researchers have explored explainable neural-symbolic reasoning methods based on question decomposition techniques, but the non-differentiable symbolic operations and the error propagation in the reasoning process lead to poor performance. To alleviate this, we propose a simple yet effective Global Differentiable Learning strategy to explore optimal reasoning paths from the latent probability space, so that the model learns to solve intermediate reasoning processes without expert annotations. We further design a Dynamic Adaptive Reasoner to enhance generalization to unseen questions. Our method achieves a 17% improvement in F1-score against BreakRC and shows better interpretability. We take a step forward in building interpretable reasoning methods.
Dynamic Multistep Reasoning based on Video Scene Graph for Video Question Answering
Jianguo Mao
|
Wenbin Jiang
|
Xiangdong Wang
|
Zhifan Feng
|
Yajuan Lyu
|
Hong Liu
|
Yong Zhu
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Existing video question answering (video QA) models lack the capacity for deep video understanding and flexible multistep reasoning. We propose a novel video QA model that performs dynamic multistep reasoning between questions and videos. It creates a video semantic representation based on the video scene graph, which is composed of semantic elements of the video and semantic relations among these elements. Then, it performs multistep reasoning between the representations of the question and the video for better answer decisions, and dynamically integrates the reasoning results. Experiments show the significant advantage of the proposed model over previous methods in accuracy and interpretability. Compared with the existing state-of-the-art model, the proposed model improves by more than 4%/3.1%/2% on three widely used video QA datasets, MSRVTT-QA, MSRVTT multi-choice, and TGIF-QA, and displays better interpretability by backtracing along the attention mechanisms to the video scene graphs.
2020
Multi-view Classification Model for Knowledge Graph Completion
Wenbin Jiang
|
Mengfei Guo
|
Yufeng Chen
|
Ying Li
|
Jinan Xu
|
Yajuan Lyu
|
Yong Zhu
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing
Most previous work on knowledge graph completion conducted single-view prediction or calculation for candidate triple evaluation, based only on the content information of the candidate triples. This paper describes a novel multi-view classification model for knowledge graph completion, where multiple classification views are performed based on both content and context information for candidate triple evaluation. Each classification view evaluates the validity of a candidate triple from a specific viewpoint, based on the content information inside the candidate triple and the context information near the triple. These classification views are implemented by a unified neural network, and the classification predictions are combined by weighting to obtain the final evaluation. Experiments show that the multi-view model brings very significant improvements over previous methods, and achieves a new state of the art on two representative datasets. We believe that the flexibility and scalability of the multi-view classification model facilitate the introduction of additional information and resources for better performance.
Knowledge-Enhanced Named Entity Disambiguation for Short Text
Zhifan Feng
|
Qi Wang
|
Wenbin Jiang
|
Yajuan Lyu
|
Yong Zhu
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing
Named entity disambiguation is an important task that serves as a bridge between text and knowledge. However, the performance of existing methods drops dramatically for short text, which is widely used in practical application scenarios such as information retrieval and question answering. In this work, we propose a novel knowledge-enhanced method for named entity disambiguation. To address the ambiguity and incompleteness of information in short text, two kinds of knowledge, a factual knowledge graph and a conceptual knowledge graph, are introduced to provide additional knowledge for the semantic matching between candidate entity and mention context. Our proposed method achieves significant improvement over previous methods on a large manually annotated short-text dataset, and also achieves state-of-the-art results on three standard datasets. The short-text dataset and the proposed model will be publicly available for research use.
2019
Machine Reading Comprehension Using Structural Knowledge Graph-aware Network
Delai Qiu
|
Yuanzhe Zhang
|
Xinwei Feng
|
Xiangwen Liao
|
Wenbin Jiang
|
Yajuan Lyu
|
Kang Liu
|
Jun Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Leveraging external knowledge is an emerging trend in machine comprehension tasks. Previous work usually utilizes knowledge graphs such as ConceptNet as external knowledge, and extracts triples from them to enhance the initial representation of the machine comprehension context. However, such methods cannot capture the structural information in the knowledge graph. To this end, we propose a Structural Knowledge Graph-aware Network (SKG) model, which constructs sub-graphs for entities in the machine comprehension context. Our method dynamically updates the representation of the knowledge according to the structural information of the constructed sub-graph. Experiments show that SKG achieves state-of-the-art performance on the ReCoRD dataset.
2016
Automatic Cross-Lingual Similarization of Dependency Grammars for Tree-based Machine Translation
Wenbin Jiang
|
Wen Zhang
|
Jinan Xu
|
Rangjia Cai
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
2015
Automatic Adaptation of Annotations
Wenbin Jiang
|
Yajuan Lü
|
Liang Huang
|
Qun Liu
Computational Linguistics, Volume 41, Issue 1 - March 2015
Encoding Source Language with Convolutional Neural Network for Machine Translation
Fandong Meng
|
Zhengdong Lu
|
Mingxuan Wang
|
Hang Li
|
Wenbin Jiang
|
Qun Liu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
genCNN: A Convolutional Architecture for Word Sequence Prediction
Mingxuan Wang
|
Zhengdong Lu
|
Hang Li
|
Wenbin Jiang
|
Qun Liu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
2014
A Dependency Edge-based Transfer Model for Statistical Machine Translation
Hongshen Chen
|
Jun Xie
|
Fandong Meng
|
Wenbin Jiang
|
Qun Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
RED: A Reference Dependency Based MT Evaluation Metric
Hui Yu
|
Xiaofeng Wu
|
Jun Xie
|
Wenbin Jiang
|
Qun Liu
|
Shouxun Lin
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Modeling Term Translation for Document-informed Machine Translation
Fandong Meng
|
Deyi Xiong
|
Wenbin Jiang
|
Qun Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
2013
Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
Wenbin Jiang
|
Meng Sun
|
Yajuan Lü
|
Yating Yang
|
Qun Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Bilingually-Guided Monolingual Dependency Grammar Induction
Kai Liu
|
Yajuan Lü
|
Wenbin Jiang
|
Qun Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Iterative Transformation of Annotation Guidelines for Constituency Parsing
Xiang Li
|
Wenbin Jiang
|
Yajuan Lü
|
Qun Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
2012
Discriminative Boosting from Dictionary and Raw Text – A Novel Approach to Build A Chinese Word Segmenter
Fandong Meng
|
Wenbin Jiang
|
Hao Xiong
|
Qun Liu
Proceedings of COLING 2012: Posters
Iterative Annotation Transformation with Predict-Self Reestimation for Chinese Word Segmentation
Wenbin Jiang
|
Fandong Meng
|
Qun Liu
|
Yajuan Lü
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
2011
Relaxed Cross-lingual Projection of Constituent Syntax
Wenbin Jiang
|
Qun Liu
|
Yajuan Lv
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
2010
Effective Constituent Projection across Languages
Wenbin Jiang
|
Yajuan Lv
|
Yang Liu
|
Qun Liu
Coling 2010: Posters
Dependency Parsing and Projection Based on Word-Pair Classification
Wenbin Jiang
|
Qun Liu
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
2009
Bilingually-Constrained (Monolingual) Shift-Reduce Parsing
Liang Huang
|
Wenbin Jiang
|
Qun Liu
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS Tagging – A Case Study
Wenbin Jiang
|
Liang Huang
|
Qun Liu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
Automatic Adaptation of Annotation Standards for Dependency Parsing – Using Projected Treebank as Source Corpus
Wenbin Jiang
|
Qun Liu
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)
2008
The ICT system description for IWSLT 2008.
Yang Liu
|
Zhongjun He
|
Haitao Mi
|
Yun Huang
|
Yang Feng
|
Wenbin Jiang
|
Yajuan Lu
|
Qun Liu
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign
This paper presents a description of the ICT systems involved in the IWSLT 2008 evaluation campaign. This year, we participated in the Chinese-English and English-Chinese translation directions. Four statistical machine translation systems were used: one linguistically syntax-based, two formally syntax-based, and one phrase-based. The outputs of the four SMT systems were fed to a sentence-level system combiner, which was expected to produce better translations than the single systems. We report the results of the four single systems and the combiner on both the development and test sets.
Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging
Wenbin Jiang
|
Haitao Mi
|
Qun Liu
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)
A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging
Wenbin Jiang
|
Liang Huang
|
Qun Liu
|
Yajuan Lü
Proceedings of ACL-08: HLT