Zhiguo Wang - ACL Anthology

Zhiguo Wang

2025

On Synthetic Data Strategies for Domain-Specific Generative Retrieval
Haoyang Wen | Jiang Guo | Yi Zhang | Jiarong Jiang | Zhiguo Wang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper investigates synthetic data generation strategies in developing generative retrieval models for domain-specific corpora, thereby addressing the scalability challenges inherent in manually annotating in-domain queries. We study the data strategies for a two-stage training framework: in the first stage, which focuses on learning to decode document identifiers from queries, we investigate LLM-generated queries across multiple granularity (e.g. chunks, sentences) and domain-relevant search constraints that can better capture nuanced relevancy signals. In the second stage, which aims to refine document ranking through preference learning, we explore the strategies for mining hard negatives based on the initial model’s predictions. Experiments on public datasets over diverse domains demonstrate the effectiveness of our synthetic data generation and hard negative sampling approach.

CiteEval: Principle-Driven Citation Evaluation for Source Attribution
Yumo Xu | Peng Qi | Jifan Chen | Kunlun Liu | Rujun Han | Lan Liu | Bonan Min | Vittorio Castelli | Arshit Gupta | Zhiguo Wang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Citation quality is crucial in information-seeking systems, directly influencing trust and the effectiveness of information access. Current evaluation frameworks, both human and automatic, mainly rely on Natural Language Inference (NLI) to assess binary or ternary supportiveness from cited sources, which we argue is a suboptimal proxy for citation evaluation. In this work we introduce CiteEval, a citation evaluation framework driven by principles focusing on fine-grained citation assessment within a broad context, encompassing not only the cited sources but the full retrieval context, user query, and generated text. Guided by the proposed framework, we construct CiteBench, a multi-domain benchmark with high-quality human annotations on citation quality. To enable efficient evaluation, we further develop CiteEval-Auto, a suite of model-based metrics that exhibit strong correlation with human judgments. Experiments across diverse systems demonstrate CiteEval-Auto’s superior ability to capture the multifaceted nature of citations compared to existing metrics, offering a principled and scalable approach to evaluate and improve model-generated citations.

PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries
Mingwen Dong | Nischal Ashok Kumar | Yiqun Hu | Anuj Chauhan | Chung-Wei Hang | Shuaichen Chang | Lin Pan | Wuwei Lan | Henghui Zhu | Jiarong Jiang | Patrick Ng | Zhiguo Wang
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Previous text-to-SQL datasets and systems have primarily focused on user questions with clear intentions that can be answered. However, real user questions can often be ambiguous with multiple interpretations or unanswerable due to a lack of relevant data. In this work, we construct a practical conversational text-to-SQL dataset called PRACTIQ, consisting of ambiguous and unanswerable questions inspired by real-world user questions. We first identified four categories of ambiguous questions and four categories of unanswerable questions by studying existing text-to-SQL datasets. Then, we generate conversations with four turns: the initial user question, an assistant response seeking clarification, the user’s clarification, and the assistant’s clarified SQL response with the natural language explanation of the execution results. For some ambiguous queries, we also directly generate helpful SQL responses, that consider multiple aspects of ambiguity, instead of requesting user clarification. To benchmark the performance on ambiguous, unanswerable, and answerable questions, we implemented large language model (LLM)-based baselines using various LLMs. Our approach involves two steps: question category classification and clarification SQL prediction. Our experiments reveal that state-of-the-art systems struggle to handle ambiguous and unanswerable questions effectively. We release our code for data generation and experiments on GitHub.

You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL
Hideo Kobayashi | Wuwei Lan | Peng Shi | Shuaichen Chang | Jiang Guo | Henghui Zhu | Zhiguo Wang | Patrick Ng
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

While significant progress has been made on the text-to-SQL task, recent solutions repeatedly encode the same database schema for every question, resulting in unnecessary high inference cost and often overlooking crucial database knowledge. To address these issues, we propose You Only Read Once (YORO), a novel paradigm that directly internalizes database knowledge into the parametric knowledge of a text-to-SQL model during training and eliminates the need for schema encoding during inference. YORO significantly reduces the input token length by 66%-98%. Despite its shorter inputs, our empirical results demonstrate YORO’s competitive performances with traditional systems on three benchmarks as well as its significant outperformance on large databases. Furthermore, YORO excels in handling questions with challenging value retrievals such as abbreviation.

2024

Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks
Wenyue Hua | Jiang Guo | Mingwen Dong | Henghui Zhu | Patrick Ng | Zhiguo Wang
Findings of the Association for Computational Linguistics: ACL 2024

Current knowledge editing approaches struggle to effectively propagate updates to interconnected facts.In this work, we delve into the barriers that hinder the appropriate propagation of updated knowledge within these models for accurate reasoning. To support our analysis, we introduce a novel reasoning-based benchmark, ReCoE (Reasoning-based Counterfactual Editing dataset), which covers six common reasoning schemes in the real world. We conduct an extensive analysis of existing knowledge editing techniques, including input-augmentation, finetuning, and locate-and-edit methods. We found that all model editing methods exhibit notably low performance on this dataset, especially within certain reasoning schemes. Our analysis of the chain-of-thought responses from edited models indicate that, while the models effectively update individual facts, they struggle to recall these facts in reasoning tasks. Moreover, locate-and-edit methods severely deteriorate the models’ language modeling capabilities, leading to poor perplexity and logical coherence in their outputs.

2023

XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations
Yusen Zhang | Jun Wang | Zhiguo Wang | Rui Zhang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Cross-Lingual Semantic Parsing (CLSP) aims to translate queries in multiple natural languages (NLs) into meaning representations (MRs) such as SQL, lambda calculus, and logic forms. However, existing CLSP models are separately proposed and evaluated on datasets of limited tasks and applications, impeding a comprehensive and unified evaluation of CLSP on a diverse range of NLs and MRs. To this end, we present XSemPLR, a unified benchmark for cross-lingual semantic parsing featured with 22 natural languages and 8 meaning representations by examining and selecting 9 existing datasets to cover 5 tasks and 164 domains. We use XSemPLR to conduct a comprehensive benchmark study on a wide range of multilingual language models including encoder-based models (mBERT, XLM-R), encoder-decoder models (mBART, mT5), and decoder-based models (Codex, BLOOM). We design 6 experiment settings covering various lingual combinations (monolingual, multilingual, cross-lingual) and numbers of learning samples (full dataset, few-shot, and zero-shot). Our experiments show that encoder-decoder models (mT5) achieve the highest performance compared with other popular models, and multilingual training can further improve the average performance. Notably, multilingual large language models (e.g., BLOOM) are still inadequate to perform CLSP tasks. We also find that the performance gap between monolingual training and cross-lingual transfer learning is still significant for multilingual models, though it can be mitigated by cross-lingual few-shot training. Our dataset and code are available at https://github.com/psunlpgroup/XSemPLR.

Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning
Alexander Hanbo Li | Mingyue Shang | Evangelia Spiliopoulou | Jie Ma | Patrick Ng | Zhiguo Wang | Bonan Min | William Yang Wang | Kathleen McKeown | Vittorio Castelli | Dan Roth | Bing Xiang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we present a novel approach for data-to-text generation that addresses the limitations of current methods that primarily focus on specific types of structured data. Our proposed method aims to improve performance in multi-task training, zero-shot and few-shot scenarios by providing a unified representation that can handle various forms of structured data such as tables, knowledge graph triples, and meaning representations. We demonstrate that our proposed approach can effectively adapt to new structured forms, and can improve performance in comparison to current methods. For example, our method resulted in a 66% improvement in zero-shot BLEU scores when transferring models trained on table inputs to a knowledge graph dataset. Our proposed method is an important step towards a more general data-to-text generation framework.

There has been increasing interest in synthesizing data to improve downstream text-to-SQL tasks. In this paper, we examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data. We observed three shortcomings: illogical synthetic SQL queries from independent column sampling, arbitrary table joins, and language gaps between the synthesized SQL and natural language question (NLQ) pair. To address these issues, we propose a novel synthesis framework that imposes strong typing constraints, incorporates key relationships from schema, and conducts schema-distance-weighted column sampling. We also adopt an intermediate representation (IR) for the SQL-to-text task to further improve the quality of the generated NLQ. When existing powerful text-to-SQL parsers are pretrained on our high-quality synthesized data, these models have significant accuracy boosts and achieve new state-of-the-art performance on Spider. We also demonstrate the effectiveness of our techniques with ablation studies

Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
Xingyu Fu | Sheng Zhang | Gukyeong Kwon | Pramuditha Perera | Henghui Zhu | Yuhao Zhang | Alexander Hanbo Li | William Yang Wang | Zhiguo Wang | Vittorio Castelli | Patrick Ng | Dan Roth | Bing Xiang
Findings of the Association for Computational Linguistics: ACL 2023

The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge. Recently, pre-trained Language Models (PLM) such as GPT-3 have been applied to the task and shown to be powerful world knowledge sources. However, these methods suffer from low knowledge coverage caused by PLM bias – the tendency to generate certain tokens over other tokens regardless of prompt changes, and high dependency on the PLM quality – only models using GPT-3 can achieve the best result. To address the aforementioned challenges, we propose RASO: a new VQA pipeline that deploys a generate-then-select strategy guided by world knowledge for the first time. Rather than following the de facto standard to train a multi-modal model that directly generates the VQA answer, {pasted macro ‘MODEL’}name first adopts PLM to generate all the possible answers, and then trains a lightweight answer selection model for the correct answer. As proved in our analysis, RASO expands the knowledge coverage from in-domain training data by a large margin. We provide extensive experimentation and show the effectiveness of our pipeline by advancing the state-of-the-art by 4.1% on OK-VQA, without additional computation cost.

Benchmarking Diverse-Modal Entity Linking with Generative Models
Sijia Wang | Alexander Hanbo Li | Henghui Zhu | Sheng Zhang | Pramuditha Perera | Chung-Wei Hang | Jie Ma | William Yang Wang | Zhiguo Wang | Vittorio Castelli | Bing Xiang | Patrick Ng
Findings of the Association for Computational Linguistics: ACL 2023

Entities can be expressed in diverse formats, such as texts, images, or column names and cell values in tables. While existing entity linking (EL) models work well on per modality configuration, such as text-only EL, visual grounding or schema linking, it is more challenging to design a unified model for diverse modality configurations. To bring various modality configurations together, we constructed a benchmark for diverse-modal EL (DMEL) from existing EL datasets, covering all three modalities including text, image and table. To approach the DMEL task, we proposed a generative diverse-modal model (GDMM) following a multimodal-encoder-decoder paradigm. Pre-training GDMM with rich corpora builds a solid foundation for DMEL without storing the entire KB for inference. Fine-tuning GDMM builds a stronger DMEL baseline, outperforming state-of-the-art task-specific EL models by 8.51 F1 score on average. Additionally, extensive error analyses are conducted to highlight the challenge of DMEL, facilitating future researches on this task.

2022

Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding
Jun Wang | Patrick Ng | Alexander Hanbo Li | Jiarong Jiang | Zhiguo Wang | Bing Xiang | Ramesh Nallapati | Sudipta Sengupta
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

Most recent research on Text-to-SQL semantic parsing relies on either parser itself or simple heuristic based approach to understand natural language query (NLQ). When synthesizing a SQL query, there is no explicit semantic information of NLQ available to the parser which leads to undesirable generalization performance. In addition, without lexical-level fine-grained query understanding, linking between query and database can only rely on fuzzy string match which leads to suboptimal performance in real applications. In view of this, in this paper we present a general-purpose, modular neural semantic parsing framework that is based on token-level fine-grained query understanding. Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural semantic parser (NSP). By jointly modeling query and database, NER model analyzes user intents and identifies entities in the query. NEL model links typed entities to schema and cell values in database. Parser model leverages available semantic information and linking results and synthesizes tree-structured SQL queries based on dynamically generated grammar. Experiments on SQUALL, a newly released semantic parsing dataset, show that we can achieve 56.8% execution accuracy on WikiTableQuestions (WTQ) test set, which outperforms the state-of-the-art model by 2.7%.

2021

Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction
Yifan Gao | Henghui Zhu | Patrick Ng | Cicero Nogueira dos Santos | Zhiguo Wang | Feng Nan | Dejiao Zhang | Ramesh Nallapati | Andrew O. Arnold | Bing Xiang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In open-domain question answering, questions are highly likely to be ambiguous because users may not know the scope of relevant topics when formulating them. Therefore, a system needs to find possible interpretations of the question, and predict one or multiple plausible answers. When multiple plausible answers are found, the system should rewrite the question for each answer to resolve the ambiguity. In this paper, we present a model that aggregates and combines evidence from multiple passages to adaptively predict a single answer or a set of question-answer pairs for ambiguous questions. In addition, we propose a novel round-trip prediction approach to iteratively generate additional interpretations that our model fails to find in the first pass, and then verify and filter out the incorrect question-answer pairs to arrive at the final disambiguated output. Our model, named Refuel, achieves a new state-of-the-art performance on the AmbigQA dataset, and shows competitive performance on NQ-Open and TriviaQA. The proposed round-trip prediction is a model-agnostic general approach for answering ambiguous open-domain questions, which improves our Refuel as well as several baseline models. We release source code for our models and experiments at https://github.com/amzn/refuel-open-domain-qa.

Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering
Alexander Hanbo Li | Patrick Ng | Peng Xu | Henghui Zhu | Zhiguo Wang | Bing Xiang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The current state-of-the-art generative models for open-domain question answering (ODQA) have focused on generating direct answers from unstructured textual information. However, a large amount of world’s knowledge is stored in structured databases, and need to be accessed using query languages such as SQL. Furthermore, query languages can answer questions that require complex reasoning, as well as offering full explainability. In this paper, we propose a hybrid framework that takes both textual and tabular evidences as input and generates either direct answers or SQL queries depending on which form could better answer the question. The generated SQL queries can then be executed on the associated databases to obtain the final answers. To the best of our knowledge, this is the first paper that applies Text2SQL to ODQA tasks. Empirically, we demonstrate that on several ODQA datasets, the hybrid methods consistently outperforms the baseline models that only takes homogeneous input by a large margin. Specifically we achieve the state-of-the-art performance on OpenSQuAD dataset using a T5-base model. In a detailed analysis, we demonstrate that the being able to generate structural SQL queries can always bring gains, especially for those questions that requires complex reasoning.

Improving Factual Consistency of Abstractive Summarization via Question Answering
Feng Nan | Cicero Nogueira dos Santos | Henghui Zhu | Patrick Ng | Kathleen McKeown | Ramesh Nallapati | Dejiao Zhang | Zhiguo Wang | Andrew O. Arnold | Bing Xiang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

A commonly observed problem with the state-of-the art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper we present an approach to address factual consistency in summarization. We first propose an efficient automatic evaluation metric to measure factual consistency; next, we propose a novel learning algorithm that maximizes the proposed metric during model training. Through extensive experiments, we confirm that our method is effective in improving factual consistency and even overall quality of the summaries, as judged by both automatic metrics and human evaluation.

Retrieval, Re-ranking and Multi-task Learning for Knowledge-Base Question Answering
Zhiguo Wang | Patrick Ng | Ramesh Nallapati | Bing Xiang
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Question answering over knowledge bases (KBQA) usually involves three sub-tasks, namely topic entity detection, entity linking and relation detection. Due to the large number of entities and relations inside knowledge bases (KB), previous work usually utilized sophisticated rules to narrow down the search space and managed only a subset of KBs in memory. In this work, we leverage a retrieve-and-rerank framework to access KBs via traditional information retrieval (IR) method, and re-rank retrieved candidates with more powerful neural networks such as the pre-trained BERT model. Considering the fact that directly assigning a different BERT model for each sub-task may incur prohibitive costs, we propose to share a BERT encoder across all three sub-tasks and define task-specific layers on top of the shared layer. The unified model is then trained under a multi-task learning framework. Experiments show that: (1) Our IR-based retrieval method is able to collect high-quality candidates efficiently, thus enables our method adapt to large-scale KBs easily; (2) the BERT model improves the accuracy across all three sub-tasks; and (3) benefiting from multi-task learning, the unified model obtains further improvements with only 1/3 of the original parameters. Our final model achieves competitive results on the SimpleQuestions dataset and superior performance on the FreebaseQA dataset.

Entity-level Factual Consistency of Abstractive Text Summarization
Feng Nan | Ramesh Nallapati | Zhiguo Wang | Cicero Nogueira dos Santos | Henghui Zhu | Dejiao Zhang | Kathleen McKeown | Bing Xiang
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of generated summaries and we show that the entity hallucination problem can be alleviated by simply filtering the training data. In addition, we propose a summary-worthy entity classification task to the training process as well as a joint entity and summary generation approach, which yield further improvements in entity level metrics.

2020

Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering
Alexander Fabbri | Patrick Ng | Zhiguo Wang | Ramesh Nallapati | Bing Xiang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Question Answering (QA) is in increasing demand as the amount of information available online and the desire for quick access to this content grows. A common approach to QA has been to fine-tune a pretrained language model on a task-specific labeled dataset. This paradigm, however, relies on scarce, and costly to obtain, large-scale human-labeled data. We propose an unsupervised approach to training QA models with generated pseudo-training data. We show that generating questions for QA training by applying a simple template on a related, retrieved sentence rather than the original context sentence improves downstream QA performance by allowing the model to learn more complex context-question relationships. Training a QA model on this data gives a relative improvement over a previous unsupervised model in F1 score on the SQuAD dataset by about 14%, and 20% when the answer is a named entity, achieving state-of-the-art performance on SQuAD for unsupervised QA.

End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems
Siamak Shakeri | Cicero Nogueira dos Santos | Henghui Zhu | Patrick Ng | Feng Nan | Zhiguo Wang | Ramesh Nallapati | Bing Xiang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose an end-to-end approach for synthetic QA data generation. Our model comprises a single transformer-based encoder-decoder network that is trained end-to-end to generate both answers and questions. In a nutshell, we feed a passage to the encoder and ask the decoder to generate a question and an answer token-by-token. The likelihood produced in the generation process is used as a filtering score, which avoids the need for a separate filtering model. Our generator is trained by fine-tuning a pretrained LM using maximum likelihood estimation. The experimental results indicate significant improvements in the domain adaptation of QA models outperforming current state-of-the-art methods.

2019

Leveraging Dependency Forest for Neural Medical Relation Extraction
Linfeng Song | Yue Zhang | Daniel Gildea | Mo Yu | Zhiguo Wang | Jinsong Su
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Medical relation extraction discovers relations between entity mentions in text, such as research articles. For this task, dependency syntax has been recognized as a crucial source of features. Yet in the medical domain, 1-best parse trees suffer from relatively low accuracies, diminishing their usefulness. We investigate a method to alleviate this problem by utilizing dependency forests. Forests contain more than one possible decisions and therefore have higher recall but more noise compared with 1-best outputs. A graph neural network is used to represent the forests, automatically distinguishing the useful syntactic information from parsing noise. Results on two benchmarks show that our method outperforms the standard tree-based methods, giving the state-of-the-art results in the literature.

Multi-passage BERT: A Globally Normalized BERT Model for Open-domain Question Answering
Zhiguo Wang | Patrick Ng | Xiaofei Ma | Ramesh Nallapati | Bing Xiang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

BERT model has been successfully applied to open-domain QA tasks. However, previous work trains BERT by viewing passages corresponding to the same question as independent training instances, which may cause incomparable scores for answers from different passages. To tackle this issue, we propose a multi-passage BERT model to globally normalize answer scores across all passages of the same question, and this change enables our QA model find better answers by utilizing more passages. In addition, we find that splitting articles into passages with the length of 100 words by sliding window improves performance by 4%. By leveraging a passage ranker to select high-quality passages, multi-passage BERT gains additional 2%. Experiments on four standard benchmarks showed that our multi-passage BERT outperforms all state-of-the-art models on all benchmarks. In particular, on the OpenSQuAD dataset, our model gains 21.4% EM and 21.5% F1 over all non-BERT models, and 5.8% EM and 6.5% F1 over BERT-based models.

Domain Adaptation with BERT-based Domain Classification and Data Selection
Xiaofei Ma | Peng Xu | Zhiguo Wang | Ramesh Nallapati | Bing Xiang
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)

The performance of deep neural models can deteriorate substantially when there is a domain shift between training and test data. For example, the pre-trained BERT model can be easily fine-tuned with just one additional output layer to create a state-of-the-art model for a wide range of tasks. However, the fine-tuned BERT model suffers considerably at zero-shot when applied to a different domain. In this paper, we present a novel two-step domain adaptation framework based on curriculum learning and domain-discriminative data selection. The domain adaptation is conducted in a mostly unsupervised manner using a small target domain validation set for hyper-parameter tuning. We tested the framework on four large public datasets with different domain similarities and task types. Our framework outperforms a popular discrepancy-based domain adaptation method on most transfer tasks while consuming only a fraction of the training budget.

Enhancing Key-Value Memory Neural Networks for Knowledge Based Question Answering
Kun Xu | Yuxuan Lai | Yansong Feng | Zhiguo Wang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Traditional Key-value Memory Neural Networks (KV-MemNNs) are proved to be effective to support shallow reasoning over a collection of documents in domain specific Question Answering or Reading Comprehension tasks. However, extending KV-MemNNs to Knowledge Based Question Answering (KB-QA) is not trivia, which should properly decompose a complex question into a sequence of queries against the memory, and update the query representations to support multi-hop reasoning over the memory. In this paper, we propose a novel mechanism to enable conventional KV-MemNNs models to perform interpretable reasoning for complex questions. To achieve this, we design a new query updating strategy to mask previously-addressed memory information from the query representations, and introduce a novel STOP strategy to avoid invalid or repeated memory reading without strong annotation signals. This also enables KV-MemNNs to produce structured queries and work in a semantic parsing fashion. Experimental results on benchmark datasets show that our solution, trained with question-answer pairs only, can provide conventional KV-MemNNs models with better reasoning abilities on complex questions, and achieve state-of-art performances.

Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network
Kun Xu | Liwei Wang | Mo Yu | Yansong Feng | Yan Song | Zhiguo Wang | Dong Yu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Previous cross-lingual knowledge graph (KG) alignment studies rely on entity embeddings derived only from monolingual KG structural information, which may fail at matching entities that have different facts in two KGs. In this paper, we introduce the topic entity graph, a local sub-graph of an entity, to represent entities with their contextual information in KG. From this view, the KB-alignment task can be formulated as a graph matching problem; and we further propose a graph-attention based solution, which first matches all entities in two topic entity graphs, and then jointly model the local matching information to derive a graph-level matching vector. Experiments show that our model outperforms previous state-of-the-art methods by a large margin.

Semantic Neural Machine Translation Using AMR
Linfeng Song | Daniel Gildea | Yue Zhang | Zhiguo Wang | Jinsong Su
Transactions of the Association for Computational Linguistics, Volume 7

It is intuitive that semantic representations can be useful for machine translation, mainly because they can help in enforcing meaning preservation and handling data sparsity (many sentences correspond to one meaning) of machine translation models. On the other hand, little work has been done on leveraging semantics for neural machine translation (NMT). In this work, we study the usefulness of AMR (abstract meaning representation) on NMT. Experiments on a standard English-to-German dataset show that incorporating AMR as additional knowledge can significantly improve a strong attention-based sequence-to-sequence neural translation model.

Multi-Granular Text Encoding for Self-Explaining Categorization
Zhiguo Wang | Yue Zhang | Mo Yu | Wei Zhang | Lin Pan | Linfeng Song | Kun Xu | Yousef El-Kurdi
Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Self-explaining text categorization requires a classifier to make a prediction along with supporting evidence. A popular type of evidence is sub-sequences extracted from the input text which are sufficient for the classifier to make the prediction. In this work, we define multi-granular ngrams as basic units for explanation, and organize all ngrams into a hierarchical structure, so that shorter ngrams can be reused while computing longer ngrams. We leverage the tree-structured LSTM to learn a context-independent representation for each unit via parameter sharing. Experiments on medical disease classification show that our model is more accurate, efficient and compact than the BiLSTM and CNN baselines. More importantly, our model can extract intuitive multi-granular evidence to support its predictions.

2018

Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model
Kun Xu | Lingfei Wu | Zhiguo Wang | Mo Yu | Liwei Chen | Vadim Sheinin
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Existing neural semantic parsers mainly utilize a sequence encoder, i.e., a sequential LSTM, to extract word order features while neglecting other valuable syntactic information such as dependency or constituent trees. In this paper, we first propose to use the syntactic graph to represent three types of syntactic information, i.e., word order, dependency and constituency features; then employ a graph-to-sequence model to encode the syntactic graph and decode a logical form. Experimental results on benchmark datasets show that our model is comparable to the state-of-the-art on Jobs640, ATIS, and Geo880. Experimental results on adversarial examples demonstrate the robustness of the model is also improved by encoding more syntactic information.

SQL-to-Text Generation with Graph-to-Sequence Model
Kun Xu | Lingfei Wu | Zhiguo Wang | Yansong Feng | Vadim Sheinin
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Previous work approaches the SQL-to-text generation task using vanilla Seq2Seq models, which may not fully capture the inherent graph-structured information in SQL query. In this paper, we propose a graph-to-sequence model to encode the global structure information into node embeddings. This model can effectively learn the correlation between the SQL query pattern and its interpretation. Experimental results on the WikiSQL dataset and Stackoverflow dataset show that our model outperforms the Seq2Seq and Tree2Seq baselines, achieving the state-of-the-art performance.

N-ary Relation Extraction using Graph-State LSTM
Linfeng Song | Yue Zhang | Zhiguo Wang | Daniel Gildea
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Cross-sentence n-ary relation extraction detects relations among n entities across multiple sentences. Typical methods formulate an input as a document graph, integrating various intra-sentential and inter-sentential dependencies. The current state-of-the-art method splits the input graph into two DAGs, adopting a DAG-structured LSTM for each. Though being able to model rich linguistic knowledge by leveraging graph edges, important information can be lost in the splitting procedure. We propose a graph-state LSTM model, which uses a parallel state to model each word, recurrently enriching state values via message passing. Compared with DAG LSTMs, our graph LSTM keeps the original graph structure, and speeds up computation by allowing more parallelization. On a standard benchmark, our model shows the best result in the literature.

Leveraging Context Information for Natural Question Generation
Linfeng Song | Zhiguo Wang | Wael Hamza | Yue Zhang | Daniel Gildea
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

The task of natural question generation is to generate a corresponding question given the input passage (fact) and answer. It is useful for enlarging the training set of QA systems. Previous work has adopted sequence-to-sequence models that take a passage with an additional bit to indicate answer position as input. However, they do not explicitly model the information between answer and other context within the passage. We propose a model that matches the answer with the passage before generating the question. Experiments show that our model outperforms the existing state of the art using rich features.

A Graph-to-Sequence Model for AMR-to-Text Generation
Linfeng Song | Yue Zhang | Zhiguo Wang | Daniel Gildea
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The problem of AMR-to-text generation is to recover a text representing the same meaning as an input AMR graph. The current state-of-the-art method uses a sequence-to-sequence model, leveraging LSTM for encoding a linearized AMR structure. Although being able to model non-local semantic information, a sequence LSTM can lose information from the AMR graph structure, and thus facing challenges with large-graphs, which result in long sequences. We introduce a neural graph-to-sequence model, using a novel LSTM structure for directly encoding graph-level semantics. On a standard benchmark, our model shows superior results to existing methods in the literature.

2017

AMR-to-text Generation with Synchronous Node Replacement Grammar
Linfeng Song | Xiaochang Peng | Yue Zhang | Zhiguo Wang | Daniel Gildea
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

This paper addresses the task of AMR-to-text generation by leveraging synchronous node replacement grammar. During training, graph-to-string rules are learned using a heuristic extraction algorithm. At test time, a graph transducer is applied to collapse input AMRs and generate output sentences. Evaluated on a standard benchmark, our method gives the state-of-the-art result.

2016

Language Independent Dependency to Constituent Tree Conversion
Young-Suk Lee | Zhiguo Wang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We present a dependency to constituent tree conversion technique that aims to improve constituent parsing accuracies by leveraging dependency treebanks available in a wide variety in many languages. The technique works in two steps. First, a partial constituent tree is derived from a dependency tree with a very simple deterministic algorithm that is both language and dependency type independent. Second, a complete high accuracy constituent tree is derived with a constraint-based parser, which uses the partial constituent tree as external constraints. Evaluated on Section 22 of the WSJ Treebank, the technique achieves the state-of-the-art conversion F-score 95.6. When applied to English Universal Dependency treebank and German CoNLL2006 treebank, the converted treebanks added to the human-annotated constituent parser training corpus improve parsing F-scores significantly for both languages.

Sentence Similarity Learning by Lexical Decomposition and Composition
Zhiguo Wang | Haitao Mi | Abraham Ittycheriah
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Most conventional sentence similarity methods only focus on similar parts of two input sentences, and simply ignore the dissimilar parts, which usually give us some clues and semantic meanings about the sentences. In this work, we propose a model to take into account both the similarities and dissimilarities by decomposing and composing lexical semantics over sentences. The model represents each word as a vector, and calculates a semantic matching vector for each word based on all words in the other sentence. Then, each word vector is decomposed into a similar component and a dissimilar component based on the semantic matching vector. After this, a two-channel CNN model is employed to capture features by composing the similar and dissimilar components. Finally, a similarity score is estimated over the composed feature vectors. Experimental results show that our model gets the state-of-the-art performance on the answer sentence selection task, and achieves a comparable result on the paraphrase identification task.

Coverage Embedding Models for Neural Machine Translation
Haitao Mi | Baskaran Sankaran | Zhiguo Wang | Abe Ittycheriah
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

AMR-to-text generation as a Traveling Salesman Problem
Linfeng Song | Yue Zhang | Xiaochang Peng | Zhiguo Wang | Daniel Gildea
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Supervised Attentions for Neural Machine Translation
Haitao Mi | Zhiguo Wang | Abe Ittycheriah
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Semi-supervised Clustering for Short Text via Deep Representation Learning
Zhiguo Wang | Haitao Mi | Abraham Ittycheriah
Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning

Vocabulary Manipulation for Neural Machine Translation
Haitao Mi | Zhiguo Wang | Abe Ittycheriah
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Sense Embedding Learning for Word Sense Induction
Linfeng Song | Zhiguo Wang | Haitao Mi | Daniel Gildea
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics

2015

Feature Optimization for Constituent Parsing via Neural Networks
Zhiguo Wang | Haitao Mi | Nianwen Xue
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Joint POS Tagging and Transition-based Constituent Parsing in Chinese with Non-local Features
Zhiguo Wang | Nianwen Xue
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing
Zhiguo Wang | Chengqing Zong | Nianwen Xue
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Large-scale Word Alignment Using Soft Dependency Cohesion Constraints
Zhiguo Wang | Chengqing Zong
Transactions of the Association for Computational Linguistics, Volume 1

Dependency cohesion refers to the observation that phrases dominated by disjoint dependency subtrees in the source language generally do not overlap in the target language. It has been verified to be a useful constraint for word alignment. However, previous work either treats this as a hard constraint or uses it as a feature in discriminative models, which is ineffective for large-scale tasks. In this paper, we take dependency cohesion as a soft constraint, and integrate it into a generative model for large-scale word alignment experiments. We also propose an approximate EM algorithm and a Gibbs sampling algorithm to estimate model parameters in an unsupervised manner. Experiments on large-scale Chinese-English translation tasks demonstrate that our model achieves improvements in both alignment quality and translation quality.

2011

Parse Reranking Based on Higher-Order Lexical Dependencies
Zhiguo Wang | Chengqing Zong
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Phrase Structure Parsing with Dependency Structure
Zhiguo Wang | Chengqing Zong
Coling 2010: Posters

Treebank Conversion based Self-training Strategy for Parsing
Zhiguo Wang | Chengqing Zong
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Co-authors

Daniel Gildea 8

Alexander Hanbo Li 6

Vittorio Castelli 5

Chengqing Zong 5

Jiarong Jiang 4

Cicero dos Santos 4

Chung-Wei Hang 3

Abe Ittycheriah 3

Kathleen McKeown 3

William Yang Wang 3

Andrew O. Arnold 2

Shuaichen Chang 2

Abraham Ittycheriah 2

Xiaochang Peng 2

Pramuditha Perera 2

Vadim Sheinin 2

Nischal Ashok Kumar 1

Yousef El-Kurdi 1

Hideo Kobayashi 1

Gukyeong Kwon 1

Young-Suk Lee 1

Joseph Lilien 1

Alexander Richard Fabbri 1

Baskaran Sankaran 1

Sudipta Sengupta 1

Siamak Shakeri 1

Mingyue Shang 1

Evangelia Spiliopoulou 1

Dong Yu (于东) 1

Venues