Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios. Different from previous works that make use of mutually similar but redundant translation memories (TMs), we propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence while individually contrastive to each other providing maximal information gain in three phases. First, in TM retrieval phase, we adopt contrastive retrieval algorithm to avoid redundancy and uninformativeness of similar translation pieces. Second, in memory encoding stage, given a set of TMs we propose a novel Hierarchical Group Attention module to gather both local context of each TM and global context of the whole TM set. Finally, in training phase, a Multi-TM contrastive learning objective is introduced to learn salient feature of each TM with respect to target sentence. Experimental results show that our framework obtains substantial improvements over strong baselines in the benchmark dataset.
In a citation graph, adjacent paper nodes share related scientific terms and topics. The graph thus conveys unique structure information of document-level relatedness that can be utilized in the paper summarization task, for exploring beyond the intra-document information.In this work, we focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.We first propose a Multi-granularity Unsupervised Summarization model (MUS) as a simple and low-cost solution to the task.MUS finetunes a pre-trained encoder model on the citation graph by link prediction tasks.Then, the abstract sentences are extracted from the corresponding paper considering multi-granularity information.Preliminary results demonstrate that citation graph is helpful even in a simple unsupervised framework.Motivated by this, we next propose a Graph-based Supervised Summarizationmodel (GSS) to achieve more accurate results on the task when large-scale labeled data are available.Apart from employing the link prediction as an auxiliary task, GSS introduces a gated sentence encoder and a graph information fusion module to take advantage of the graph information to polish the sentence representation.Experiments on a public benchmark dataset show that MUS and GSS bring substantial improvements over the prior state-of-the-art model.
Procedural text is a widely used genre that contains many steps of instructions of how to cook a dish or how to conduct a chemical experiment and analyze the procedural text has become a popular task in the NLP field. Since the procedural text can be very long and contains many details, summarizing the whole procedural text or giving an overview for each complicated procedure step can save time for readers and help them to capture the core information in the text. In this paper, we propose the procedural text summarization task with two summarization granularity: step-view and global-view, which summarizes each step in the procedural text separately or gives an overall summary for all steps respectively. To tackle this task, we propose an Entity-State Graph-based Summarizer (ESGS) which is based on state-of-the-art entity state tracking methods and constructs a heterogeneous graph to aggregate contextual information for each procedure. In order to help the summarization model focus on the salient entities, we propose to use the contextualized procedure graph representation to predict the salient entities. Experiments conducted on two datasets verify the effectiveness of our proposed model. Our code and datasets will be released on https://github.com/gsh199449/procedural-summ.
Given a set of related publications, related work section generation aims to provide researchers with an overview of the specific research area by summarizing these works and introducing them in a logical order. Most of existing related work generation models follow the inflexible extractive style, which directly extract sentences from multiple original papers to form a related work discussion. Hence, in this paper, we propose a Relation-aware Related work Generator (RRG), which generates an abstractive related work from the given multiple scientific papers in the same research area. Concretely, we propose a relation-aware multi-document encoder that relates one document to another according to their content dependency in a relation graph. The relation graph and the document representation are interacted and polished iteratively, complementing each other in the training process. We also contribute two public datasets composed of related work sections and their corresponding papers. Extensive experiments on the two datasets show that the proposed model brings substantial improvements over several strong baselines. We hope that this work will promote advances in related work generation task.
Sequential information, a.k.a., orders, is assumed to be essential for processing a sequence with recurrent neural network or convolutional neural network based encoders. However, is it possible to encode natural languages without orders? Given a bag of words from a disordered sentence, humans may still be able to understand what those words mean by reordering or reconstructing them. Inspired by such an intuition, in this paper, we perform a study to investigate how “order” information takes effects in natural language learning. By running comprehensive comparisons, we quantitatively compare the ability of several representative neural models to organize sentences from a bag of words under three typical scenarios, and summarize some empirical findings and challenges, which can shed light on future research on this line of work.
As the E-commerce thrives, high-quality online advertising copywriting has attracted more and more attention. Different from the advertising copywriting for a single product, an advertisement (AD) post includes an attractive topic that meets the customer needs and description copywriting about several products under its topic. A good AD post can highlight the characteristics of each product, thus helps customers make a good choice among candidate products. Hence, multi-product AD post generation is meaningful and important. We propose a novel end-to-end model named S-MG Net to generate the AD post. Targeted at such a challenging real-world problem, we split the AD post generation task into two subprocesses: (1) select a set of products via the SelectNet (Selection Network). (2) generate a post including selected products via the MGenNet (Multi-Generator Network). Concretely, SelectNet first captures the post topic and the relationship among the products to output the representative products. Then, MGenNet generates the description copywriting of each product. Experiments conducted on a large-scale real-world AD post dataset demonstrate that our proposed model achieves impressive performance in terms of both automatic metrics as well as human evaluations.
A popular multimedia news format nowadays is providing users with a lively video and a corresponding news article, which is employed by influential news media including CNN, BBC, and social media including Twitter and Weibo. In such a case, automatically choosing a proper cover frame of the video and generating an appropriate textual summary of the article can help editors save time, and readers make the decision more effectively. Hence, in this paper, we propose the task of Video-based Multimodal Summarization with Multimodal Output (VMSMO) to tackle such a problem. The main challenge in this task is to jointly model the temporal dependency of video with semantic meaning of article. To this end, we propose a Dual-Interaction-based Multimodal Summarizer (DIMS), consisting of a dual interaction module and multimodal generator. In the dual interaction module, we propose a conditional self-attention mechanism that captures local semantic information within video and a global-attention mechanism that handles the semantic relationship between news text and video from a high level. Extensive experiments conducted on a large-scale real-world VMSMO dataset show that DIMS achieves the state-of-the-art performance in terms of both automatic metrics and human evaluations.
Under special circumstances, summaries should conform to a particular style with patterns, such as court judgments and abstracts in academic papers. To this end, the prototype document-summary pairs can be utilized to generate better summaries. There are two main challenges in this task: (1) the model needs to incorporate learned patterns from the prototype, but (2) should avoid copying contents other than the patternized words—such as irrelevant facts—into the generated summaries. To tackle these challenges, we design a model named Prototype Editing based Summary Generator (PESG). PESG first learns summary patterns and prototype facts by analyzing the correlation between a prototype document and its summary. Prototype facts are then utilized to help extract facts from the input document. Next, an editing generator generates new summary based on the summary pattern or extracted facts. Finally, to address the second challenge, a fact checker is used to estimate mutual information between the input document and generated summary, providing an additional signal for the generator. Extensive experiments conducted on a large-scale real-world text summarization dataset show that PESG achieves the state-of-the-art performance in terms of both automatic metrics and human evaluations.
In this paper, we introduce Iterative Text Summarization (ITS), an iteration-based model for supervised extractive text summarization, inspired by the observation that it is often necessary for a human to read an article multiple times in order to fully understand and summarize its contents. Current summarization approaches read through a document only once to generate a document representation, resulting in a sub-optimal representation. To address this issue we introduce a model which iteratively polishes the document representation on many passes through the document. As part of our model, we also introduce a selective reading mechanism that decides more accurately the extent to which each sentence in the model should be updated. Experimental results on the CNN/DailyMail and DUC2002 datasets demonstrate that our model significantly outperforms state-of-the-art extractive systems when evaluated by machines and by humans.