Neural networks are known to be vulnerable to adversarial examples due to slightly perturbed input data. In practical applications of neural network models, the robustness of the models against perturbations must be evaluated. However, no method can strictly evaluate their robustness in natural language domains. We therefore propose a method that evaluates the robustness of text classification models using an integer linear programming (ILP) solver by an optimization problem that identifies a minimum synonym swap that changes the classification result. Our method allows us to compare the robustness of various models in realistic time. It can also be used for obtaining adversarial examples. Because of the minimal impact on the altered sentences, adversarial examples with our method obtained high scores in human evaluations of grammatical correctness and semantic similarity for an IMDb dataset. In addition, we implemented adversarial training with the IMDb and SST2 datasets and found that our adversarial training method makes the model robust.
We propose a novel method of automatic sentence alignment from noisy parallel documents. We first formalize the sentence alignment problem as the independent predictions of spans in the target document from sentences in the source document. We then introduce a total optimization method using integer linear programming to prevent span overlapping and obtain non-monotonic alignments. We implement cross-language span prediction by fine-tuning pre-trained multilingual language models based on BERT architecture and train them using pseudo-labeled data obtained from unsupervised sentence alignment method. While the baseline methods use sentence embeddings and assume monotonic alignment, our method can capture the token-to-token interaction between the tokens of source and target text and handle non-monotonic alignments. In sentence alignment experiments on English-Japanese, our method achieved 70.3 F1 scores, which are +8.0 points higher than the baseline method. In particular, our method improved by +53.9 F1 scores for extracting non-parallel sentences. Our method improved the downstream machine translation accuracy by 4.1 BLEU scores when the extracted bilingual sentences are used for fine-tuning a pre-trained Japanese-to-English translation model.
We present a novel supervised word alignment method based on cross-language span prediction. We first formalize a word alignment problem as a collection of independent predictions from a token in the source sentence to a span in the target sentence. Since this step is equivalent to a SQuAD v2.0 style question answering task, we solve it using the multilingual BERT, which is fine-tuned on manually created gold word alignment data. It is nontrivial to obtain accurate alignment from a set of independently predicted spans. We greatly improved the word alignment accuracy by adding to the question the source token’s context and symmetrizing two directional predictions. In experiments using five word alignment datasets from among Chinese, Japanese, German, Romanian, French, and English, we show that our proposed method significantly outperformed previous supervised and unsupervised word alignment methods without any bitexts for pretraining. For example, we achieved 86.7 F1 score for the Chinese-English data, which is 13.3 points higher than the previous state-of-the-art supervised method.
An anagram is a sentence or a phrase that is made by permutating the characters of an input sentence or a phrase. For example, “Trims cash” is an anagram of “Christmas”. Existing automatic anagram generation methods can find possible combinations of words form an anagram. However, they do not pay much attention to the naturalness of the generated anagrams. In this paper, we show that simple depth-first search can yield natural anagrams when it is combined with modern neural language models. Human evaluation results show that the proposed method can generate significantly more natural anagrams than baseline methods.
Submodular maximization with the greedy algorithm has been studied as an effective approach to extractive summarization. This approach is known to have three advantages: its applicability to many useful submodular objective functions, the efficiency of the greedy algorithm, and the provable performance guarantee. However, when it comes to compressive summarization, we are currently missing a counterpart of the extractive method based on submodularity. In this paper, we propose a fast greedy method for compressive summarization. Our method is applicable to any monotone submodular objective function, including many functions well-suited for document summarization. We provide an approximation guarantee of our greedy algorithm. Experiments show that our method is about 100 to 400 times faster than an existing method based on integer-linear-programming (ILP) formulations and that our method empirically achieves more than 95%-approximation.
To analyze the limitations and the future directions of the extractive summarization paradigm, this paper proposes an Integer Linear Programming (ILP) formulation to obtain extractive oracle summaries in terms of ROUGE-N. We also propose an algorithm that enumerates all of the oracle summaries for a set of reference summaries to exploit F-measures that evaluate which system summaries contain how many sentences that are extracted as an oracle summary. Our experimental results obtained from Document Understanding Conference (DUC) corpora demonstrated the following: (1) room still exists to improve the performance of extractive summarization; (2) the F-measures derived from the enumerated oracle summaries have significantly stronger correlations with human judgment than those derived from single oracle summaries.
This paper derives an Integer Linear Programming (ILP) formulation to obtain an oracle summary of the compressive summarization paradigm in terms of ROUGE. The oracle summary is essential to reveal the upper bound performance of the paradigm. Experimental results on the DUC dataset showed that ROUGE scores of compressive oracles are significantly higher than those of extractive oracles and state-of-the-art summarization systems. These results reveal that compressive summarization is a promising paradigm and encourage us to continue with the research to produce informative summaries.
Summarization aims to represent source documents by a shortened passage. Existing methods focus on the extraction of key information, but often neglect coherence. Hence the generated summaries suffer from a lack of readability. To address this problem, we have developed a graph-based method by exploring the links between text to produce coherent summaries. Our approach involves finding a sequence of sentences that best represent the key information in a coherent way. In contrast to the previous methods that focus only on salience, the proposed method addresses both coherence and informativeness based on textual linkages. We conduct experiments on the DUC2004 summarization task data set. A performance comparison reveals that the summaries generated by the proposed system achieve comparable results in terms of the ROUGE metric, and show improvements in readability by human evaluation.