Math Word Problem (MWP) is a crucial NLP task aimed at providing solutions for given mathematical descriptions. A notable sub-category of MWP is the Linear Programming Word Problem (LPWP), which holds significant relevance in real-world decision-making and operations research. While the recent rise of generative large language models (LLMs) has brought more advanced solutions to LPWPs, existing evaluation methodologies for this task still diverge from human judgment and face challenges in recognizing mathematically equivalent answers. In this paper, we introduce a novel evaluation metric rooted in graph edit distance, featuring benefits such as permutation invariance and more accurate program equivalence identification. Human evaluations empirically validate the superior efficacy of our proposed metric when particularly assessing LLM-based solutions for LPWP.
Document-level Relation Extraction (RE) aims to extract relation triples from documents. Existing document-RE models typically rely on supervised learning which requires substantial labeled data. To alleviate the amount of human supervision, Self-training (ST) has prospered again in language understanding by augmenting the fine-tuning of big pre-trained models whenever labeled data is insufficient. However, existing ST methods in RE fail to tackle the challenge of long-tail relations. In this work, we propose DuRE, a novel ST framework to tackle these problems. DuRE jointly models RE classification and text generation as a dual process. In this way, our model could construct and utilize both pseudo text generated from given labels and pseudo labels predicted from available unlabeled text, which are gradually refined during the ST phase. We proposed a contrastive loss to leverage the signal of the RE classifier to improve generation quality. In addition, we propose a self-adaptive way to sample pseudo text from different relation classes. Experiments on two document-level RE tasks show that DuRE significantly boosts recall and F1 score with comparable precision, especially for long-tail relations against several strong baselines.
Self-training (ST) has prospered again in language understanding by augmenting the fine-tuning of big pre-trained models when labeled data is insufficient. However, it remains challenging to incorporate ST into attribute-controllable language generation. Augmented only by self-generated pseudo text, generation models over-exploit the previously learned text space and fail to explore a larger one, suffering from a restricted generalization boundary and limited controllability. In this work, we propose DuNST, a novel ST framework to tackle these problems. DuNST jointly models text generation and classification as a dual process and further perturbs and escapes from the collapsed space by adding two kinds of flexible noise. In this way, our model could construct and utilize both pseudo text generated from given labels and pseudo labels predicted from available unlabeled text, which are gradually refined during the ST phase. We theoretically demonstrate that DuNST can be regarded as enhancing the exploration of the potentially larger real text space while maintaining exploitation, guaranteeing improved performance. Experiments on three controllable generation tasks show that DuNST significantly boosts control accuracy with comparable generation fluency and diversity against several strong baselines.
This paper describes our system for SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning. To accomplish this task, we utilize the Knowledge-Enhanced Graph Attention Network (KEGAT) architecture with a novel semantic space transformation strategy. It leverages heterogeneous knowledge to learn adequate evidences, and seeks for an effective semantic space of abstract concepts to better improve the ability of a machine in understanding the abstract meaning of natural language. Experimental results show that our system achieves strong performance on this task in terms of both imperceptibility and nonspecificity.