Improving Unsupervised Relation Extraction by Augmenting Diverse Sentence Pairs

Unsupervised relation extraction (URE) aims to extract relations between named entities from raw text without requiring manual annotations or pre-existing knowledge bases. Recent studies of URE place a notable emphasis on contrastive learning strategies for acquiring relation representations. However, these studies often overlook two important aspects: the inclusion of diverse positive pairs for contrastive learning and the exploration of appropriate loss functions. In this paper, we propose AugURE with both within-sentence pairs augmentation and augmentation through cross-sentence pairs extraction to increase the diversity of positive pairs and strengthen the discriminative power of contrastive learning. We also identify the limitation of noise-contrastive estimation (NCE) loss for relation representation learning and propose to apply margin loss for sentence pairs. Experiments on the NYT-FB and TACRED datasets demonstrate that the proposed relation representation learning, combined with a simple K-Means clustering, achieves state-of-the-art performance.


Introduction
Unsupervised Relation Extraction (URE) aims to extract relations between named entities from raw text in an unsupervised manner, without relying on relations annotated by human experts. URE tasks are proposed to overcome the limitations of traditional Relation Extraction (RE) tasks, which aim to extract pre-defined relations and require a time-consuming and labor-intensive data annotation process.
To discover relational facts without any predefined relation types, URE tasks are often formulated as semantic clustering tasks, where entity pairs with the same relations should be clustered together. Therefore, effectively learning the relation representations is an essential step toward solving URE tasks. Recent studies on relation representation learning (Liu et al., 2021; Zhang et al., 2021; Liu et al., 2022) adopt a contrastive learning process. Contrastive learning aims to bring similar-relation sentences (positive pairs) closer while separating different-relation sentences (negative pairs). There are several ideas for constructing positive pairs, a key step in the contrastive learning process. Some methods propose to augment the entity pairs and relation context of the original sentences as positive examples (Liu et al., 2021, 2022). To avoid similar sentences being pushed apart in the contrastive learning process, Liu et al. (2022) further propose to utilize cluster centroids to generate positive pairs within the clusters. However, these approaches often fail to provide diverse positive pairs for contrastive learning to obtain more effective relation representations.
Another challenge arises from the loss function perspective. The most recent URE work (Liu et al., 2022) directly transfers the Noise-Contrastive Estimation (NCE) loss function from computer vision tasks (He et al., 2020; Caron et al., 2020; Li et al., 2021) and achieves good performance. However, the negative examples used by the NCE loss function are randomly sampled sentences from the same minibatch, which may result in many false negatives, especially when the total number of relational clusters is not large. Moreover, relation semantics should not be regarded as "same/different" in many cases but rather as a similarity spectrum. For example, the relation org:top_members/employees is semantically closer to org:members than to org:founded_by. Therefore, we argue that the binary NCE loss function is poorly suited to relation representation learning in URE tasks, whereas a margin-based loss may fit better.
In this paper, we propose several techniques to augment positive pairs, increasing their diversity and strengthening the discriminative power of contrastive learning. Specifically, we propose within-sentence pairs augmentation and augmentation through cross-sentence pairs extraction. Within-sentence pairs augmentation generates positive pairs from the same sentence using intermediate-word sampling and entity pair replacement techniques. In the augmentation through cross-sentence pairs extraction, we aim to recognize sentences with the same relation semantics to form a more diverse positive sample pool. Specifically, we employ a three-step approach. First, we utilize the Open Information Extraction (OpenIE) model (Angeli et al., 2015) to extract relation templates among all sentences. Intuitively, sentences with the same relation template are likely to express the same relation. To further group the templates that share the same relation semantics, we employ Natural Language Inference (NLI) (Zhuang et al., 2021) to discover mutually entailed templates. Finally, we extract positive pairs from different sentences within the same template groups. To facilitate the extraction of more cross-sentence positive pairs, we further use ChatGPT to rewrite the original sentences. To overcome the limitations of the binary NCE loss function, we propose to use margin loss to capture the spectrum of semantic similarity between positive and negative pairs. In summary, our main contributions are:
• We propose AugURE with both within-sentence pairs augmentation and augmentation through cross-sentence pairs extraction to increase the diversity of positive pairs and enhance the discriminative power of contrastive learning for relation representations.
• We identify the limitation of NCE loss for relation representation learning and propose to apply margin loss for sentence pairs.
• Experiments demonstrate that the combination of our relation representation learning and a straightforward K-Means clustering approach achieves state-of-the-art performance on NYT-FB and TACRED datasets.

Related Work
Relation Extraction is a crucial Natural Language Processing (NLP) task and has been extensively studied in the past years (Miller et al., 1998; Zelenko et al., 2002; Zhong and Chen, 2021; Sui et al., 2023). Although existing RE methods achieve decent performance with manual annotations and KBs, those annotations can be expensive to obtain in practice. In recent years, few-shot RE and zero-shot RE (Sainz et al., 2021; Lu et al., 2022; Zhou et al., 2022) have become popular, as they significantly reduce the human annotation cost. However, few-shot RE and zero-shot RE still require a pre-defined relation label space, which is often unavailable in new datasets or open-domain scenarios.
An upsurge of interest in URE has been witnessed to extract relational facts without predefined relation types and human-annotated data. The task of OpenRE has also attracted considerable attention from researchers. Unlike URE, OpenRE has additional access to existing annotated relations and aims to extract novel relational facts from open-domain corpora. Relation representation learning is a key step for both tasks. Amongst the prior studies, Tran et al. (2020) find that the utilization of named entities alone to induce relation types can surpass the performance of previous methods. This observation underscores the significant inductive bias that entity types offer for URE. Later, Hu et al. (2020) treat the cluster assignments as pseudo-labels and propose a self-supervised learning framework that leverages pre-trained language models (LMs) to learn and refine contextualized entity pair representations. However, the frequently re-initialized linear classification layers can hinder the process of representation learning. Wang et al. (2021) are the first to utilize deep metric learning in the OpenRE task, and this scheme demonstrates good capabilities in representation learning.
Motivated by the achievements of contrastive learning in computer vision tasks (He et al., 2020; Caron et al., 2020; Li et al., 2021), recent state-of-the-art methods have embraced contrastive learning frameworks to acquire relation representations. Zhang et al. (2021) integrate existing hierarchical information into relation representations via a contrastive objective function. Liu et al. (2021) intervene on the named entities and context to capture their underlying causal effects. Both methods adopt instance-wise contrastive learning objectives and exploit base-level sentence differences by increasing the diversity of positive examples. To further capture high-level semantics in the relation features, Liu et al. (2022) introduce a hierarchical exemplar contrastive learning schema to optimize relation representations using both instance-wise and exemplar-wise signals. However, this approach still falls short of providing diverse positive pairs for contrastive learning, limiting the effectiveness of relation representations. Additionally, it employs the NCE loss function without considering its alignment with the task requirements.

URE Task Definition
We formalize the URE task as follows. Let x = [x_1, ..., x_n] denote a sentence, where x_i represents the i-th (1 ≤ i ≤ n) token. In the sentence, a named entity pair (e_h, e_t) is recognized in advance, where e_h = [x_hs, ..., x_he] represents the head entity with start and end indices hs and he, and e_t = [x_ts, ..., x_te] represents the tail entity with start and end indices ts and te. Let D = [(x^1, e_h^1, e_t^1), ..., (x^N, e_h^N, e_t^N)] be a corpus consisting of N sentences along with their target entity pairs. The goal of URE is to partition the N sentences into k clusters, where k is a predefined number of relations in the corpus.
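The problem setup above can be captured in a small data structure. The sketch below is purely illustrative (the class and field names are our own, not from the paper's implementation):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class UREInstance:
    """One corpus entry (x, e_h, e_t): a tokenized sentence with a
    pre-recognized head and tail entity, given as inclusive token spans."""
    tokens: List[str]            # x = [x_1, ..., x_n]
    head_span: Tuple[int, int]   # (hs, he) for e_h
    tail_span: Tuple[int, int]   # (ts, te) for e_t

    @property
    def head_entity(self) -> List[str]:
        hs, he = self.head_span
        return self.tokens[hs:he + 1]

    @property
    def tail_entity(self) -> List[str]:
        ts, te = self.tail_span
        return self.tokens[ts:te + 1]

# A corpus D is a list of N such instances; URE partitions it into k clusters.
inst = UREInstance(
    tokens=["Steve", "Jobs", "founded", "Apple", "."],
    head_span=(0, 1),
    tail_span=(3, 3),
)
```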

Contrastive Learning
Recent studies have demonstrated the effectiveness of contrastive learning strategies for learning relation representations (Liu et al., 2021; Zhang et al., 2021; Liu et al., 2022). In these approaches, positive pairs are generally defined as two sentences with the same relation, and negative pairs are two sentences with different relations. These positive and negative pairs are constructed and utilized in the contrastive objective function to bring together sentences with similar relations while separating sentences with dissimilar relations. Liu et al. (2022) achieve remarkable success by directly adopting the Noise-Contrastive Estimation (NCE) loss function from computer vision tasks for URE's representation learning. The InfoNCE loss function (Oord et al., 2018; He et al., 2020; Li et al., 2021), usually used for instance-wise contrastive learning, is defined as:

L_InfoNCE = −log [ exp(v_i · v'_i / τ) / Σ_{j=1}^{J+1} exp(v_i · v'_j / τ) ],

where v_i and v'_i are positive embeddings for instance i, v'_j ranges over one positive embedding and J negative embeddings from other instances, and τ is a temperature hyper-parameter.
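As a concrete reference, the InfoNCE loss above can be sketched in plain Python for a single anchor instance (a minimal illustration; real implementations operate on batched tensors):

```python
import math

def info_nce(v_i, v_pos, v_negs, tau=0.1):
    """InfoNCE loss for one anchor v_i: v_pos is its positive embedding,
    v_negs are the J negative embeddings from other instances."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    # Exponentiated similarity to the positive and to all candidates.
    pos = math.exp(dot(v_i, v_pos) / tau)
    denom = pos + sum(math.exp(dot(v_i, v_n) / tau) for v_n in v_negs)
    return -math.log(pos / denom)

# A well-aligned positive yields a near-zero loss; a mismatched one does not.
loss_near = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
loss_far = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```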

Methodology
In this section, we provide a comprehensive exploration of the proposed model AugURE. Figure 1 offers an overview of our framework. As mentioned in Section 1, both positive and negative examples are important for relation representation learning in URE tasks. For positive examples, we propose several augmentation techniques to generate more diverse positive pairs. For negative examples, we argue that relation semantics should not be simply viewed as "same/different" but rather as a similarity spectrum. To this end, we propose to apply margin loss, which is also more robust to false negatives.

Relation Encoder
Given a sentence, its corresponding named entities, and entity types, the relation encoder aims to output a vector that depicts the relation between the entities. To highlight the entities of interest, we follow the entity marker tokens presented in Soares et al. (2019) and Xiao et al. (2020) to indicate the beginning and the end of each entity. Moreover, certain relations have been observed to exist between entities of specific types. Thus, entity type information provides important guidance for the URE task, and we also incorporate the entity type within each entity marker.
Finally, to capture the relation contextual information and place more emphasis on the target entity pair, we derive a fixed-length vector to represent the relation expressed in x between the marked entities, where a|b denotes the concatenation of a and b.
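To illustrate the entity marker scheme, here is a minimal sketch of inserting typed marker tokens around the target entities. The marker token format is our own illustrative choice, not necessarily the exact tokens used by Soares et al. (2019) or in our implementation:

```python
def insert_entity_markers(tokens, head_span, tail_span, head_type, tail_type):
    """Wrap the head/tail entities with typed marker tokens, e.g.
    [E1:PERSON] Steve Jobs [/E1] founded [E2:ORG] Apple [/E2] .
    (Illustrative format; assumes non-overlapping entity spans.)"""
    (hs, he), (ts, te) = head_span, tail_span
    out = []
    for i, tok in enumerate(tokens):
        if i == hs:
            out.append(f"[E1:{head_type}]")
        if i == ts:
            out.append(f"[E2:{tail_type}]")
        out.append(tok)
        if i == he:
            out.append("[/E1]")
        if i == te:
            out.append("[/E2]")
    return out

marked = insert_entity_markers(
    ["Steve", "Jobs", "founded", "Apple", "."],
    head_span=(0, 1), tail_span=(3, 3),
    head_type="PERSON", tail_type="ORG",
)
```

The marked sequence is then fed to the encoder, and the marker positions provide the hidden states that are concatenated into the fixed-length relation vector.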

Positive Sample Augmentation
As discussed in Section 1, previous methods fail to provide diverse positive pairs for contrastive learning to obtain more effective relation representations. In this section, we diversify the construction of positive pairs from two perspectives. First, we augment the original sentences to generate positive examples by sampling intermediate words and replacing entity pairs. Second, we extract cross-sentence positive pairs by utilizing an OpenIE model to extract relation templates of the sentences and employing NLI to discover mutually entailed templates.

Within-Sentence Pairs Augmentation
Augmented positive pairs from the same sentence are important for retaining the local smoothness of the contrastive learning model, since the original sentence and its perturbation have similar representations. To increase the diversity of relation feature vectors generated from the same sentence x and thus obtain more positive examples, we perform random sampling on the sentence-encoding portion h of the vector h_v (Liu et al., 2022). Specifically, we use uniform sampling. The sentences are encoded first, and the sampling is conducted on the encodings of context words to obtain variations of the context encoding. To moderately retain relation context information, during the random sampling we only sample the non-stop words between the target entity pair (if there are fewer than m intermediate non-stop words, we sample the remaining words from other parts of the sentence). Denoting the positions of m non-duplicated randomly sampled non-stop words as p_1, ..., p_m, we obtain a fixed-length relation feature vector h_r. By sampling h_v twice, we obtain h_r and another relation feature vector h'_r, where <h_r, h'_r> forms a positive pair. Because h_r and h'_r contain the same entity pair, the representation learning may rely only on entity names and ignore the context information, limiting the representations' generalizability. To mitigate this issue and encourage the model to focus more on relation information, we propose to augment entity pairs. Because we only modify the head and tail entities, the relational context of the original sentence x and the synthetic sentence x'' remains the same. These two sentences should have the same relation, and the corresponding relation feature vectors should form a positive pair.
During the entity pair augmentation, the original e_h and e_t are replaced with another pair of entities e''_h and e''_t, where e_h and e''_h have the same type, as do e_t and e''_t. The synthetic sentence x'' yields a relation feature vector h''_r, where <h_r, h''_r> forms a positive pair.
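The two within-sentence augmentations can be sketched as follows. This is a simplified token-level illustration (the actual method samples positions in the encoding space; the stop-word list and helper names here are our own assumptions):

```python
import random

STOP_WORDS = {"the", "a", "an", "of", "in", "is", "was", "for", "to"}

def sample_intermediate_words(tokens, head_span, tail_span, m, rng):
    """Sample m distinct non-stop-word positions between the entities;
    fall back to the rest of the sentence if fewer than m exist."""
    lo, hi = head_span[1] + 1, tail_span[0]
    between = [i for i in range(lo, hi) if tokens[i].lower() not in STOP_WORDS]
    if len(between) >= m:
        return sorted(rng.sample(between, m))
    entity_positions = set(range(head_span[0], head_span[1] + 1)) | \
                       set(range(tail_span[0], tail_span[1] + 1))
    rest = [i for i in range(len(tokens))
            if i not in between and i not in entity_positions
            and tokens[i].lower() not in STOP_WORDS]
    return sorted(between + rng.sample(rest, m - len(between)))

def replace_entity_pair(tokens, head_span, tail_span, new_head, new_tail):
    """Swap in a same-typed entity pair; the relational context is unchanged."""
    (hs, he), (ts, te) = head_span, tail_span
    assert he < ts, "this sketch assumes the head precedes the tail"
    return tokens[:hs] + new_head + tokens[he + 1:ts] + new_tail + tokens[te + 1:]

rng = random.Random(0)
tokens = ["Steve", "Jobs", "founded", "Apple", "in", "1976", "."]
positions = sample_intermediate_words(tokens, (0, 1), (3, 3), m=1, rng=rng)
augmented = replace_entity_pair(tokens, (0, 1), (3, 3),
                                ["Bill", "Gates"], ["Microsoft"])
```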

Augmentation through Cross-Sentence Pairs Extraction
Leveraging only the original sentences and their perturbations, the learning process constructs an embedding space that primarily preserves the local smoothness around each instance while largely ignoring the diverse global information in the dataset (Li et al., 2021). Thus, relying solely on the instance-wise augmentations in Section 4.2.1 is insufficient, and extracting additional high-quality cross-sentence positive pairs is essential. We base the extraction of cross-sentence positive pairs on the following assumption: if the target entity pairs of two sentences share the same relation textual pattern (which we call a relation template), the two entity pairs share the same underlying relation, and the corresponding relation feature vectors can be regarded as a positive pair.
To extract relation templates from the corpus without requiring any human input, we first employ an Open Information Extraction (OpenIE) (Banko et al., 2007) tool to transform the sentences into structured formats. The OpenIE model receives the raw text of a sentence as input and outputs several (subject, predicate, object) triples. Specifically, we utilize the Stanford OpenIE model (Angeli et al., 2015). For each sentence, we keep one valid triple if the predicate contains at least one non-stop word and the subject and object exactly match the given head and tail entities, respectively. From all valid extracted triples in the corpus, we count the frequency of distinct predicates and take as high-quality relation templates those predicates whose frequency exceeds a threshold t. Finally, we enumerate the relation feature vectors of all pairs of sentences that share the same relation template as positive pairs.
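The template extraction and positive-pair enumeration steps can be sketched as below, assuming the OpenIE triples have already been extracted and filtered as described (the function and variable names are illustrative; the real pipeline uses Stanford OpenIE output):

```python
from collections import Counter, defaultdict
from itertools import combinations

def extract_template_pairs(triples, threshold):
    """triples: list of (sentence_id, subject, predicate, object), already
    filtered so the predicate has a non-stop word and subject/object match
    the target entities. Predicates with frequency > threshold become
    relation templates; sentences sharing a template form positive pairs."""
    freq = Counter(pred for _, _, pred, _ in triples)
    templates = {p for p, c in freq.items() if c > threshold}
    by_template = defaultdict(list)
    for sid, _, pred, _ in triples:
        if pred in templates:
            by_template[pred].append(sid)
    pairs = []
    for sids in by_template.values():
        pairs.extend(combinations(sids, 2))  # all same-template sentence pairs
    return templates, pairs

triples = [
    (0, "Tim Cook", "is CEO of", "Apple"),
    (1, "Satya Nadella", "is CEO of", "Microsoft"),
    (2, "Obama", "visited", "Japan"),
]
templates, pairs = extract_template_pairs(triples, threshold=1)
```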
Extracting same-template sentence pairs can capture global information and diversify positive pairs, but it may be too strict, since different relation templates can express the same semantic meaning. For instance, "is CEO of" and "is the chief executive officer of" are two distinct relation templates but convey the same relation.
Inspired by the recent success of Natural Language Inference (NLI) in providing indirect supervision for RE tasks (Sainz et al., 2021; Zhou et al., 2022), we further propose to employ an off-the-shelf NLI model (Zhuang et al., 2021) to discover templates that express the same semantic relations. NLI, also known as Textual Entailment, aims to determine whether a "hypothesis" sentence is true (entailment), false (contradiction), or undetermined (neutral) given a "premise" sentence. In this task, we aim to discover relation templates tp_1 and tp_2, both of which are verb phrases, that are mutually entailed according to NLI.
To apply the NLI model, we construct hypothesis and premise sentences using relation templates and their covered sentences, respectively. Let the set of sentences whose relation template is tp_1 be denoted S_1. To eliminate the disturbance of specific named entities, we generate a new set S'_1 as premises by replacing the head and tail entities of sentences in S_1 with special tokens [h] and [t], respectively. We then synthesize a sentence hypo_2 of the form "[h] tp_2 [t]" as a hypothesis, composed of the relation template tp_2 and the two special tokens [h] and [t]. We say tp_1 entails tp_2 if at least a ratio r of the sentences in S'_1 entail the synthetic sentence hypo_2. NLI predicts a directional relation between two templates. To find mutually entailed relation templates, we conduct the above operations in both directions for each pair of relation templates. Finally, the relation feature vectors of all pairs of sentences whose relation templates are mutually entailed are enumerated as positive pairs.

The recent development of ChatGPT and its summarization power provide the possibility of capturing higher-level relations of the sentences, which can further assist in discovering more cross-sentence positive pairs. We design a simple prompt "Given the context x, what is the relationship between e_h and e_t (as short as possible)?" to rewrite the original sentence x with the target entities e_h and e_t. Given the prompt for sentence x, ChatGPT generates an output, which we denote as the rewritten sentence x_c. Note that the rewritten sentence x_c is semantically similar to the original sentence x but may not have the same level of relation granularity as x. For example, an original sentence "Paul ... is a senior research associate ... Library" can be rewritten as "Paul ... works for ... Library".
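The template-level mutual-entailment check described above can be sketched as follows. The `toy_nli` stub stands in for the off-the-shelf NLI model and is purely illustrative:

```python
def entails(template_a, template_b, premises_a, nli_predict, ratio=0.5):
    """template_a entails template_b if at least `ratio` of the entity-masked
    premises covered by template_a entail the synthetic hypothesis
    "[h] template_b [t]". `nli_predict(premise, hypothesis)` stands in for
    the NLI model and returns "entailment", "contradiction", or "neutral"."""
    hypothesis = f"[h] {template_b} [t]"
    votes = sum(1 for p in premises_a
                if nli_predict(p, hypothesis) == "entailment")
    return votes / len(premises_a) >= ratio

def mutually_entailed(tp1, tp2, premises1, premises2, nli_predict, ratio=0.5):
    # NLI is directional, so the check runs in both directions.
    return (entails(tp1, tp2, premises1, nli_predict, ratio)
            and entails(tp2, tp1, premises2, nli_predict, ratio))

# A toy stand-in for the NLI model, for illustration only.
PARAPHRASES = {"is CEO of", "is the chief executive officer of"}

def toy_nli(premise, hypothesis):
    def hit(s):
        return any(p in s for p in PARAPHRASES)
    return "entailment" if hit(premise) and hit(hypothesis) else "neutral"
```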
To avoid disrupting the original corpus's data distribution, we do not use the rewritten-sentence positive pairs directly in training. Instead, we repeat the above template extraction and NLI template-grouping process on the rewritten sentences and then identify sentences from the same template groups. The corresponding original sentence pairs can then be used as additional positive pairs. Note that the ChatGPT-discovered cross-sentence positive pairs can be numerous and may contain a higher level of noise. To address this, we randomly sample a subset of these pairs before incorporating them into the training process.

Loss Function
For simplicity, let P_w denote the set of within-sentence positive pairs, P_c denote the set of cross-sentence positive pairs, and P_Pair = P_w ∪ P_c = [(a_r1, p_r1), ..., (a_rM, p_rM)] be the set of relation vectors of M positive pairs from the training corpus, where (a_ri, p_ri) denotes the pair of relation vectors of the i-th positive sample. Liu et al. (2022) randomly sample sentences from the same minibatch as the negative samples used by the NCE loss. However, this approach comes with potential issues. When the total number of relational clusters is small, it is likely to yield a substantial number of false negatives. Furthermore, relation semantics should not be regarded as "same/different" in many cases but rather as a similarity spectrum. Between two pairs of semantically distinct relations, one pair may exhibit a higher degree of semantic proximity than the other. Hence, we believe the NCE loss function is ill-suited to relation representation learning.
We propose to apply margin loss for sentence pairs.Margin loss is more robust to noise in the training data, especially the false negative issues.Moreover, it captures the spectrum property of the differences between positive and negative pairs instead of treating them solely as binary distinctions.
The loss function for the set of sentence pairs P_Pair is defined as:

L_Pair = Σ_{i=1}^{M} max(0, dist(a_ri, p_ri) − dist(a_ri, n_ri) + γ),

where dist(·) denotes the cosine distance function, n_ri is a randomly sampled negative example for a_ri, and the parameter γ, referred to as the margin, is a hyperparameter.
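A minimal sketch of this pair margin loss, using plain Python lists in place of tensor operations (here averaged over the M pairs, which is equivalent to the summed form up to scaling):

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def pair_margin_loss(anchors, positives, negatives, gamma=0.5):
    """Triplet-style margin loss over M positive pairs: each anchor a_ri
    should be at least gamma closer (in cosine distance) to its positive
    p_ri than to a randomly sampled negative n_ri."""
    total = 0.0
    for a, p, n in zip(anchors, positives, negatives):
        total += max(0.0, cosine_distance(a, p) - cosine_distance(a, n) + gamma)
    return total / len(anchors)

# A satisfied triplet contributes zero loss; a violated one does not.
loss_ok = pair_margin_loss([[1.0, 0.0]], [[1.0, 0.0]], [[0.0, 1.0]])
loss_bad = pair_margin_loss([[1.0, 0.0]], [[0.0, 1.0]], [[1.0, 0.0]])
```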
For the K-Means exemplar, our objective is to encourage sentences belonging to the current cluster to move closer to their centroid, while pushing sentences outside the cluster away from it. Since a sentence either belongs to a cluster or not, this is a binary relation, and we use the same Exem loss as Liu et al. (2022), where j ∈ [1, c_l] indexes the j-th cluster at granularity layer l, e_j^l is the relation feature vector of the exemplar of instance i at layer l, and τ is a temperature hyperparameter (Wu et al., 2018).
Our overall loss function is the sum of the Pair loss and the Exem loss: L = L_Pair + L_Exem. During inference, we encode each sentence using the trained model to obtain its representation. Subsequently, the K-Means clustering algorithm is applied to cluster the relation representations, enabling us to predict the relation for each sentence based on the clustering results.
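The inference step thus reduces to running K-Means over the encoded relation representations. Below is a self-contained sketch with a plain Lloyd's K-Means (real experiments would use an off-the-shelf implementation; the toy vectors stand in for encoder outputs):

```python
import random

def kmeans(vectors, k, iters=50, seed=0):
    """Plain Lloyd's K-Means (Euclidean). At inference time, each sentence's
    relation representation is assigned to one of k relation clusters."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        for i, v in enumerate(vectors):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: centroid becomes the mean of its members
        # (an empty cluster keeps its old centroid).
        for c in range(k):
            members = [vectors[i] for i in range(len(vectors)) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

# Two well-separated groups of toy "relation representations".
reps = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels = kmeans(reps, k=2)
```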

Datasets
Following Liu et al. (2022), we adopt the NYT-FB and TACRED (Zhang et al., 2017) datasets to train and evaluate our model. NYT-FB is a dataset generated via distant supervision that combines information from two sources: The New York Times (NYT) articles and the Freebase knowledge graph (Bollacker et al., 2008). We filter out sentences with non-binary relations, following Hu et al. (2020), Tran et al. (2020), and Liu et al. (2022). Tran et al. (2020) raise two additional concerns that emerge when employing the NYT-FB dataset. First, the development and test sets of the NYT-FB dataset consist entirely of aligned sentences without any human curation and contain a large number of noisy labels; in particular, about 40 out of 100 randomly sampled sentences were given incorrect relation labels. Second, the development and test sets are part of the training set, making it challenging to accurately evaluate a model's performance on unseen sentences, even under the URE setting. Hence, supplementary evaluations are carried out on the TACRED dataset, a large-scale supervised relation extraction dataset built via crowdsourcing. By conducting additional evaluations on the TACRED dataset, we can attain a more precise understanding of the performance of all the models.

Baselines
We compare the proposed model with the following baseline methods: 1) EType (Tran et al., 2020) is a straightforward relation classifier that employs a one-layer feed-forward network. It uses entity type combinations as input to infer relation types.
2) SelfORE (Hu et al., 2020) leverages weak, self-supervised signals from BERT for adaptive clustering and bootstraps these signals by improving contextualized features in relation classification. 3) MORE (Wang et al., 2021) utilizes deep metric learning to extract rich supervision signals from labeled data and drive the neural model to learn relational representations directly. 4) OHRE (Zhang et al., 2021) effectively integrates hierarchy information into relation representations via a dynamic hierarchical triplet objective and a hierarchical curriculum training paradigm. 5) EIURE (Liu et al., 2021) intervenes on both the entities and context to uncover their underlying causal effects. 6) HiURE (Liu et al., 2022) is the state-of-the-art method that derives hierarchical signals from the relational feature space and effectively optimizes relation representations through exemplar-wise contrastive learning. To ensure fair comparisons, we use the HiURE results with the K-Means clustering algorithm as reported in the original paper.

Evaluation Metrics
Following previous work, we use B³ (Bagga and Baldwin, 1998), V-measure (Rosenberg and Hirschberg, 2007), and Adjusted Rand Index (ARI) (Hubert and Arabie, 1985) to evaluate the effectiveness of different methods. For all these metrics, higher is better.
B³ precision and recall measure the quality and coverage of relation clustering, respectively. The B³ F1 score is computed to provide a balanced evaluation of clustering performance.
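For reference, B³ precision, recall, and F1 can be computed directly from predicted and gold cluster assignments. The sketch below is an O(n²) illustrative implementation, not an optimized evaluation script:

```python
def b_cubed(pred, gold):
    """B^3 scores: for each sentence, precision is the fraction of its
    predicted cluster sharing its gold relation, and recall is the fraction
    of its gold class recovered in its predicted cluster; both are averaged
    over all sentences, and F1 is their harmonic mean."""
    n = len(pred)
    prec = rec = 0.0
    for i in range(n):
        same_pred = {j for j in range(n) if pred[j] == pred[i]}
        same_gold = {j for j in range(n) if gold[j] == gold[i]}
        correct = len(same_pred & same_gold)
        prec += correct / len(same_pred)
        rec += correct / len(same_gold)
    prec, rec = prec / n, rec / n
    f1 = 2 * prec * rec / (prec + rec)
    return prec, rec, f1
```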
V-measure is another evaluation metric commonly used for assessing clustering quality. While B³ treats each sentence individually, V-measure considers both intra-cluster homogeneity and inter-cluster completeness, providing a more comprehensive evaluation by considering the entire clustering structure.
Adjusted Rand Index (ARI) measures the degree of agreement between the clusters assigned by a model and the ground truth clusters. It ranges in [−1, 1], where a value close to 1 indicates strong agreement, 0 represents a random clustering, and negative values indicate disagreement.

Setup
To primarily show the enhancement of relation representations, we choose the simple K-Means as the downstream clustering algorithm. Following previous work (Simon et al., 2019; Hu et al., 2020; Liu et al., 2021, 2022), the number of clusters k is set to 10. Although the ground truth relations have more than 10 classes, this choice still provides valuable insights into the models' capability to handle skewed distributions. Moreover, it enables fair comparisons with the baseline results. More implementation details can be found in Appendix A.

Main Results
We conduct experiments on both the NYT-FB and TACRED datasets, reporting the average performance and standard deviation over three random runs. The main results are shown in Table 1. We observe that the proposed AugURE consistently outperforms all baseline models on the B³ F1, V-measure F1, and ARI metrics. On the NYT-FB dataset, AugURE on average improves B³ F1 by 0.6%, V-measure F1 by 4.3%, and ARI by 2.8% over the runner-up results; on the TACRED dataset, the improvements are more evident, with B³ F1, V-measure F1, and ARI increased by 2.4%, 1.9%, and 12.4%, respectively, over the runner-up results.

Ablation Study
To investigate the contribution of different components, we further conduct an ablation study by systematically excluding or replacing specific components, and the results are shown in Table 2.

Effectiveness of Within-Sentence Pairs Augmentation
To assess the impact of the within-sentence pairs augmentation, two experiments are conducted: 1) removing entity-augmented positive pairs from the training set (referred to as "− Entity Augmentation"); 2) removing the entire set of within-sentence positive pairs P_w from the training set (referred to as "− Within-Sentence Pairs"). The results reveal that removing entity-augmented positive pairs leads to notable decreases in all metrics. This indicates that without entity augmentation, the learned relation representations may rely more on matching entity names than on context information, limiting generalizability.
Interestingly, removing the set P_w leads to a performance decrease on the TACRED dataset but has no apparent influence on the NYT-FB dataset. These findings show that the proposed AugURE can learn sufficient information from the diverse cross-sentence pairs, highlighting their crucial role in enhancing the model's performance.

Effectiveness of Augmentation through Cross-Sentence Pairs

To evaluate the impact of augmentation through cross-sentence pairs, two experiments are conducted. First, we exclude positive pairs extracted under ChatGPT's guidance to examine their impact (referred to as "− ChatGPT Pairs"). Then, the entire set of cross-sentence positive pairs P_c is excluded (referred to as "− Cross-Sentence Pairs"). The results demonstrate that the impact of ChatGPT can vary across datasets, indicating a dataset-dependent nature of its influence. This observation is reasonable, since the granularity of relations can differ across datasets, resulting in varying gaps between the original and rewritten sentences.
Excluding the set P_c leads to significant declines in overall performance across all metrics on both datasets. This finding highlights the effectiveness of the proposed augmentation through cross-sentence pairs in enhancing diversity and attaining superior results.

Margin Loss vs. NCE Loss
To study the advantage of margin loss over NCE loss, we replace the within-sentence positive pairs' margin loss L_w in Eq. (10) with the NCE loss defined in Eq. (1). As shown in Table 2, replacing margin loss with NCE loss leads to notable decreases in all metrics on both datasets. This result implies that the NCE loss function is poorly suited to relation representation learning in URE tasks, highlighting the contribution of margin loss to the overall performance.

Evaluations of the OpenIE and NLI Performance
We also evaluate the OpenIE and NLI performance on the NYT-FB dataset. Results are summarized in Table 3 and Table 4, with further analyses presented in the subsequent paragraphs.
Regarding accuracy, among the 54,457 same-template pairs, 73.2% are correct (i.e., belong to the same relation according to the ground truth labels). We randomly sampled and examined 30 of the 'incorrect' pairs and found that 90.0% of them convey relationships similar enough to be regarded as positive pairs for the URE task. Among the 216 mutually entailed pairs based on NLI predictions, 63.4% are correct. We also randomly sampled 30 of the 'incorrect' pairs and found that 76.7% of them can be viewed as positive pairs. In terms of coverage, the total number of extracted relation templates is 1210, and each relation template consists of 2.75 sentences on average. Only 33.2% of the sentences are covered by the extracted templates; that is, 66.8% of the sentences do not follow any relation template. Furthermore, the total number of extracted templates is much larger than the number of target clusters for the URE task.
Consequently, the OpenIE and NLI method has high precision and is well suited to generating high-quality positive pairs that enhance the contrastive learning process. However, it also suffers from low recall, which makes it unsuitable to serve directly as a baseline for the URE task.

Conclusion
In conclusion, this paper offers an exploration of URE from the perspective of relation representation learning. We introduce two novel augmentation techniques: within-sentence pairs augmentation and augmentation through cross-sentence pairs extraction, both of which aim to augment positive pairs, enhancing their diversity and bolstering the discriminative capabilities of contrastive learning. We also identify the limitations of NCE loss in URE tasks and propose margin loss for sentence pairs. Experiments on two datasets show the superiority of the proposed AugURE over competitive baselines, highlighting the efficacy of the augmentation techniques and the significance of employing margin loss. Overall, our findings advance the understanding of relation representation learning and provide valuable insights for future research and practical applications.

Limitations

Limitation of K-Means Clustering Algorithm
The proposed AugURE adopts the K-Means clustering algorithm, which requires a pre-defined number of relations k as input. However, determining the appropriate number can be challenging and subjective. Moreover, K-Means assumes that clusters are spherical and of similar size. This assumption may not hold for datasets with skewed distributions, leading to suboptimal clustering results.

Limitation of Relation Template Extraction
In the presence of skewed relation distributions, the relation template extraction process in the proposed AugURE may be inherently biased towards larger and more frequently occurring relation types. This bias may result in decreased performance on long-tail relations.

Limitation of ChatGPT Prompt
ChatGPT is known to be sensitive to various prompts.In this work, our focus is primarily on the design of a functional ChatGPT prompt.We did not explore other strategies for sentence rewriting that may preserve the same level of granularity as the original sentence.

Ethics Statement
We comply with the EMNLP Code of Ethics.
In this work, we mainly focus on developing effective relation representations for URE tasks. For relation clustering, we adopt a simple K-Means clustering algorithm. Following the idea of Liu et al. (2022), we implement cluster centroids of different k values as different granularities of relational exemplars. These exemplars fluctuate in accordance with the parameter updates of the relation encoder in each training epoch.

Table 1: Performance of all methods on the NYT-FB and TACRED datasets.

Table 2: Ablation study results on the NYT-FB and TACRED datasets.

Table 3: The accuracy of OpenIE- and NLI-generated pairs on the NYT-FB dataset.

Table 4: The coverage of OpenIE- and NLI-extracted templates on the NYT-FB dataset.