Great Service! Fine-grained Parsing of Implicit Arguments

Broad-coverage meaning representations in NLP mostly focus on explicitly expressed content. Moreover, the scarcity of datasets annotating diverse implicit roles limits empirical studies of their linguistic nuances. For example, in the web review "Great service!", the provider and consumer are implicit arguments of different types. We examine an annotated corpus of fine-grained implicit arguments (Cui and Hershcovich, 2020) by carefully re-annotating it, resolving several inconsistencies. Subsequently, we present the first transition-based neural parser that can handle implicit arguments dynamically, and experiment with two different transition systems on the improved dataset. We find that certain types of implicit arguments are more difficult to parse than others, and that the simpler system is more accurate in recovering implicit arguments despite a lower overall parsing score, attesting to current reasoning limitations of NLP models. This work will facilitate a better understanding of implicit and underspecified language by incorporating it holistically into meaning representations.


Introduction
Studies of form and meaning, the dual perspectives of the linguistic sign, can be traced back to the beginnings of modern linguistics (de Saussure, 1916, 1978). They remain relevant to modern NLP, as even current large neural language models cannot intrinsically achieve a human-analogous understanding of natural language (Žabokrtskỳ et al., 2020; Bender and Koller, 2020).
In recent years, a few studies have been dedicated to annotating and modelling implicit and underspecified language (Roesiger et al., 2018; Elazar and Goldberg, 2019; McMahan and Stone, 2020). For example, in the online review "there is no delivery", two of the omitted elements are the business agent and the delivered items. However, previous work focuses on specific phenomena requiring linguistic and collaborative reasoning, e.g., bridging resolution, numeric fused-heads identification and referential communication. Such datasets overexpose models to limited linguistic expressions, without leveraging the complete syntactic and semantic features of the context, regardless of the diversity of implicit roles.
O'Gorman (2019) and Cui and Hershcovich (2020) present fine-grained implicit role typologies, incorporating them into the meaning representation frameworks AMR and UCCA, respectively. They lay a foundation for interpreting the idiosyncratic behaviour of implicit arguments from a linguistic and cognitive perspective. Nevertheless, neither provides a dataset ready for computational studies of such implicit arguments.
We take the latter as a starting point, addressing several theoretical inconsistencies, and evaluate its applicability by carefully re-annotating their pilot dataset and reporting inter-annotator agreement. Our categorisation set, consisting of six implicit role types, is compatible with UCCA's semantic notion of Scene rather than with specific linguistic phenomena. Furthermore, as opposed to previous work, it tackles only essential implicit arguments, salient in cognitive processing.
We design the first semantic parser, with two different transition systems, that is able to parse fine-grained implicit arguments for meaning representations, and evaluate its performance on the revisited dataset. To conclude, we reflect on this work's objectives and the challenges to face in future research on implicit arguments.

Revisiting Implicit Argument Refinement
Universal Conceptual Cognitive Annotation (UCCA; Abend and Rappoport, 2013), a typologically-motivated meaning representation framework, has been targeted by several parsing shared tasks (Hershcovich et al., 2019b; Oepen et al., 2019, 2020). It uses directed acyclic graphs (DAGs) anchored in surface tokens, where labelled edges represent semantic relations. In the foundational layer, these are based on the notion of Scenes (States and Processes), their Participants and modifiers. UCCA distinguishes primary edges, corresponding to explicit content, from remote edges, allowing cross-Scene relations. Additionally, Implicit units represent entities of importance to the interpretation of a Scene that are not explicitly anchored in the text. Several refinement layers have been proposed beyond the foundational layer, adding distinctions important for semantic roles and coreference (Shalev et al., 2019; Prange et al., 2019).

Fine-grained Implicit Argument Refinement for UCCA

Cui and Hershcovich (2020) proposed a fine-grained implicit argument typology implemented as a refinement layer of Participants for UCCA, centred around the semantic notion of Scene. Their proposed categorisation set, consisting of six types listed in Table 1, is argued to have low annotation complexity and ambiguity, thus requiring a lower cognitive load for annotation than other fine-grained implicit argument typologies (O'Gorman, 2019).
For example, the online review "Great service and awesome price!" is annotated as follows and visualised in Figure 1. There are two Scenes invoked in the sentence: "Great service" with "service" as a Process, and "awesome price" with "awesome" as a State. Who is serving, who is being served and what is priced are distinct implicit arguments (IMP), which require reasoning to resolve.
Genre-based roles refer to conventional omission in the genre (Ruppenhofer and Michaelis, 2010). The corpus is based on online reviews, where reviewers typically do not mention what is under review. Therefore, an implicit argument in each Scene is marked as Genre-based, referring to the reviewee. Generic roles denote "people in general" (Lambrecht and Lemoine, 2005). In the example, the recipient could be anyone, rather than a specific person.

Revisiting Inconsistencies
Despite the general soundness of Cui and Hershcovich (2020)'s typology, we find multiple inconsistencies in it, most prominently the treatment of Process and State Scenes and of nominalisation, along with some borderline cases. We propose to revisit these cases and introduce consistent implicit argument refinement guidelines.

State Scenes and Process Scenes
UCCA differentiates Scenes according to their persistency in time (Abend and Rappoport, 2013): stative Scenes are temporally persistent states, while processual Scenes are evolving events. Cui and Hershcovich (2020) did not annotate implicit arguments in State Scenes, although the two Scene types are essentially similar. We therefore categorise them using the same guidelines. This phenomenon is rather pervasive in the dataset, as in Examples 1 and 2.

The Definition of Prominent Elements
While UCCA only annotates implicit units when they are "prominent in the interpretation of the Scene" (Abend and Rappoport, 2013), it is not always clear what should be regarded as such: the level of uniqueness plays an essential role in recognising the prominence of an argument. For example, even distinguishing definite and indefinite articles may pose a challenge to semantic analysis (Carlson and Sussman, 2005). There are some linguistic tests for determining whether an argument is semantically mandatory (Goldberg, 2001; Hajič et al., 2012). As for UCCA, we posit that time, location and instrument modifiers, as a rule, are ubiquitous to such an extent that no implicit argument should be annotated unless a Process or State warrants it. In Example 3 (visualised as the upper graph in Figure 2a), in the Scene "you leave", the departing action demands that the implicit source location is vital for understanding, unlike the location elements in Examples 1 and 2, which are of low prominence.

Nominalisation as Agent Nouns
An agent noun is derived from a word denoting an action and identifies an entity that performs that action. In UCCA, such an expression is annotated as a single unit with both the Process and Participant categories. As in Example 3, "a real mechanic" is marked P/A. As with time or location, it is subjective whether and how many implicit arguments should be annotated, given the possible list of involved receivers, instruments, etc. For a profession like "mechanic", whatever is being repaired, as well as the repair tools, can be omitted. While an agent noun can invoke a Scene, it could also simply serve as a title. Such cases, e.g. "doctor", "professor" and "chairman", are common. To facilitate consistent annotation, we decide never to annotate implicit Participants in Scenes licensed by agent nouns. This means that there is just one Participant, the person themselves.

Indefinite Deictic Participant
Deictic arguments refer to the implicit speaker or addressee. They may be confused with Generic roles when it is ambiguous whether the speaker or addressee participates in the Scene. As a rule, we posit that an implicit Participant ought to be Generic unless it evidently refers to the speaker or addressee. Example 4 shows an ambiguous case.

Fine-grained Implicit Argument Corpus
As a pilot annotation experiment, Cui and Hershcovich (2020) reviewed and refined 116 randomly selected passages from a UCCA-annotated dataset of web reviews (UCCA EWT; Hershcovich et al., 2019a). The dataset was annotated by only one annotator. With our revisited guidelines, we ask two annotators to revisit the original dataset, adding or modifying implicit arguments and subsequently refining their categories, using UCCAApp (Abend et al., 2017).

Evaluation of Implicit Argument Annotation

Standard UCCA evaluation compares two graphs (e.g., created by different annotators, or one being the gold annotation and the other predicted by a parser), providing an F1 score by matching the edges by their terminal span and category (Hershcovich et al., 2019b). However, the standard evaluation completely ignores implicit argument annotation. To quantify inter-annotator agreement and later parser performance (§6), we provide an evaluation metric taking these units into account.
To compare a graph G1 with implicit units I1 to a graph G2 with implicit units I2 over the same sequence of terminals W = w1, . . ., wn, for each implicit node i we identify its parent node p(i), denoting the set of terminals spanned by it as the yield y(p(i)) ⊆ W, and its category as the label ℓ(p(i), i). Let P1 and P2 be the sets of parents of implicit units in G1 and G2, and for a parent p let L(p) be the multiset of labels of its implicit children. We define the set of mutual implicit units between G1 and G2 as the pairs (p1, p2) ∈ P1 × P2 with y(p1) = y(p2) and L(p1) = L(p2). The F-score is the harmonic mean of labelled precision and recall, defined by dividing the number of mutual implicit units by |P2| and |P1|, respectively. It is worth noting that in labelled evaluation we require a full match of the two sets of labels/categories rather than just intersection, which suffices in standard UCCA evaluation (of non-implicit units). That is, if there are two or more implicit nodes under one parent, it is only considered a correct match when both the numbers of implicit nodes and their labels are equal. We also introduce unlabelled evaluation, which only requires that the parents' spans match.
For example, in Figure 2a, the reference graph has two implicit arguments. The first implicit unit's parent spans {have, a, real, mechanic, check} and the second's spans {you, leave}. In the predicted graph, two implicit units are predicted. The first implicit unit's parent spans {a, real, mechanic}, while the second spans the same terminals as in the reference graph. We can see that the spans of the first implicit unit do not match, and the second matches but with the wrong label. Therefore, in labelled and unlabelled implicit unit evaluation, the precision, recall and F1 scores are all 0 and 0.5, respectively. In Figure 2b, the reference graph has three implicit units, labelled Non-specific, Generic and Genre-based. One spans {The, service, is, over-rated} while two span {The, service}. Although the predicted graph has two implicit units with correct labels (Non-specific and Generic), it misses one of the two implicit units under the second parent span, so the two label sets merely intersect but are not equal, and the parent does not count as matched. Therefore, precision, recall and F1 for labelled evaluation are all 0.5, and for unlabelled evaluation all 1.
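The metric can be sketched as follows, under the assumption that each graph's implicit units are given as (parent-span, label) pairs (a hypothetical input format; the actual annotation consists of UCCA graphs). Matching is per parent: unlabelled matching requires equal parent spans, and labelled matching additionally requires the full multiset of implicit labels under the parent to be equal, not merely to intersect.

```python
from collections import Counter

def implicit_f1(gold, pred, labelled=True):
    """Per-parent implicit unit matching, as described in §3.1 (sketch)."""
    def by_parent(units):
        groups = {}
        for span, label in units:
            groups.setdefault(span, Counter())[label] += 1
        return groups

    g, p = by_parent(gold), by_parent(pred)
    if labelled:
        # full multiset of implicit labels under the parent must be equal
        matched = sum(1 for span, labels in g.items() if p.get(span) == labels)
    else:
        # only the parents' spans need to match
        matched = sum(1 for span in g if span in p)
    precision = matched / len(p) if p else 0.0
    recall = matched / len(g) if g else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# The Figure 2b walkthrough: gold has Non-specific under one parent and
# Generic + Genre-based under another; the prediction recovers only
# Non-specific and Generic.
a = frozenset({"The", "service", "is", "over-rated"})
b = frozenset({"The", "service"})
gold = [(a, "Non-specific"), (b, "Generic"), (b, "Genre-based")]
pred = [(a, "Non-specific"), (b, "Generic")]
print(implicit_f1(gold, pred, labelled=True))   # (0.5, 0.5, 0.5)
print(implicit_f1(gold, pred, labelled=False))  # (1.0, 1.0, 1.0)
```

The example reproduces the scores discussed above: labelled evaluation yields 0.5 throughout, while unlabelled evaluation yields 1.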

Inter-annotator Agreement
The two annotators separately reviewed and refined 15 of the 116 passages (taken from the original test split). Using the evaluation metric proposed in §3.1, the labelled and unlabelled F1 scores are 73.8% and 91.3%, respectively (see Appendix A). The annotators have a Cohen's κ (Cohen, 1960) of 69.3% on the six-type fine-grained classification of the implicit arguments whose parents' spans match. For comparison, on the FiGref scheme, O'Gorman (2019) reports a Cohen's κ of 55.2% on a 14-way classification and of 58.1% on a four-way classification. Gerber and Chai (2010) proposed a related task of annotating implicit arguments for instances of nominal predicates in sentences, with a Cohen's κ of 67%. While we maintain a comprehensive fine-grained typology, we still see an improvement in agreement over other corpora.
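Cohen's κ here is computed over the fine-grained labels of implicit arguments whose parents' spans match. A minimal sketch of the statistic, with made-up label sequences for illustration:

```python
def cohens_kappa(ann1, ann2):
    # Agreement above chance: (p_o - p_e) / (1 - p_e)  (Cohen, 1960).
    n = len(ann1)
    p_o = sum(x == y for x, y in zip(ann1, ann2)) / n        # observed agreement
    labels = set(ann1) | set(ann2)
    p_e = sum((ann1.count(l) / n) * (ann2.count(l) / n)      # chance agreement
              for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical labels by two annotators over four matched implicit units:
ann1 = ["Deictic", "Deictic", "Generic", "Generic"]
ann2 = ["Deictic", "Deictic", "Generic", "Deictic"]
print(cohens_kappa(ann1, ann2))  # 0.5
```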

Statistics of Revisited Implicit Corpus
Finally, one annotator reviewed and refined all 116 passages, and the second reviewed their annotation after completion. The full revisited dataset contains 393 sentences, 3700 tokens and 5475 nodes. Table 2 compares our revisited dataset to the unreviewed dataset of Cui and Hershcovich (2020).
We see a major decline in Non-specific and Type-identifiable implicit arguments, because of the clearer definition of prominent elements and of cases of agent nouns. Deictic, Generic and Genre-based increase in number thanks to incorporating implicit arguments in State Scenes rather than only Process Scenes. The number of Iterated-set arguments remains small due to the rarity of aspectual morphology and habitual/iterative constructions in English and in the corpus. However, it is still necessary to keep the category and separate out its instances rather than lumping them into another category or ignoring them altogether. Since implicit arguments of this kind could be more common in morphologically rich languages, we want to keep a clean mapping of habitual/iterative constructions so as to facilitate the study of implicit roles' diverse behaviours in languages other than English.

Two Transition Systems for Parsing Implicit Arguments
We build the first neural parser that supports parsing implicit arguments dynamically in meaning representations, with two different transition systems. We design a transition-based parser, modelled upon Nivre (2003): a stack S = (. . ., s1, s0) holds processed words, and a buffer B = (b0, b1, . . .) contains tokens or nodes to be processed.

V is a set of nodes and E is a set of labelled edges. We denote s0 as the first element on S and b0 as the first element on B. Given a sentence composed of a sequence of tokens t1, t2, ..., tn, the parser is initialised with a Root node on S and all surface tokens in B. At each step, the parser deterministically chooses the most probable transition based on its current parsing state. Oracle action sequences are generated for training on gold-standard annotations.
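The parsing state just described can be sketched as a small data structure (hypothetical class and attribute names; the actual parser scores transitions with stack LSTMs over this state):

```python
class ParserState:
    """Transition-parser state: stack S, buffer B, node set V, edge set E."""
    def __init__(self, tokens):
        self.stack = ["ROOT"]          # S, with its top (s0) to the right
        self.buffer = list(tokens)     # B, with its head (b0) to the left
        self.nodes = {"ROOT", *tokens} # V
        self.edges = set()             # E: (parent, child, label) triples

    def shift(self):
        # SHIFT: move b0 from the buffer onto the stack.
        self.stack.append(self.buffer.pop(0))

    def reduce(self):
        # REDUCE: pop s0 off the stack.
        self.stack.pop()

state = ParserState(["Great", "service"])
state.shift()
print(state.stack, state.buffer)  # ['ROOT', 'Great'] ['service']
```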
We propose two transition systems, IMPLICIT-EAGER and IMPLICIT-STANDARD, to deal with implicit arguments, over the architecture of HIT-SCIR 2019 (Che et al., 2019), which ranked first in UCCA parsing in the MRP 2019 shared task (Oepen et al., 2019). The transition system incorporates all nine transitions, namely, LEFT-EDGE, RIGHT-EDGE, SHIFT, REDUCE, NODE, SWAP, LEFT-REMOTE, RIGHT-REMOTE and FINISH.
SHIFT and REDUCE are standard transitions: SHIFT moves b0 to S, while REDUCE pops s0 from S (when it should not be attached to any element in B).

Table 3: The transition sets of the two implicit transition systems. Actions marked in red are for IMPLICIT-EAGER, in blue for IMPLICIT-STANDARD. We write the stack with its top to the right and the buffer with its head to the left. (•, •)_X denotes an X-labelled edge, (•, •)*_X a remote X-labelled edge, and (•, •)#_X an X-labelled edge to an implicit node. i(x) is a running index for the created nodes. The prospective child of an EDGE action cannot have a primary parent. The node newly generated by the IMPLICIT_X action is prohibited from having any descendant. NODE generates a concept node on the buffer, but does not produce an arc. This table is adapted from Hershcovich et al. (2017).
Following transition-based constituent parsing, NODE_X creates a new non-terminal node (Sagae and Lavie, 2005). Such a node is created on the buffer, as a parent of s0 with an X-labelled edge.
LEFT-EDGE_X and RIGHT-EDGE_X add an X-labelled primary edge between the first two elements on S: when the first element is the parent of the second, LEFT-EDGE_X is executed; conversely, RIGHT-EDGE_X is chosen when the second element has the first as its child. The left/right direction corresponds to the direction in which the arc points. LEFT-REMOTE_X and RIGHT-REMOTE_X are similar to LEFT-EDGE_X and RIGHT-EDGE_X, but create remote edges, introducing reentrancies; the X-labelled edge is assigned a Remote attribute.
SWAP deals with non-planar graphs (a generalisation of non-projective trees), in other words, discontinuous constituents: it pops the second node on S and adds it to the head of B. FINISH is the terminal transition, which pops the Root node and marks the parsing state as terminal.
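The edge, node and swap transitions above can be sketched as plain functions over a stack list, a buffer list and an edge set (illustrative only, not the parser's actual implementation; node names and the edge-tuple format are assumptions):

```python
import itertools

_ids = itertools.count(1)  # running index i(x) for created nodes

def left_edge(stack, edges, label):
    # LEFT-EDGE_X: the first element s0 (stack top) is the parent of s1;
    # the arc points left.
    s1, s0 = stack[-2], stack[-1]
    edges.add((s0, s1, label))

def right_edge(stack, edges, label):
    # RIGHT-EDGE_X: s1 is the parent of s0; the arc points right.
    s1, s0 = stack[-2], stack[-1]
    edges.add((s1, s0, label))

def node(stack, buffer, edges, label):
    # NODE_X (baseline form): create a non-terminal on B as the parent of s0.
    new = f"n{next(_ids)}"
    buffer.insert(0, new)
    edges.add((new, stack[-1], label))
    return new

def swap(stack, buffer):
    # SWAP: pop the second node on S and return it to the head of B.
    buffer.insert(0, stack.pop(-2))
```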

In IMPLICIT-EAGER, we introduce a new transition, IMPLICIT_X, adding an implicit node to the buffer and attaching it with a labelled edge in one step. In IMPLICIT-STANDARD, we simplify the existing NODE transition to only create a node without attaching it, with the purpose of treating implicit units like primary ones and generating them dynamically. We elaborate on their designs in Sections 5.1 and 5.2. Table 3 shows the transition sets.

IMPLICIT-EAGER
Besides the nine transitions described above, IMPLICIT-EAGER introduces the IMPLICIT_X transition, which creates a new unit on B as the child of s0, with an X-labelled edge. The IMPLICIT_X action differs from the NODE action of IMPLICIT-STANDARD in that the integrally generated edge makes the new node a child of s0 rather than its parent, as in NODE_X. Equally importantly, the new node is prohibited from having any child, in contrast to the primary nodes that the NODE_X action generates.
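A sketch of the IMPLICIT_X transition under the same simplifying assumptions (plain lists and an edge set; node naming is hypothetical):

```python
import itertools

_imp_ids = itertools.count(1)

def implicit_x(stack, buffer, edges, label):
    # IMPLICIT_X (IMPLICIT-EAGER): create a new node on B as the CHILD of s0
    # and attach it in the same step with an X-labelled implicit edge.
    imp = f"imp{next(_imp_ids)}"
    buffer.insert(0, imp)
    edges.add((stack[-1], imp, label))  # s0 -> implicit node
    # Unlike NODE_X's output, this node may never receive children.
    return imp
```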

IMPLICIT-STANDARD
IMPLICIT-STANDARD adopts a more modular approach. Rather than complicating the transition system, it treats primary non-terminal nodes and implicit nodes equally by simplifying the NODE_X action, making it generate a new unit on the buffer without attaching it with any (labelled) edge. We assume primary non-terminal nodes and implicit nodes are identical in essence, thus handling them without discrimination. Whenever an ungenerated child or parent of s0 is found, NODE is executed so that a concept node is created on B. This action does not cope with edge generation; that work is left to LEFT-EDGE_X or RIGHT-EDGE_X. In the oracle, we can tell whether the node is primary or implicit by observing its relations: if the newly created node is the child of s0 and does not have any descendants, it is an implicit node; otherwise, it is a primary node.
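The oracle's decision rule can be sketched as follows, assuming hypothetical mappings from each node to its parents and children in the gold graph:

```python
def oracle_node_kind(created, s0, gold_parents, gold_children):
    """Classify a node created by the simplified NODE action (sketch).

    gold_parents / gold_children: dicts mapping a node to the sets of its
    parents / children in the gold graph (assumed data structures).
    """
    # If the newly created node is a child of s0 and has no descendants,
    # it is an implicit node; otherwise, it is a primary non-terminal.
    if s0 in gold_parents.get(created, set()) and not gold_children.get(created):
        return "implicit"
    return "primary"
```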

Experiments

Data Preprocessing
We convert UCCA XML data to MRP format using the open-source mtool software. As the UCCA data provided in the MRP 2019 shared task did not contain implicit information, HIT-SCIR 2019 is not designed to read this information in our dataset. We modify the parser to read node properties, and to convert UCCA data from and to MRP format. The updated version of mtool is available on GitHub.

Experimental Setup
Our parsers use stack LSTMs to stabilise the gradient descent process and speed up training; we enrich contextual information by employing the pre-trained language model BERT as a feature input (Graves, 2013; Devlin et al., 2018). The model is implemented in AllenNLP (Gardner et al., 2018). We use the HIT-SCIR 2019 parser as the baseline for comparison. We keep the same hyperparameters as Che et al. (2019) except batch size, adjusted from 8 to 4 due to resource constraints. We do not tune hyperparameters on either the original or the revisited dataset. We use the train, validation and evaluation split from Hershcovich et al. (2019b), originally from UD EWT, with ratios of 0.75, 0.125 and 0.125. The evaluation set has been validated as the gold standard. Table 4 shows detailed statistics of the train, dev and eval sets of both the original and revisited datasets, on which we trained the baseline parser, IMPLICIT-EAGER and IMPLICIT-STANDARD.

Results
Table 5 presents experimental results of the three parsers, the baseline HIT-SCIR 2019 parser, IMPLICIT-EAGER and IMPLICIT-STANDARD, on Revisited Implicit EWT. As expected, the baseline is unable to predict implicit arguments at all, whereas both IMPLICIT-EAGER and IMPLICIT-STANDARD manage to predict them.
Based on the evaluation method described in §3.1, IMPLICIT-EAGER's labelled precision and recall on Revisited Implicit EWT are 0.333 and 0.14, and its unlabelled precision and recall are 0.428 and 0.18. Noticeably, in the primary and remote edge evaluation, IMPLICIT-EAGER also outperforms the baseline on primary edges by 0.007 F-score on the revisited dataset.
Even though IMPLICIT-STANDARD has worse results in terms of primary parsing, it performs better on all implicit evaluation measures. Its unlabelled implicit precision, recall and F-score are 0.5, 0.22 and 0.306, exceeding IMPLICIT-EAGER by 0.072, 0.04 and 0.052, respectively.
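The reported F-scores follow from the harmonic mean of the listed precision and recall, e.g. for IMPLICIT-STANDARD's unlabelled implicit scores:

```python
def f_score(precision, recall):
    # Harmonic mean of precision and recall.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# IMPLICIT-STANDARD, unlabelled implicit evaluation:
print(round(f_score(0.5, 0.22), 3))  # 0.306
```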
Table 6 presents the three parsers' performances on Original EWT.The baseline produced better results on primary edges and remote edges on Original EWT.
That IMPLICIT-STANDARD outperforms IMPLICIT-EAGER on implicit evaluation but loses accuracy on primary evaluation might be attributed to its equal treatment of primary nodes and implicit nodes.

Discussion
As indicated in Table 5, IMPLICIT-EAGER and IMPLICIT-STANDARD successfully predicted seven and nine implicit arguments, respectively, with the correct fine-grained implicit labels. In the unlabelled evaluation, nine and eleven implicit arguments were predicted, respectively.
Table 7 shows the confusion matrices of IMPLICIT-EAGER and IMPLICIT-STANDARD on the evaluation set of Revisited Implicit EWT. Both parsers predicted roughly the same number of implicit arguments, 22 and 21, respectively.

Table 7: Confusion matrices on the Revisited Implicit EWT evaluation set, with predicted labels in columns and gold labels in rows. Noticeably, the parsers are in principle able to predict implicit elements of other categories, such as Process (P). Unless specified otherwise, the fine-grained implicit categories are Participants.
Both implicit parsers correctly predicted four Deictic and three Generic & Genre-based implicit arguments. In addition, IMPLICIT-STANDARD managed to predict two more correct Genre-based arguments, while IMPLICIT-EAGER never successfully predicted a stand-alone Genre-based implicit argument. IMPLICIT-STANDARD has a markedly higher labelled precision on Deictic (66.7% vs. IMPLICIT-EAGER's 33.3%), while the latter has a higher precision on Generic & Genre-based (75% vs. 50%). Neither of the implicit parsers emitted predictions for Type-identifiable or Iterated-set; this is expected, as both categories have fewer than five instances in the training set.
Although this paper focuses on fine-grained implicit Participants, some implicit arguments of other categories are already annotated in the foundational layer, especially Process and Center. Interestingly, in the sentence "Fresh and excellent quality," IMPLICIT-STANDARD generated two implicit arguments, Genre-based and Process, in the Scene "Fresh". This means it infers that the Scene is missing not only a Genre-based Participant but also a Process, i.e., the main relation that evolves in time.

Related Work
Parsing implicit arguments was introduced into NLP by Ruppenhofer et al. (2009) and Gerber and Chai (2010, 2012), but has remained coarse-grained, annotated within NomBank (Meyers et al., 2004) and limited to ten nominal predicates. Bender et al. (2011) identified ten relevant linguistic phenomena, ran several parsers and associated their output with target dependencies.
Roth and Frank (2015) used a rule-based method for identifying implicit arguments, which depends on semantic role labelling and coreference resolution. Similarly, Silberer and Frank (2012), Chiarcos and Schenk (2015) and Schenk and Chiarcos (2016) proposed parsing methods for the SemEval 2010 data, but they are likewise only able to parse implicit arguments at a coarse level. Cheng and Erk (2018) built a narrative cloze model and evaluated it on Gerber and Chai (2010)'s dataset.

Conclusion
Implicit arguments are pervasive in text but have not been well studied from a general perspective in NLP. In this work, we revisited a recently proposed fine-grained implicit argument typology by addressing its current deficiencies. We annotated a corpus based on the revised guidelines and designed an evaluation metric for measuring implicit argument parsing performance, demonstrating the annotation's reliability with superior inter-annotator agreement compared to other fine-grained implicit argument studies. The dataset will be made available to facilitate relevant research.
We introduced the first semantic parser, with two different transition systems, that can handle and predict implicit nodes dynamically, and label them with promising accuracy as part of the meaning representations.We evaluated it on the new dataset and found that some types of implicit arguments are harder to parse than others and that a simpler transition system performs better on parsing implicit arguments at the cost of primary parsing.The fine-grained implicit argument task is challenging and calls for further research.
In future work, we plan to create a large resource of implicit arguments by automatically extracting them from various linguistic constructions in unlabelled text, use it for pre-training language models, and evaluate them on our and other datasets to gain more insights into this linguistic phenomenon. A post-processing baseline that finds implicit arguments after parsing the whole graph would also be interesting for future investigation.

Appendices

A Inter-annotator Confusion Matrix

Table 8 shows the confusion matrix for measuring inter-annotator agreement on the evaluation set. The unlabelled F-score is 91.3% and the labelled F-score is 73.8%. The Cohen's κ between the two annotators is 69.3%.

C Training Details
Table 10 lists training times, best epochs and total epochs for all parsers on both datasets.

Figure 1: Example of a UCCA graph with fine-grained implicit arguments.

Figure 2: In the predicted graph, a mismatched and a mislabelled implicit node are shown in red; in the gold graph, an implicit node marked in blue is not matched in the predicted graph.

Table 1: UCCA Foundational Layer categories (above) and Implicit Participant Refinement Layer categories (below).
Participant A, Linker L, Center C, Connector N, Adverbial D, Process P, Elaborator E, Quantifier Q, Function F, Relator R, Ground G, State S, Parallel Scene H, Time T
Deictic, Generic, Genre-based, Type-identifiable, Non-specific, Iterated-set

Table 2: Statistics of the Revisited Implicit Corpus compared to the pilot annotation.

Table 4: Statistics of the train, dev and evaluation sets in Original EWT and Revisited Implicit EWT. For each set, the number of sentences, tokens, nodes, instances of the six implicit categories and their sum are listed.

Table 5: Experimental results on Revisited Implicit EWT, in percent. For primary edges, remote edges and implicit prediction, labelled precision (LP), labelled recall (LR) and labelled F-score (LF) are listed. For implicit evaluation, unlabelled precision (UP), unlabelled recall (UR) and unlabelled F-score (UF) are also listed.

Table 6: Experimental results on Original EWT, in percent. As there are no implicit arguments in this dataset, only performance on primary and remote edges is listed.

Table 8: Confusion matrix of the evaluation set for measuring inter-annotator agreement.

B Hyperparameter Settings

Our parsers use stack LSTMs to stabilise the gradient descent process and speed up training; we enrich contextual information by employing the pre-trained language model BERT as a feature input. We keep the same hyperparameter settings for all three parsers: the baseline HIT-SCIR 2019, IMPLICIT-EAGER and IMPLICIT-STANDARD. The settings are shown in Table 9.

Table 10: Training details of the baseline, IMPLICIT-EAGER and IMPLICIT-STANDARD on Original UCCA EWT and Revisited Implicit EWT, including training times, the best epoch and the total number of epochs.

As Table 10 shows, training took 2 days 22 hours for the baseline on Original UCCA EWT (50 epochs; best epoch 3rd) and 3 hours on Revisited Implicit EWT (30 epochs; best epoch 22nd); 1 day 8 hours and 1 day 19 hours for IMPLICIT-EAGER and IMPLICIT-STANDARD on Original UCCA EWT (10 and 13 epochs; best epochs 3rd and 2nd, respectively); and finally, 6 hours and 8 hours for IMPLICIT-EAGER and IMPLICIT-STANDARD on Revisited Implicit EWT (50 epochs; best epochs 21st and 37th, respectively). All parsers achieved their best performance at an early stage on Original UCCA EWT; however, both implicit parsers took a longer time per epoch to train on Original UCCA EWT than the baseline.