Daivik Swarup


pdf bib
An Instance Level Approach for Shallow Semantic Parsing in Scientific Procedural Text
Daivik Swarup | Ahsaas Bajaj | Sheshera Mysore | Tim O’Gorman | Rajarshi Das | Andrew McCallum
Findings of the Association for Computational Linguistics: EMNLP 2020

In specific domains, such as procedural scientific text, human labeled data for shallow semantic parsing is especially limited and expensive to create. Fortunately, such specific domains often use rather formulaic writing, such that the different ways of expressing relations in a small number of grammatically similar labeled sentences may provide high coverage of semantic structures in the corpus, through an appropriately rich similarity metric. In light of this opportunity, this paper explores an instance-based approach to the relation prediction sub-task within shallow semantic parsing, in which semantic labels from structurally similar sentences in the training set are copied to test sentences. Candidate similar sentences are retrieved using SciBERT embeddings. For labels where it is possible to copy from a similar sentence we employ an instance level copy network, when this is not possible, a globally shared parametric model is employed. Experiments show our approach outperforms both baseline and prior methods by 0.75 to 3 F1 absolute in the Wet Lab Protocol Corpus and 1 F1 absolute in the Materials Science Procedural Text Corpus.