Mark A. Greenwood

Also published as: Mark Greenwood


2016

GATE-Time: Extraction of Temporal Expressions and Events
Leon Derczynski | Jannik Strötgen | Diana Maynard | Mark A. Greenwood | Manuel Jung
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

GATE is a widely used open-source solution for text processing with a large user community. It contains components for several natural language processing tasks. However, temporal information extraction functionality within GATE has been rather limited so far, despite being a prerequisite for many application scenarios in the areas of natural language processing and information retrieval. This paper presents an integrated approach to temporal information processing. We take state-of-the-art tools in temporal expression and event recognition and bring them together to form an openly available resource within the GATE infrastructure. GATE-Time provides annotation in the form of TimeML events and temporal expressions complying with this mature ISO standard for temporal semantic annotation of documents. Major advantages of GATE-Time are (i) that it relies on HeidelTime for temporal tagging, so that temporal expressions can be extracted and normalized in multiple languages and across different domains, (ii) that it includes a modern, fast event recognition and classification tool, and (iii) that it can be combined with different linguistic preprocessing annotations, and is thus not bound to license-restricted preprocessing components.

2014

Who cares about Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis.
Diana Maynard | Mark Greenwood
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Sarcasm is a common phenomenon in social media, and is inherently difficult to analyse, not just automatically but often for humans too. It has an important effect on sentiment, but is usually ignored in social media analysis, because it is considered too tricky to handle. While there exist a few systems which can detect sarcasm, almost no work has been carried out on studying the effect that sarcasm has on sentiment in tweets, and on incorporating this into automatic tools for sentiment analysis. We perform an analysis of the effect of sarcasm scope on the polarity of tweets, and have compiled a number of rules which enable us to improve the accuracy of sentiment analysis when sarcasm is known to be present. We consider in particular the effect of sentiment and sarcasm contained in hashtags, and have developed a hashtag tokeniser for GATE, so that sentiment and sarcasm found within hashtags can be detected more easily. According to our experiments, the hashtag tokenisation achieves 98% precision, while sarcasm detection achieves 91% precision and polarity detection 80%.

2013

TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
Kalina Bontcheva | Leon Derczynski | Adam Funk | Mark Greenwood | Diana Maynard | Niraj Aswani
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

2012

Large Scale Semantic Annotation, Indexing and Search at The National Archives
Diana Maynard | Mark A. Greenwood
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes a tool developed to improve access to the enormous volume of data housed at the UK's National Archives, both for the general public and for specialist researchers. The system we have developed, TNA-Search, enables a multi-paradigm search over the entire electronic archive (42TB of data in various formats). The search functionality allows queries that arbitrarily mix any combination of full-text, structural, linguistic and semantic queries. The archive is annotated and indexed with respect to a massive semantic knowledge base containing data from the LOD cloud, data.gov.uk, related TNA projects, and a large geographical database. The semantic annotation component achieves approximately 83% F-measure, which is very reasonable considering the wide range of entities and document types and the open domain. The technologies are being adopted by real users at The National Archives and will form the core of their suite of search tools, with additional in-house interfaces.

2009

Too Many Mammals: Improving the Diversity of Automatically Recognized Terms
Ziqi Zhang | Lei Xia | Mark A. Greenwood | José Iria
Proceedings of the International Conference RANLP-2009

2008

Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering
Mark A. Greenwood
Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering

A Data Driven Approach to Query Expansion in Question Answering
Leon Derczynski | Jun Wang | Robert Gaizauskas | Mark A. Greenwood
Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering

Evaluation of Automatically Reformulated Questions in Question Series
Richard Shaw | Ben Solway | Robert Gaizauskas | Mark A. Greenwood
Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering

Saxon: an Extensible Multimedia Annotator
Mark Greenwood | José Iria | Fabio Ciravegna
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper introduces Saxon, a rule-based document annotator that is capable of processing and annotating several document formats and media, both within and across documents. Furthermore, Saxon is readily extensible to support other input formats due to both its flexible rule formalism and the modular plugin architecture of the Runes framework upon which it is built. In this paper we introduce the Saxon rule formalism through examples aimed at highlighting its power and flexibility.

2007

A Task-based Comparison of Information Extraction Pattern Models
Mark Greenwood | Mark Stevenson
ACL 2007 Workshop on Deep Linguistic Processing

2006

Proceedings of the Workshop on Information Extraction Beyond The Document
Mary Elaine Califf | Mark A. Greenwood | Mark Stevenson | Roman Yangarber
Proceedings of the Workshop on Information Extraction Beyond The Document

Comparing Information Extraction Pattern Models
Mark Stevenson | Mark A. Greenwood
Proceedings of the Workshop on Information Extraction Beyond The Document

Improving Semi-supervised Acquisition of Relation Extraction Patterns
Mark A. Greenwood | Mark Stevenson
Proceedings of the Workshop on Information Extraction Beyond The Document

2005

SUPPLE: A Practical Parser for Natural Language Engineering Applications
Robert Gaizauskas | Mark Hepple | Horacio Saggion | Mark A. Greenwood | Kevin Humphreys
Proceedings of the Ninth International Workshop on Parsing Technology

A Semantic Approach to IE Pattern Induction
Mark Stevenson | Mark Greenwood
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)