Claire Grover
2020
Geoparsing the historical Gazetteers of Scotland: accurately computing location in mass digitised texts
Rosa Filgueira | Claire Grover | Melissa Terras | Beatrice Alex
Proceedings of the 8th Workshop on Challenges in the Management of Large Corpora
This paper describes work in progress on devising automatic and parallel methods for geoparsing large digital historical textual data by combining the strengths of three natural language processing (NLP) tools, the Edinburgh Geoparser, spaCy and defoe, and employing different tokenisation and named entity recognition (NER) techniques. We apply these tools to a large collection of nineteenth century Scottish geographical dictionaries, and describe preliminary results obtained when processing this data.
Not a cute stroke: Analysis of Rule- and Neural Network-based Information Extraction Systems for Brain Radiology Reports
Andreas Grivas | Beatrice Alex | Claire Grover | Richard Tobin | William Whiteley
Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis
We present an in-depth comparison of three clinical information extraction (IE) systems designed to perform entity recognition and negation detection on brain imaging reports: EdIE-R, a bespoke rule-based system, and two neural network models, EdIE-BiLSTM and EdIE-BERT, both multi-task learning models with a BiLSTM and BERT encoder respectively. We compare our models both on an in-sample and an out-of-sample dataset containing mentions of stroke findings and draw on our error analysis to suggest improvements for effective annotation when building clinical NLP models for a new domain. Our analysis finds that our rule-based system outperforms the neural models on both datasets and seems to generalise to the out-of-sample dataset. On the other hand, the neural models do not generalise negation to the out-of-sample dataset, despite metrics on the in-sample dataset suggesting otherwise.
2018
Up-cycling Data for Natural Language Generation
Amy Isard | Jon Oberlander | Claire Grover
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
2016
Homing in on Twitter Users: Evaluating an Enhanced Geoparser for User Profile Locations
Beatrice Alex | Clare Llewellyn | Claire Grover | Jon Oberlander | Richard Tobin
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Twitter-related studies often need to geo-locate Tweets or Twitter users, identifying their real-world geographic locations. As tweet-level geotagging remains rare, most prior work exploited tweet content, timezone and network information to inform geolocation, or else relied on off-the-shelf tools to geolocate users from location information in their user profiles. However, such user location metadata is not consistently structured, causing such tools to fail regularly, especially if a string contains multiple locations, or if locations are very fine-grained. We argue that user profile location (UPL) and tweet location need to be treated as distinct types of information from which differing inferences can be drawn. Here, we apply geoparsing to UPLs, and demonstrate how task performance can be improved by adapting our Edinburgh Geoparser, which was originally developed for processing English text. We present a detailed evaluation method and results, including inter-coder agreement. We demonstrate that the optimised geoparser can effectively extract and geo-reference multiple locations at different levels of granularity with an F1-score of around 0.90. We also illustrate how geoparsed UPLs can be exploited for international information trade studies and country-level sentiment analysis.
Improving Topic Model Clustering of Newspaper Comments for Summarisation
Clare Llewellyn | Claire Grover | Jon Oberlander
Proceedings of the ACL 2016 Student Research Workshop
2014
Re-using an Argument Corpus to Aid in the Curation of Social Media Collections
Clare Llewellyn | Claire Grover | Jon Oberlander | Ewan Klein
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This work investigates how automated methods can be used to classify social media text into argumentation types. In particular, it is shown how supervised machine learning was used to annotate a Twitter dataset (London Riots) with argumentation classes. An investigation of issues arising from a natural inconsistency within social media data found that machine learning algorithms tend to overfit to the data because Twitter contains a lot of repetition in the form of retweets. It is also noted that when learning argumentation classes we must be aware that the classes will most likely be of very different sizes and this must be kept in mind when analysing the results. Encouraging results were found in adapting a model from one domain of Twitter data (London Riots) to another (OR2012). When adapting a model to another dataset the most useful feature was punctuation. It is probable that the nature of punctuation in Twitter language, the very specific use in links, indicates argumentation class.
A Gazetteer and Georeferencing for Historical English Documents
Claire Grover | Richard Tobin
Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)
A Web-based Geo-resolution Annotation and Evaluation Tool
Beatrice Alex | Kate Byrne | Claire Grover | Richard Tobin
Proceedings of LAW VIII - The 8th Linguistic Annotation Workshop
2010
Edinburgh-LTG: TempEval-2 System Description
Claire Grover | Richard Tobin | Beatrice Alex | Kate Byrne
Proceedings of the 5th International Workshop on Semantic Evaluation
Labelling and Spatio-Temporal Grounding of News Events
Bea Alex | Claire Grover
Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics in a World of Social Media
Agile Corpus Annotation in Practice: An Overview of Manual and Automatic Annotation of CVs
Bea Alex | Claire Grover | Rongzhou Shen | Mijail Kabadjov
Proceedings of the Fourth Linguistic Annotation Workshop
Space characters in Chinese semi-structured texts
Rongzhou Shen | Claire Grover | Ewan Klein
CIPS-SIGHAN Joint Conference on Chinese Language Processing
2008
Learning the Species of Biomedical Named Entities from Annotated Corpora
Xinglong Wang | Claire Grover
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In biomedical articles, terms with the same surface forms are often used to refer to different entities across a number of model organisms, in which case determining the species becomes crucial to term identification systems that ground terms to specific database identifiers. This paper describes a rule-based system that extracts species-indicating words, such as human or murine, which can be used to decide the species of the nearby entity terms, and a machine-learning species disambiguation system that was developed on manually species-annotated corpora. The performance of both systems was evaluated on gold-standard datasets, where the machine-learning system yielded better overall results.
Named Entity Recognition for Digitised Historical Texts
Claire Grover | Sharon Givon | Richard Tobin | Julian Ball
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
We describe and evaluate a prototype system for recognising person and place names in digitised records of British parliamentary proceedings from the late 17th and early 19th centuries. The output of an OCR engine is the input for our system and we describe certain issues and errors in this data and discuss the methods we have used to overcome the problems. We describe our rule-based named entity recognition system for person and place names which is implemented using the LT-XML2 and LT-TTT2 text processing tools. We discuss the annotation of a development and testing corpus and provide results of an evaluation of our system on the test corpus.
2007
Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007).
Caroline Sporleder | Antal van den Bosch | Claire Grover
Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007).
Recognising Nested Named Entities in Biomedical Text
Beatrice Alex | Barry Haddow | Claire Grover
Biological, translational, and clinical language processing
2006
The Impact of Annotation on the Performance of Protein Tagging in Biomedical Text
Beatrice Alex | Malvina Nissim | Claire Grover
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
In this paper we discuss five different corpora annotated for protein names. We present several within- and cross-dataset protein tagging experiments showing that different annotation schemes severely affect the portability of statistical protein taggers. By means of a detailed error analysis we identify crucial annotation issues that future annotation projects should take into careful consideration.
Rule-Based Chunking and Reusability
Claire Grover | Richard Tobin
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
In this paper we discuss a rule-based approach to chunking implemented using the LT-XML2 and LT-TTT2 tools. We describe the tools and the pipeline and grammars that have been developed for the task of chunking. We show that our rule-based approach is easy to adapt to different chunking styles and that the mark-up of further linguistic information such as nominal and verbal heads can be added to the rules at little extra cost. We evaluate our chunker against the CoNLL 2000 data and discuss discrepancies between our output and the CoNLL mark-up as well as discrepancies within the CoNLL data itself. We contrast our results with the higher scores obtained using machine learning and argue that the portability and flexibility of our approach still make it a more practical solution.
Tools to Address the Interdependence between Tokenisation and Standoff Annotation
Claire Grover | Michael Matthews | Richard Tobin
Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing
2004
A Rhetorical Status Classifier for Legal Text Summarisation
Ben Hachey | Claire Grover
Text Summarization Branches Out
The HOLJ Corpus. Supporting Summarisation of Legal Texts
Claire Grover | Ben Hachey | Ian Hughson
Proceedings of the 5th International Workshop on Linguistically Interpreted Corpora
2003
Demonstration of the CROSSMARC System
Vangelis Karkaletsis | Constantine D. Spyropoulos | Dimitris Souflis | Claire Grover | Ben Hachey | Maria Teresa Pazienza | Michele Vindigni | Emmanuel Cartier | Jose Coch
Companion Volume of the Proceedings of HLT-NAACL 2003 - Demonstrations
Summarising Legal Texts: Sentential Tense and Argumentative Roles
Claire Grover | Ben Hachey | Chris Korycinski
Proceedings of the HLT-NAACL 03 Text Summarization Workshop
Automatic Multi-Layer Corpus Annotation for Evaluation Question Answering Methods: CBC4Kids
Jochen L. Leidner | Tiphaine Dalmas | Bonnie Webber | Johan Bos | Claire Grover
Proceedings of 4th International Workshop on Linguistically Interpreted Corpora (LINC-03) at EACL 2003
2002
Multilingual XML-Based Named Entity Recognition for E-Retail Domains
Claire Grover | Scott McDonald | Donnla Nic Gearailt | Vangelis Karkaletsis | Dimitra Farmakiotou | Georgios Samaritakis | Georgios Petasis | Maria Teresa Pazienza | Michele Vindigni | Frantz Vichot | Francis Wolinski
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
XML-based NLP Tools for Analysing and Annotating Medical Language
Claire Grover | Ewan Klein | Mirella Lapata | Alex Lascarides
COLING-02: The 2nd Workshop on NLP and XML (NLPXML-2002)
2001
XML-Based Data Preparation for Robust Deep Parsing
Claire Grover | Alex Lascarides
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics
2000
LT TTT - A Flexible Tokenisation Tool
Claire Grover | Colin Matheson | Andrei Mikheev | Marc Moens
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
1999
Named Entity Recognition without Gazetteers
Andrei Mikheev | Marc Moens | Claire Grover
Ninth Conference of the European Chapter of the Association for Computational Linguistics
1998
Description of the LTG System Used for MUC-7
Andrei Mikheev | Claire Grover | Marc Moens
Seventh Message Understanding Conference (MUC-7): Proceedings of a Conference Held in Fairfax, Virginia, April 29 - May 1, 1998
1995
Algorithms for Analysing the Temporal Structure of Discourse
Janet Hitzeman | Marc Moens | Claire Grover
Seventh Conference of the European Chapter of the Association for Computational Linguistics
1994
Priority Union and Generalization in Discourse Grammars
Claire Grover | Chris Brew | Suresh Manandhar | Marc Moens
32nd Annual Meeting of the Association for Computational Linguistics
1989
The Syntactic Regularity of English Noun Phrases
Lita Taylor | Claire Grover | Ted Briscoe
Fourth Conference of the European Chapter of the Association for Computational Linguistics
1988
Software Support for Practical Grammar Development
Bran Boguraev | John Carroll | Ted Briscoe | Claire Grover
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics
1987
Co-authors
- Beatrice Alex 9
- Richard Tobin 8
- Marc Moens 5
- Ben Hachey 4
- Jon Oberlander 4
- Ted Briscoe 3
- Ewan Klein 3
- Clare Llewellyn 3
- Andrei Mikheev 3
- Branimir Boguraev 2
- Kate Byrne 2
- John A. Carroll 2
- Vangelis Karkaletsis 2
- Alex Lascarides 2
- Maria Teresa Pazienza 2
- Rongzhou Shen 2
- Michele Vindigni 2
- Julian Ball 1
- Johan Bos 1
- Chris Brew 1
- David Carter 1
- Emmanuel Cartier 1
- Jose Coch 1
- Tiphaine Dalmas 1
- Dimitra Farmakiotou 1
- Rosa Filgueira 1
- Donnla Nic Gearailt 1
- Sharon Givon 1
- Andreas Grivas 1
- Barry Haddow 1
- Janet Hitzeman 1
- Ian Hughson 1
- Amy Isard 1
- Mijail Kabadjov 1
- Chris Korycinski 1
- Mirella Lapata 1
- Jochen L. Leidner 1
- Suresh Manandhar 1
- Colin Matheson 1
- Michael Matthews 1
- Scott McDonald 1
- Malvina Nissim 1
- Georgios Petasis 1
- Georgios Samaritakis 1
- Dimitris Souflis 1
- Caroline Sporleder 1
- Constantine D. Spyropoulos 1
- Lita Taylor 1
- Melissa Terras 1
- Frantz Vichot 1
- Xinglong Wang 1
- Bonnie Webber 1
- William Whiteley 1
- Francis Wolinski 1
- Antal van den Bosch 1