John Judge


2022

pdf bib
Learning to Prioritize: Precision-Driven Sentence Filtering for Long Text Summarization
Alex Mei | Anisha Kabir | Rukmini Bapat | John Judge | Tony Sun | William Yang Wang
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Neural text summarization has shown great potential in recent years. However, current state-of-the-art summarization models are limited by their maximum input length, posing a challenge to summarizing longer texts comprehensively. As part of a layered summarization architecture, we introduce PureText, a simple yet effective pre-processing layer that removes low- quality sentences in articles to improve existing summarization models. When evaluated on popular datasets like WikiHow and Reddit TIFU, we show up to 3.84 and 8.57 point ROUGE-1 absolute improvement on the full test set and the long article subset, respectively, for state-of-the-art summarization models such as BertSum and BART. Our approach provides downstream models with higher-quality sentences for summarization, improving overall model performance, especially on long text articles.

2015

pdf bib
Tapadóir
Eimear Maguire | John Judge | Teresa Lynn
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
Tapadóir
Eimear Maguire | John Judge | Teresa Lynn
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

2014

pdf bib
The Strategic Impact of META-NET on the Regional, National and International Level
Georg Rehm | Hans Uszkoreit | Sophia Ananiadou | Núria Bel | Audronė Bielevičienė | Lars Borin | António Branco | Gerhard Budin | Nicoletta Calzolari | Walter Daelemans | Radovan Garabík | Marko Grobelnik | Carmen García-Mateo | Josef van Genabith | Jan Hajič | Inma Hernáez | John Judge | Svetla Koeva | Simon Krek | Cvetana Krstev | Krister Lindén | Bernardo Magnini | Joseph Mariani | John McNaught | Maite Melero | Monica Monachini | Asunción Moreno | Jan Odijk | Maciej Ogrodniczuk | Piotr Pęzik | Stelios Piperidis | Adam Przepiórkowski | Eiríkur Rögnvaldsson | Michael Rosner | Bolette Pedersen | Inguna Skadiņa | Koenraad De Smedt | Marko Tadić | Paul Thompson | Dan Tufiş | Tamás Váradi | Andrejs Vasiļjevs | Kadri Vider | Jolanta Zabarskaite
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics. This paper documents the initiative’s work throughout Europe in order to boost progress and innovation in our field.

pdf bib
Proceedings of the First Celtic Language Technology Workshop
John Judge | Teresa Lynn | Monica Ward | Brian Ó Raghallaigh
Proceedings of the First Celtic Language Technology Workshop

pdf bib
Active Learning for Post-Editing Based Incrementally Retrained MT
Aswarth Abhilash Dara | Josef van Genabith | Qun Liu | John Judge | Antonio Toral
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

2008

pdf bib
Linguistically Light Lexical Extensions for Ontologies
Brian Davis | Siegfried Handschuh | Alexander Troussov | John Judge | Mikhail Sogrin
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

The identification of class instances within unstructured text for either the purposes of Ontology population or semantic annotation are usually limited to term mentions of Proper Noun and Personal Noun or fixed Key Phrases within Text Analytics or Ontology based Information Extraction(OBIE) applications. These systems do not generalize to cope with compound nominal classes of multi word expressions. Computational Linguistics’ approaches involving deep analysis tend to suffer from idiomaticity and overgeneration problems while the shallower “words with spaces” approach frequently employed in Information Extraction(IE) and Industrial Text Analytics systems lacks flexibility and is prone to lexical proliferation. We outline a representation for encoding light linguistic features of Compound Nominal term mentions of Concepts within an Ontology as well as a lightweight semantic annotator which complies the above linguistic information into efficient Dictionary formats to drive large scale identification and semantic annotation of the aforementioned concepts.

2006

pdf bib
QuestionBank: Creating a Corpus of Parse-Annotated Questions
John Judge | Aoife Cahill | Josef van Genabith
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics