Carl Vogel
2023
The Proof is in the Pudding: Using Automated Theorem Proving to Generate Cooking Recipes
Louis Mahon | Carl Vogel
Journal for Language Technology and Computational Linguistics, Vol. 36 No. 2
This paper presents FASTFOOD, a rule-based natural language generation (NLG) program for cooking recipes. We consider the representation of cooking recipes as discourse representation, because the meaning of each sentence needs to consider the context of the others. Our discourse representation system is based on states of affairs and transitions between states of affairs, and does not use discourse referents. Recipes are generated by using an automated theorem-proving procedure to select the ingredients and instructions, with ingredients corresponding to axioms and instructions to implications. FASTFOOD also contains a temporal optimization module which can rearrange the recipe to make it more time efficient for the user, e.g. the recipe specifies to chop the vegetables while the rice is boiling. The system is described in detail, including the decision to forgo discourse referents and how plausible representations of nouns and verbs emerge purely as a by-product of the practical requirements of efficiently representing recipe content. A comparison is then made with existing recipe generation systems, NLG systems more generally, and automated theorem provers.
2022
Mutual Gaze and Linguistic Repetition in a Multimodal Corpus
Anais Murat | Maria Koutsombogera | Carl Vogel
Proceedings of the Thirteenth Language Resources and Evaluation Conference
This paper investigates the correlation between mutual gaze and linguistic repetition, a form of alignment, which we take as evidence of mutual understanding. We focus on a multimodal corpus made of three-party conversations and explore the question of whether mutual gaze events correspond to moments of repetition or non-repetition. Our results, although mainly significant on word unigrams and bigrams, suggest positive correlations between the presence of mutual gaze and the repetitions of tokens, lemmas, or parts-of-speech, but negative correlations when it comes to paired levels of representation (tokens or lemmas associated with their part-of-speech). No compelling correlation is found with duration of mutual gaze. Results are strongest when ignoring punctuation as representations of pauses, intonation, etc. in counting aligned tokens.
Features and Categories of Hyperbole in Cyberbullying Discourse on Social Media
Simona Ignat | Carl Vogel
Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis
Cyberbullying discourse is achieved with multiple linguistic conveyances. Hyperboles witnessed in a corpus of cyberbullying utterances are studied. Linguistic features of hyperbole using the traditional grammatical indications of exaggerations are analyzed. The method relies on data selected from a larger corpus of utterances identified and labelled as “bullying”, from Twitter, from October 2020 to March 2022. An outcome is a lexicon of 250 entries. A small number of lexical level features have been isolated, and chi-squared contingency tests applied to evaluating their information value in identifying hyperbole. Words or affixes indicating superlatives or extremes of scales, with positive but not negative valency items, interact with hyperbole classification in this data set. All utterances extracted have been considered exaggerations and the stylistic status of “hyperbole” has been commented on within the frame of new meanings in the context of social media.
2021
English Machine Reading Comprehension Datasets: A Survey
Daria Dzendzik | Jennifer Foster | Carl Vogel
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
This paper surveys 60 English Machine Reading Comprehension datasets, with a view to providing a convenient resource for other researchers interested in this problem. We categorize the datasets according to their question and answer form and compare them across various dimensions including size, vocabulary, data source, method of creation, human performance level, and first question word. Our analysis reveals that Wikipedia is by far the most common data source and that there is a relative lack of why, when, and where questions across datasets.
2020
An Evaluation Method for Diachronic Word Sense Induction
Ashjan Alsulaimani | Erwan Moreau | Carl Vogel
Findings of the Association for Computational Linguistics: EMNLP 2020
The task of Diachronic Word Sense Induction (DWSI) aims to identify the meaning of words from their context, taking the temporal dimension into account. In this paper we propose an evaluation method based on large-scale time-stamped annotated biomedical data, and a range of evaluation measures suited to the task. The approach is applied to two recent DWSI systems, thus demonstrating its relevance and providing an in-depth analysis of the models.
Q. Can Knowledge Graphs be used to Answer Boolean Questions? A. It’s complicated!
Daria Dzendzik | Carl Vogel | Jennifer Foster
Proceedings of the First Workshop on Insights from Negative Results in NLP
In this paper we explore the problem of machine reading comprehension, focusing on the BoolQ dataset of Yes/No questions. We carry out an error analysis of a BERT-based machine reading comprehension model on this dataset, revealing issues such as unstable model behaviour and some noise within the dataset itself. We then experiment with two approaches for integrating information from knowledge graphs: (i) concatenating knowledge graph triples to text passages and (ii) encoding knowledge with a Graph Neural Network. Neither of these approaches shows a clear improvement and we hypothesize that this may be due to a combination of inaccuracies in the knowledge graph, imprecision in entity linking, and the models’ inability to capture additional information from knowledge graphs.
2019
Is It Dish Washer Safe? Automatically Answering “Yes/No” Questions Using Customer Reviews
Daria Dzendzik | Carl Vogel | Jennifer Foster
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
It has become commonplace for people to share their opinions about all kinds of products by posting reviews online. It has also become commonplace for potential customers to do research about the quality and limitations of these products by posting questions online. We test the extent to which reviews are useful in question-answering by combining two Amazon datasets and focusing our attention on yes/no questions. A manual analysis of 400 cases reveals that the reviews directly contain the answer to the question just over a third of the time. Preliminary reading comprehension experiments with this dataset prove inconclusive, with accuracy in the range 50-66%.
MSO with tests and reducts
Tim Fernando | David Woods | Carl Vogel
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing
Tests added to Kleene algebra (by Kozen and others) are considered within Monadic Second Order logic over strings, where they are likened to statives in natural language. Reducts are formed over tests and non-tests alike, specifying what is observable. Notions of temporal granularity are based on observable change, under the assumption that a finite set bounds what is observable (with the possibility of stretching such bounds by moving to a larger finite set). String projections at different granularities are conjoined by superpositions that provide another variant of concatenation for Booleans.
2018
Multilingual Word Segmentation: Training Many Language-Specific Tokenizers Smoothly Thanks to the Universal Dependencies Corpus
Erwan Moreau | Carl Vogel
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Chats and Chunks: Annotation and Analysis of Multiparty Long Casual Conversations
Emer Gilmartin | Carl Vogel | Nick Campbell
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Modeling Collaborative Multimodal Behavior in Group Dialogues: The MULTISIMO Corpus
Maria Koutsombogera | Carl Vogel
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Speech Rate Calculations with Short Utterances: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task
Akira Hayakawa | Carl Vogel | Saturnino Luz | Nick Campbell
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
A Diachronic Corpus for Literary Style Analysis
Carmen Klaussner | Carl Vogel
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Chat, Chunk and Topic in Casual Conversation
Emer Gilmartin | Carl Vogel
Proceedings of the 14th Joint ACL-ISO Workshop on Interoperable Semantic Annotation
CRF-Seq and CRF-DepTree at PARSEME Shared Task 2018: Detecting Verbal MWEs using Sequential and Dependency-Based Approaches
Erwan Moreau | Ashjan Alsulaimani | Alfredo Maldonado | Carl Vogel
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
This paper describes two systems for detecting Verbal Multiword Expressions (VMWEs) which both competed in the closed track at the PARSEME VMWE Shared Task 2018. CRF-DepTree-categs implements an approach based on the dependency tree, intended to exploit the syntactic and semantic relations between tokens; CRF-Seq-nocategs implements a robust sequential method which requires only lemmas and morphosyntactic tags. Both systems ranked in the top half of the ranking, the latter ranking second for token-based evaluation. The code for both systems is published under the GNU General Public License version 3.0 and is available at http://github.com/erwanm/adapt-vmwe18.
Just Talking - Modelling Casual Conversation
Emer Gilmartin | Christian Saam | Carl Vogel | Nick Campbell | Vincent Wade
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue
Casual conversation has become a focus for artificial dialogue applications. Such talk is ubiquitous and its structure differs from that found in the task-based interactions which have been the focus of dialogue system design for many years. It is unlikely that such conversations can be modelled as an extension of task-based talk. We review theories of casual conversation, report on our studies of the structure of casual dialogue, and outline challenges we see for the development of spoken dialog systems capable of carrying on casual friendly conversation in addition to performing well-defined tasks.
2017
ADAPT Centre Cone Team at IJCNLP-2017 Task 5: A Similarity-Based Logistic Regression Approach to Multi-choice Question Answering in an Examinations Shared Task
Daria Dzendzik | Alberto Poncelas | Carl Vogel | Qun Liu
Proceedings of the IJCNLP 2017, Shared Tasks
We describe the work of a team from the ADAPT Centre in Ireland in addressing automatic answer selection for the Multi-choice Question Answering in Examinations shared task. The system is based on a logistic regression over the string similarities between question, answer, and additional text. We obtain the highest grade out of six systems: 48.7% accuracy on a validation set (vs. a baseline of 29.45%) and 45.6% on a test set.
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with Syntactic Dependency Features and Semantic Re-Ranking
Alfredo Maldonado | Lifeng Han | Erwan Moreau | Ashjan Alsulaimani | Koel Dutta Chowdhury | Carl Vogel | Qun Liu
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)
A description of a system for identifying Verbal Multi-Word Expressions (VMWEs) in running text is presented. The system mainly exploits universal syntactic dependency features through a Conditional Random Fields (CRF) sequence model. The system competed in the Closed Track at the PARSEME VMWE Shared Task 2017, ranking 2nd place in most languages on full VMWE-based evaluation and 1st in three languages on token-based evaluation. In addition, this paper presents an option to re-rank the 10 best CRF-predicted sequences via semantic vectors, boosting its scores above other systems in the competition. We also show that all systems in the competition would struggle to beat a simple lookup baseline system and argue for a more purpose-specific evaluation scheme.
Towards efficient string processing of annotated events
David Woods | Tim Fernando | Carl Vogel
Proceedings of the 13th Joint ISO-ACL Workshop on Interoperable Semantic Annotation (ISA-13)
2015
A hedging annotation scheme focused on epistemic phrases for informal language
Liliana Mamani Sanchez | Carl Vogel
Proceedings of the Workshop on Models for Modality Annotation
Temporal Forces and Type Coercion in Strings
Derek Kelleher | Tim Fernando | Carl Vogel
Proceedings of the 12th International Conference on Finite-State Methods and Natural Language Processing 2015 (FSMNLP 2015 Düsseldorf)
2014
Limitations of MT Quality Estimation Supervised Systems: The Tails Prediction Problem
Erwan Moreau | Carl Vogel
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
2013
Finite State Temporality and Context-Free Languages
Derek Kelleher | Carl Vogel
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Short Papers
Laughter and Topic Transition in Multiparty Conversation
Emer Gilmartin | Francesca Bonin | Carl Vogel | Nick Campbell
Proceedings of the SIGDIAL 2013 Conference
IMHO: An Exploratory Study of Hedging in Web Forums
Liliana Mamani Sanchez | Carl Vogel
Proceedings of the SIGDIAL 2013 Conference
2012
Towards the Automatic Detection of the Source Language of a Literary Translation.
Gerard Lynch | Carl Vogel
Proceedings of COLING 2012: Posters
A Naive Bayes classifier for automatic correction of preposition and determiner errors in ESL text
Gerard Lynch | Erwan Moreau | Carl Vogel
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
Quality Estimation: an experimental study using unsupervised similarity measures
Erwan Moreau | Carl Vogel
Proceedings of the Seventh Workshop on Statistical Machine Translation
2010
Exploiting CCG Structures with Tree Kernels for Speculation Detection
Liliana Mamani Sánchez | Baoli Li | Carl Vogel
Proceedings of the Fourteenth Conference on Computational Natural Language Learning – Shared Task
2009
Exploring Multilingual Semantic Role Labeling
Baoli Li | Martin Emms | Saturnino Luz | Carl Vogel
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task
1996
Co-authors
- Erwan Moreau 7
- Nick Campbell 4
- Daria Dzendzik 4
- Emer Gilmartin 4
- Ashjan Alsulaimani 3
- Tim Fernando 3
- Jennifer Foster 3
- Liliana Mamani Sanchez 3
- Derek Kelleher 2
- Maria Koutsombogera 2
- Baoli Li 2
- Qun Liu 2
- Saturnino Luz 2
- Gerard Lynch 2
- Alfredo Maldonado 2
- David Woods 2
- Francesca Bonin 1
- Holly Branigan 1
- Koel Dutta Chowdhury 1
- Martin Emms 1
- Ulrike Hahn 1
- Lifeng Han 1
- Akira Hayakawa 1
- Simona Ignat 1
- Carmen Klaussner 1
- Louis Mahon 1
- Anais Murat 1
- Alberto Poncelas 1
- Christian Saam 1
- Vincent Wade 1