Hans van Halteren

Also published as: Hans Van Halteren


2020

The Connection between the Text and Images of News Articles: New Insights for Multimedia Analysis
Nelleke Oostdijk | Hans van Halteren | Erkan Bașar | Martha Larson
Proceedings of the Twelfth Language Resources and Evaluation Conference

We report on a case study of text and images that reveals the inadequacy of simplistic assumptions about their connection and interplay. The context of our work is a larger effort to create automatic systems that can extract event information from online news articles about flooding disasters. We carry out a manual analysis of 1000 articles containing a keyword related to flooding. The analysis reveals that the articles in our data set cluster into seven categories related to different topical aspects of flooding, and that the images accompanying the articles cluster into five categories related to the content they depict. The results demonstrate that flood-related news articles do not consistently report on a single, currently unfolding flooding event, and that we should not assume that a flood-related image will directly relate to a flooding event described in the corresponding article. In particular, spatiotemporal distance is important. We validate the manual analysis with an automatic classifier, demonstrating the technical feasibility of multimedia analysis approaches that admit more realistic relationships between text and images. In sum, our case study confirms that closer attention to the connection between text and images has the potential to improve the collection of multimodal information from news articles.

2019

Team Taurus at SemEval-2019 Task 9: Expert-informed pattern recognition for suggestion mining
Nelleke Oostdijk | Hans van Halteren
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper presents our submissions to SemEval-2019 Task 9, Suggestion Mining. Our system is one in a series of systems in which we compare an approach using expert-defined rules with a comparable one using machine learning. We target tasks with a syntactic or semantic component that might be better described by a human understanding the task than by a machine learner only able to count features. For SemEval-2019 Task 9, the expert rules clearly outperformed our machine learning model when training and testing on equally balanced test sets.

2018

Identification of Differences between Dutch Language Varieties with the VarDial2018 Dutch-Flemish Subtitle Data
Hans van Halteren | Nelleke Oostdijk
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)

With the goal of discovering differences between Belgian and Netherlandic Dutch, we participated as Team Taurus in the Dutch-Flemish Subtitles task of VarDial2018. We used a rather simple marker-based method, but a wide range of features, including lexical, lexico-syntactic and syntactic ones, and achieved a second position in the ranking. Inspection of highly distinguishing features did point towards differences between the two language varieties, but because of the nature of the experimental data, we have to treat our observations as very tentative and in need of further investigation.

2008

Source Language Markers in EUROPARL Translations
Hans van Halteren
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2004

Linguistic Profiling for Authorship Recognition and Verification
Hans van Halteren
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

Evaluating Information Content by Factoid Analysis: Human annotation and stability
Simone Teufel | Hans van Halteren
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

Linguistic profiling of texts for the purpose of language verification
Hans van Halteren | Nelleke Oostdijk
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Agreement in Human Factoid Annotation for Summarization Evaluation
Simone Teufel | Hans van Halteren
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Examining the consensus between human summaries: initial experiments with factoid analysis
Hans van Halteren | Simone Teufel
Proceedings of the HLT-NAACL 03 Text Summarization Workshop

2002

Teaching NLP/CL through Games: the Case of Parsing
Hans van Halteren
Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics

2001

Improving Accuracy in word class tagging through the Combination of Machine Learning Systems
Hans Van Halteren | Jakub Zavrel | Walter Daelemans
Computational Linguistics, Volume 27, Number 2, June 2001

2000

A Default First Order Family Weight Determination Procedure for WPDV Models
Hans van Halteren
Fourth Conference on Computational Natural Language Learning and the Second Learning Language in Logic Workshop

Chunking with WPDV Models
Hans van Halteren
Fourth Conference on Computational Natural Language Learning and the Second Learning Language in Logic Workshop

The Detection of Inconsistency in Manually Tagged Text
Hans van Halteren
Proceedings of the COLING-2000 Workshop on Linguistically Interpreted Corpora

1998

Improving Data Driven Wordclass Tagging by System Combination
Hans van Halteren | Jakub Zavrel | Walter Daelemans
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Improving Data Driven Wordclass Tagging by System Combination
Hans van Halteren | Jakub Zavrel | Walter Daelemans
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics