Márton Miháltz


Mapping Ontologies Using Ontologies: Cross-lingual Semantic Role Information Transfer
Balázs Indig | Márton Miháltz | András Simonyi
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper presents the process of enriching the verb frame database of a Hungarian natural language parser to enable the assignment of semantic roles. We accomplished this by linking the parser’s verb frame database to existing linguistic resources such as VerbNet and WordNet, and automatically transferring back semantic knowledge. We developed OWL ontologies that map the various constraint description formalisms of the linked resources and employed a logical reasoning device to facilitate the linking procedure. We present results and discuss the challenges and pitfalls that arose from this undertaking.


Beyond Sentiment: Social Psychological Analysis of Political Facebook Comments in Hungary
Márton Miháltz | Tamás Váradi | István Csertő | Éva Fülöp | Tibor Pólya | Pál Kővágó
Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis


What Do We Drink? Automatically Extending Hungarian WordNet With Selectional Preference Relations
Márton Miháltz | Bálint Sass
Proceedings of the Joint Symposium on Semantic Processing. Textual Inference and Structures in Corpora


Knowledge-based Coreference Resolution for Hungarian
Márton Miháltz
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present a knowledge-based coreference resolution system for noun phrases in Hungarian texts. The system is used as a module in an automated psychological text processing project. Our system uses rules that rely on knowledge from the morphological, syntactic and semantic output of a deep parser and semantic relations from the Hungarian WordNet ontology. We also use rules that rely on Binding Theory, research results in Hungarian psycholinguistics, current research on proper name coreference identification and our own heuristics. We describe in detail the constraints-and-preferences algorithm that attempts to find coreference information for proper names, common nouns, pronouns and zero pronouns in texts. We present evaluation results for our system on a corpus manually annotated with coreference relations. Precision of the resolution of various coreference types reaches up to 80%, while overall recall is 63%. We also present an investigation of the various error types our system produced along with an analysis of the results.
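
The general shape of a constraints-and-preferences resolver can be sketched as follows. This is a minimal illustration and not the system described in the abstract: the `Mention` fields, the two constraints, the two preferences and their weights are all hypothetical stand-ins for the parser output and WordNet-based rules the paper relies on.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    text: str
    position: int     # token index in the text (hypothetical feature)
    number: str       # "sg" or "pl" (hypothetical feature)
    is_subject: bool  # grammatical subject? (hypothetical feature)


# Constraints are hard filters: a candidate failing any of them is discarded.
def precedes(anaphor, cand):
    return cand.position < anaphor.position


def agrees_in_number(anaphor, cand):
    return anaphor.number == cand.number


CONSTRAINTS = [precedes, agrees_in_number]


# Preferences are soft, weighted scores used to rank surviving candidates.
def recency(anaphor, cand):
    # Safe only after the `precedes` constraint has been applied.
    return 1.0 / (anaphor.position - cand.position)


def subjecthood(anaphor, cand):
    return 1.0 if cand.is_subject else 0.0


PREFERENCES = [(recency, 1.0), (subjecthood, 0.5)]  # illustrative weights


def resolve(anaphor, candidates):
    """Return the best-scoring candidate that passes every constraint, or None."""
    viable = [c for c in candidates
              if all(check(anaphor, c) for check in CONSTRAINTS)]
    if not viable:
        return None
    return max(viable,
               key=lambda c: sum(w * f(anaphor, c) for f, w in PREFERENCES))


mary = Mention("Mary", 0, "sg", True)
books = Mention("books", 3, "pl", False)
john = Mention("John", 5, "sg", False)
she = Mention("she", 8, "sg", False)

antecedent = resolve(she, [mary, books, john])
print(antecedent.text)
```

In this toy setup "books" is removed by the number constraint, and the subject preference outweighs recency, so "Mary" is chosen over "John". A real resolver would add many more constraints (for example gender or semantic class compatibility) and tune the weights on annotated data.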


Exploiting Parallel Corpora for Supervised Word-Sense Disambiguation in English-Hungarian Machine Translation
Márton Miháltz | Gábor Pohl
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we present an experiment to automatically generate annotated training corpora for a supervised word sense disambiguation module operating in an English-Hungarian and a Hungarian-English machine translation system. Training examples for the WSD module of the MT system are produced by annotating ambiguous lexical items in the source language (words having several possible translations) with their proper target language translations. Since manually annotating training examples is very costly, we are experimenting with a method to produce examples automatically from parallel corpora. Our algorithm relies on monolingual and bilingual lexicons and dictionaries in addition to statistical methods in order to annotate examples extracted from a large English-Hungarian parallel corpus accurately aligned at sentence level. In the paper, we present an experiment with the English noun "state", where we categorized the different occurrences in the Hunglish parallel corpus. For this noun, most of the examples were covered by multiword lexical items originating from our lexical sources.
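
The core idea of labeling an ambiguous source word with the translation found in the aligned target sentence can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' algorithm: `label_occurrences`, the toy sentence pairs and the candidate list are hypothetical, and matching is done on whitespace tokens, whereas Hungarian text would in practice need lemmatization and handling of multiword items.

```python
def label_occurrences(pairs, word, translations):
    """Label source sentences containing `word` with a target translation.

    `pairs` is a list of (source, target) sentence pairs aligned at
    sentence level; `translations` lists candidate target-language
    equivalents of `word`. A pair is kept only when exactly one candidate
    occurs in the target sentence, so ambiguous or unmatched pairs are
    skipped rather than guessed.
    """
    examples = []
    for src, tgt in pairs:
        if word not in src.lower().split():
            continue
        tgt_tokens = set(tgt.lower().split())
        hits = [t for t in translations if t in tgt_tokens]
        if len(hits) == 1:
            examples.append((src, hits[0]))
    return examples


# Toy aligned pairs (hypothetical, not taken from the Hunglish corpus).
PAIRS = [
    ("The state is good", "Az állapot jó"),
    ("The state raised taxes", "Az állam adót emelt"),
    ("He lives in another state", "Egy másik helyen él"),
]
CANDIDATES = ["állapot", "állam"]  # two candidate translations of "state"

labeled = label_occurrences(PAIRS, "state", CANDIDATES)
for sentence, sense in labeled:
    print(sense, "<-", sentence)
```

The third pair contains "state" but none of the candidate translations, so it yields no training example. Skipping such cases trades coverage for label quality, which seems a reasonable default when the output is meant to train a supervised classifier.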


Word Sense Disambiguation Using Random Indexing
Márton Miháltz
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)


Automatism and User Interaction: Building a Hungarian WordNet
Gábor Prószéky | Márton Miháltz
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)