Anssi Yli-Jyrä


2020

HELFI: a Hebrew-Greek-Finnish Parallel Bible Corpus with Cross-Lingual Morpheme Alignment
Anssi Yli-Jyrä | Josi Purhonen | Matti Liljeqvist | Arto Antturi | Pekka Nieminen | Kari M. Räntilä | Valtter Luoto
Proceedings of the 12th Language Resources and Evaluation Conference

Twenty-five years ago, morphologically aligned Hebrew-Finnish and Greek-Finnish bitexts (texts accompanied by a translation) were constructed manually in order to create an analytical concordance (Luoto et al., eds. 1997) for a Finnish Bible translation. The creators of the bitexts recently secured the publisher’s permission to release their fine-grained alignment, but the alignment still depended on proprietary, third-party resources such as a copyrighted text edition and proprietary morphological analyses of the source texts. In this paper, we describe a nontrivial editorial process starting from the creation of the original single-purpose database and ending with its reconstruction using only freely available text editions and annotations. This process produced an openly available dataset that contains (i) the source texts and their translations, (ii) the morphological analyses, and (iii) the cross-lingual morpheme alignments.

2019

Transition-Based Coding and Formal Language Theory for Ordered Digraphs
Anssi Yli-Jyrä
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing

Transition-based parsing of natural language uses transition systems to build directed annotation graphs (digraphs) for sentences. In this paper, we define, for an arbitrary ordered digraph, a unique decomposition and a corresponding linear encoding that are associated bijectively with each other via a new transition system. These results give us an efficient and succinct representation for digraphs and sets of digraphs. Based on the system and our analysis of its syntactic properties, we give structural bounds under which the set of encoded digraphs is restricted and becomes a context-free or a regular string language. The context-free restriction is essentially a superset of the encodings used previously to characterize properties of noncrossing digraphs and to solve maximal subgraph problems. The regular restriction with a tight bound is shown to capture the Universal Dependencies v2.4 treebanks in linguistics.

2017

Bounded-Depth High-Coverage Search Space for Noncrossing Parses
Anssi Yli-Jyrä
Proceedings of the 13th International Conference on Finite State Methods and Natural Language Processing (FSMNLP 2017)

Generic Axiomatization of Families of Noncrossing Graphs in Dependency Parsing
Anssi Yli-Jyrä | Carlos Gómez-Rodríguez
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a simple encoding for unlabeled noncrossing graphs and show how its latent counterpart helps us to represent several families of directed and undirected graphs used in syntactic and semantic parsing of natural language as context-free languages. The families are separated purely on the basis of forbidden patterns in the latent encoding, eliminating the need to differentiate the families of noncrossing graphs in inference algorithms: one algorithm works for all when the search space can be controlled in the parser input.

2015

Three Equivalent Codes for Autosegmental Representations
Anssi Yli-Jyrä
Proceedings of the 12th International Conference on Finite-State Methods and Natural Language Processing 2015 (FSMNLP 2015 Düsseldorf)

2013

The mathematics of language learning
András Kornai | Gerald Penn | James Rogers | Anssi Yli-Jyrä
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Tutorials)

On Finite-State Tonology with Autosegmental Representations
Anssi Yli-Jyrä
Proceedings of the 11th International Conference on Finite State Methods and Natural Language Processing

2012

Implementation of Replace Rules Using Preference Operator
Senka Drobac | Miikka Silfverberg | Anssi Yli-Jyrä
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing

Refining the Design of a Contracting Finite-State Dependency Parser
Anssi Yli-Jyrä | Jussi Piitulainen | Atro Voutilainen
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing

2011

Compiling Simple Context Restrictions with Nondeterministic Automata
Anssi Yli-Jyrä
Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing

Explorations on Positionwise Flag Diacritics in Finite-State Morphology
Anssi Yli-Jyrä
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011)

2009

An Efficient Double Complementation Algorithm for Superposition-Based Finite-State Morphology
Anssi Yli-Jyrä
Proceedings of the 17th Nordic Conference of Computational Linguistics (NODALIDA 2009)

2004

Axiomatization of Restricted Non-Projective Dependency Trees through Finite-State Constraints that Analyse Crossing Bracketings
Anssi Yli-Jyrä
Proceedings of the Workshop on Recent Advances in Dependency Grammar

2003

Describing Syntax with Star-Free Regular Expressions
Anssi Yli-Jyrä
10th Conference of the European Chapter of the Association for Computational Linguistics