Ruli Manurung

Also published as: R. Manurung


2018

Cross-Lingual and Supervised Learning Approach for Indonesian Word Sense Disambiguation Task
Rahmad Mahendra | Heninggar Septiantri | Haryo Akbarianto Wibowo | Ruli Manurung | Mirna Adriani
Proceedings of the 9th Global Wordnet Conference

Ambiguity is a problem frequently faced in Natural Language Processing. Word Sense Disambiguation (WSD) is the task of determining the correct sense of an ambiguous word. However, research on WSD for Indonesian is still scarce. The available English-Indonesian parallel corpora and WordNet for both languages can be used to build training data for WSD by applying a Cross-Lingual WSD method. This training data is then used as input to build a model using supervised machine learning algorithms. Our research also examines the use of Word Embedding features to build the WSD model.

2017

CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Daniel Zeman | Martin Popel | Milan Straka | Jan Hajič | Joakim Nivre | Filip Ginter | Juhani Luotolahti | Sampo Pyysalo | Slav Petrov | Martin Potthast | Francis Tyers | Elena Badmaeva | Memduh Gokirmak | Anna Nedoluzhko | Silvie Cinková | Jan Hajič jr. | Jaroslava Hlaváčová | Václava Kettnerová | Zdeňka Urešová | Jenna Kanerva | Stina Ojala | Anna Missilä | Christopher D. Manning | Sebastian Schuster | Siva Reddy | Dima Taji | Nizar Habash | Herman Leung | Marie-Catherine de Marneffe | Manuela Sanguinetti | Maria Simi | Hiroshi Kanayama | Valeria de Paiva | Kira Droganova | Héctor Martínez Alonso | Çağrı Çöltekin | Umut Sulubacak | Hans Uszkoreit | Vivien Macketanz | Aljoscha Burchardt | Kim Harris | Katrin Marheinecke | Georg Rehm | Tolga Kayadelen | Mohammed Attia | Ali Elkahky | Zhuoran Yu | Emily Pitler | Saran Lertpradit | Michael Mandl | Jesse Kirchner | Hector Fernandez Alcalde | Jana Strnadová | Esha Banerjee | Ruli Manurung | Antonio Stella | Atsuko Shimada | Sookyoung Kwak | Gustavo Mendonça | Tatiana Lando | Rattima Nitisaroj | Josie Li
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.

2015

Measuring the Structural and Conceptual Similarity of Folktales using Plot Graphs
Victoria Anugrah Lestari | Ruli Manurung
Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)

Automatic Identification of Age-Appropriate Ratings of Song Lyrics
Anggi Maulidyani | Ruli Manurung
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

Automatic Wayang Ontology Construction using Relation Extraction from Free Text
Hadaiq Sanabila | Ruli Manurung
Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)

2012

Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation
Ruli Manurung | Francis Bond
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

A GrAF-compliant Indonesian Speech Recognition Web Service on the Language Grid for Transcription Crowdsourcing
Bayu Distiawan | Ruli Manurung
Proceedings of the Sixth Linguistic Annotation Workshop

2010

Developing an Online Indonesian Corpora Repository
Ruli Manurung | Bayu Distiawan | Desmond Darma Putra
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Representing Story Plans in SUMO
Jeffrey Cua | Ruli Manurung | Ethel Ong | Adam Pease
Proceedings of the NAACL HLT 2010 Second Workshop on Computational Approaches to Linguistic Creativity

2008

Comparing the Value of Latent Semantic Analysis on two English-to-Indonesian lexical mapping tasks
Eliza Margaretha | Ruli Manurung
Proceedings of the Australasian Language Technology Association Workshop 2008

A Two-Level Morphological Analyser for the Indonesian Language
Femphy Pisceldo | Rahmad Mahendra | Ruli Manurung | I Wayan Arka
Proceedings of the Australasian Language Technology Association Workshop 2008

Extending an Indonesian Semantic Analysis-based Question Answering System with Linguistic and World Knowledge Axioms
Rahmad Mahendra | Septina Dian Larasati | Ruli Manurung
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation

An Implementation of a Flexible Author-Reviewer Model of Generation using Genetic Algorithms
Ruli Manurung | Graeme Ritchie | Henry Thompson
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation

2006

Building a Lexical Database for an Interactive Joke-Generator
R. Manurung | D. O’Mara | H. Pain | G. Ritchie | A. Waller
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

As part of a project to construct an interactive program which will encourage children to play with language by building jokes, we have developed a large lexical database, closely based on WordNet. In addition to the standard WordNet information about part of speech, synonymy, hyponymy, etc., we have added phonetic representations and symbolic links allowing attachment of pictures. All information is represented in a relational database, allowing powerful searches using SQL via a Java API. The lexicon has a facility to label subsets of the lexicon with symbolic names, and we are working to incorporate some educationally relevant word lists as sublexicons. This should also allow us to improve the familiarity ratings which the lexicon assigns to words.