Mihai Rotaru


2019

Best Practices for Learning Domain-Specific Cross-Lingual Embeddings
Lena Shakurova | Beata Nyari | Chao Li | Mihai Rotaru
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

Cross-lingual embeddings aim to represent words in multiple languages in a shared vector space by capturing semantic similarities across languages. They are a crucial component for scaling tasks to multiple languages by transferring knowledge from languages with rich resources to low-resource languages. A common approach to learning cross-lingual embeddings is to train monolingual embeddings separately for each language and learn a linear projection from the monolingual spaces into a shared space, where the mapping relies on a small seed dictionary. While there are high-quality generic seed dictionaries and pre-trained cross-lingual embeddings available for many language pairs, there is little research on how they perform on specialised tasks. In this paper, we investigate the best practices for constructing the seed dictionary for a specific domain. We evaluate the embeddings on the sequence labelling task of Curriculum Vitae parsing and show that the size of a bilingual dictionary, the frequency of the dictionary words in the domain corpora and the source of data (task-specific vs generic) influence performance. We also show that the less training data is available in the low-resource language, the more the construction of the bilingual dictionary matters, and demonstrate that some of the choices are crucial in the zero-shot transfer learning case.

2016

Learning Text Similarity with Siamese Recurrent Networks
Paul Neculoiu | Maarten Versteegh | Mihai Rotaru
Proceedings of the 1st Workshop on Representation Learning for NLP

2015

Word Embeddings vs Word Types for Sequence Labeling: the Curious Case of CV Parsing
Melanie Tosik | Carsten Lygteskov Hansen | Gerard Goossen | Mihai Rotaru
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing

2009

Discourse Structure and Performance Analysis: Beyond the Correlation
Mihai Rotaru | Diane Litman
Proceedings of the SIGDIAL 2009 Conference

2007

Exploring Affect-Context Dependencies for Adaptive System Development
Kate Forbes-Riley | Mihai Rotaru | Diane Litman | Joel Tetreault
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

The Utility of a Graphical Representation of Discourse Structure in Spoken Dialogue Systems
Mihai Rotaru | Diane Litman
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Dependencies between Student State and Speech Recognition Problems in Spoken Tutoring Dialogues
Mihai Rotaru | Diane J. Litman
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

Exploiting Discourse Structure for Spoken Dialogue Performance Analysis
Mihai Rotaru | Diane J. Litman
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

2003

Exceptionality and Natural Language Learning
Mihai Rotaru | Diane J. Litman
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003