Ruhi Sarikaya - ACL Anthology

Ruhi Sarikaya

Also published as: Ruhi Srikaya

2024

EVEDIT: Event-based Knowledge Editing for Deterministic Knowledge Propagation
Jiateng Liu | Pengfei Yu | Yuji Zhang | Sha Li | Zixuan Zhang | Ruhi Sarikaya | Kevin Small | Heng Ji
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

The dynamic nature of real-world information necessitates knowledge editing (KE) in large language models (LLMs). The edited knowledge should propagate and facilitate the deduction of new information based on existing model knowledge. We term the existing related knowledge in the LLM that serves as the origin of knowledge propagation "deduction anchors". However, current KE approaches operate only on (subject, relation, object) triples. We observe, both theoretically and empirically, that this simplified setting often leads to uncertainty when determining the deduction anchors, causing low confidence in the models' answers. To mitigate this issue, we propose a novel task of event-based knowledge editing that pairs facts with event descriptions. This task is not only a closer simulation of real-world editing scenarios but also a more logically sound setting, implicitly defining the deduction anchor and enabling LLMs to propagate knowledge confidently. We curate a new benchmark dataset Evedit derived from the CounterFact dataset and validate its superiority in improving model confidence. Moreover, while we observe that the event-based setting is significantly challenging for existing approaches, we propose a novel approach Self-Edit that showcases stronger performance, achieving 55.6% consistency improvement while maintaining the naturalness of generation.

2021

A Scalable Framework for Learning From Implicit User Feedback to Improve Natural Language Understanding in Large-Scale Conversational AI Systems
Sunghyun Park | Han Li | Ameen Patel | Sidharth Mudgal | Sungjin Lee | Young-Bum Kim | Spyros Matsoukas | Ruhi Sarikaya
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Natural Language Understanding (NLU) is an established component within a conversational AI or digital assistant system, and it is responsible for producing semantic understanding of a user request. We propose a scalable and automatic approach for improving NLU in a large-scale conversational AI system by leveraging implicit user feedback, with an insight that user interaction data and dialog context have rich information embedded from which user satisfaction and intention can be inferred. In particular, we propose a domain-agnostic framework for curating new supervision data for improving NLU from live production traffic. With an extensive set of experiments, we show the results of applying the framework and improving NLU for a large-scale production system across 10 domains.

Learning Slice-Aware Representations with Mixture of Attentions
Cheng Wang | Sungjin Lee | Sunghyun Park | Han Li | Young-Bum Kim | Ruhi Sarikaya
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2019

Locale-agnostic Universal Domain Classification Model in Spoken Language Understanding
Jihwan Lee | Ruhi Sarikaya | Young-Bum Kim
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers)

In this paper, we introduce an approach for leveraging available data across multiple locales sharing the same language to 1) improve domain classification model accuracy in Spoken Language Understanding and user experience even if new locales do not have sufficient data and 2) reduce the cost of scaling the domain classifier to a large number of locales. We propose a locale-agnostic universal domain classification model based on selective multi-task learning that learns a joint representation of an utterance over locales with different sets of domains and allows locales to share knowledge selectively depending on the domains. The experimental results demonstrate the effectiveness of our approach on the domain classification task in the scenario of multiple locales with imbalanced data and disparate domain sets. The proposed approach outperforms other baseline models, especially when classifying locale-specific domains and low-resourced domains.

Continuous Learning for Large-scale Personalized Domain Classification
Han Li | Jihwan Lee | Sidharth Mudgal | Ruhi Sarikaya | Young-Bum Kim
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Domain classification is the task of mapping spoken language utterances to one of the natural language understanding domains in intelligent personal digital assistants (IPDAs). This is observed in mainstream IPDAs in industry, and third-party domains are developed to enhance the capability of the IPDAs. As new domains are developed very frequently, continuously accommodating them remains challenging. Moreover, if one wants to use personalized information dynamically for better domain classification, it is infeasible to directly adopt existing continual learning approaches. In this paper, we propose CoNDA, a neural-based approach for continuous domain adaption with normalization and regularization. Unlike existing methods that often conduct full model parameter updates, CoNDA only updates the necessary parameters in the model for the new domains. Empirical evaluation shows that CoNDA achieves high accuracy on both the accommodated new domains and the existing known domains for which input samples come with personal information, and outperforms the baselines by a large margin.

2018

Efficient Large-Scale Neural Domain Classification with Personalized Attention
Young-Bum Kim | Dongchan Kim | Anjishnu Kumar | Ruhi Sarikaya
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we explore the task of mapping spoken language utterances to one of thousands of natural language understanding domains in intelligent personal digital assistants (IPDAs). This scenario is observed in mainstream IPDAs in industry that allow third parties to develop thousands of new domains to augment built-in first-party domains to rapidly increase domain coverage and overall IPDA capabilities. We propose a scalable neural model architecture with a shared encoder, a novel attention mechanism that incorporates personalization information, and domain-specific classifiers, which solves the problem efficiently. Our architecture is designed to efficiently accommodate incremental domain additions, achieving a speedup of two orders of magnitude compared to full model retraining. We consider the practical constraints of real-time production systems, and design to minimize memory footprint and runtime latency. We demonstrate that incorporating personalization significantly improves domain classification accuracy in a setting with thousands of overlapping domains.

A Scalable Neural Shortlisting-Reranking Approach for Large-Scale Domain Classification in Natural Language Understanding
Young-Bum Kim | Dongchan Kim | Joo-Kyung Kim | Ruhi Sarikaya
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

Intelligent personal digital assistants (IPDAs), a popular real-life application with spoken language understanding capabilities, can cover potentially thousands of overlapping domains for natural language understanding, and the task of finding the best domain to handle an utterance becomes a challenging problem on a large scale. In this paper, we propose a set of efficient and scalable shortlisting-reranking neural models for effective large-scale domain classification for IPDAs. The shortlisting stage focuses on efficiently trimming all domains down to a list of k-best candidate domains, and the reranking stage performs a list-wise reranking of the initial k-best domains with additional contextual information. We show the effectiveness of our approach with extensive experiments on 1,500 IPDA domains.

2017

Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources
Joo-Kyung Kim | Young-Bum Kim | Ruhi Sarikaya | Eric Fosler-Lussier
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Training a POS tagging model with cross-lingual transfer learning usually requires linguistic knowledge and resources about the relation between the source language and the target language. In this paper, we introduce a cross-lingual transfer learning model for POS tagging without ancillary resources such as parallel corpora. The proposed cross-lingual model utilizes a common BLSTM that enables knowledge transfer from other languages, and private BLSTMs for language-specific representations. The cross-lingual model is trained with language-adversarial training and bidirectional language modeling as auxiliary objectives to better represent language-general information while not losing the information about a specific target language. Evaluating on POS datasets from 14 languages in the Universal Dependencies corpus, we show that the proposed transfer learning model improves the POS tagging performance of the target languages without exploiting any linguistic knowledge between the source language and the target language.

2016

Natural Language Model Re-usability for Scaling to Different Domains
Young-Bum Kim | Alexandre Rochette | Ruhi Sarikaya
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Scalable Semi-Supervised Query Classification Using Matrix Sketching
Young-Bum Kim | Karl Stratos | Ruhi Sarikaya
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Domainless Adaptation by Constrained Decoding on a Schema Lattice
Young-Bum Kim | Karl Stratos | Ruhi Sarikaya
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In many applications such as personal digital assistants, there is a constant need for new domains to increase the system’s coverage of user queries. A conventional approach is to learn a separate model every time a new domain is introduced. This approach is slow, inefficient, and a bottleneck for scaling to a large number of domains. In this paper, we introduce a framework that allows us to have a single model that can handle all domains, including unknown domains that may be created in the future, as long as they are covered in the master schema. The key idea is to remove the need for distinguishing domains by explicitly predicting the schema of queries. Given the permitted schema of a query, we perform constrained decoding on a lattice of slot sequences allowed under the schema. The proposed model achieves competitive and often superior performance over the conventional model trained separately per domain.

Frustratingly Easy Neural Domain Adaptation
Young-Bum Kim | Karl Stratos | Ruhi Sarikaya
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Popular techniques for domain adaptation such as the feature augmentation method of Daumé III (2009) have mostly been considered for sparse binary-valued features, but not for dense real-valued features such as those used in neural networks. In this paper, we describe simple neural extensions of these techniques. First, we propose a natural generalization of the feature augmentation method that uses K + 1 LSTMs where one model captures global patterns across all K domains and the remaining K models capture domain-specific information. Second, we propose a novel application of the framework for learning shared structures by Ando and Zhang (2005) to domain adaptation, and also provide a neural extension of their approach. In experiments on slot tagging over 17 domains, our methods give clear performance improvement over Daumé III (2009) applied on feature-rich CRFs.

Drop-out Conditional Random Fields for Twitter with Huge Mined Gazetteer
Eunsuk Yang | Young-Bum Kim | Ruhi Sarikaya | Yu-Seop Kim
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2015

Compact Lexicon Selection with Spectral Methods
Young-Bum Kim | Karl Stratos | Xiaohu Liu | Ruhi Sarikaya
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

New Transfer Learning Techniques for Disparate Label Sets
Young-Bum Kim | Karl Stratos | Ruhi Sarikaya | Minwoo Jeong
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Weakly Supervised Slot Tagging with Partially Labeled Sequences from Web Search Click Logs
Young-Bum Kim | Minwoo Jeong | Karl Stratos | Ruhi Sarikaya
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Pre-training of Hidden-Unit CRFs
Young-Bum Kim | Karl Stratos | Ruhi Sarikaya
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Part-of-speech Taggers for Low-resource Languages using CCA Features
Young-Bum Kim | Benjamin Snyder | Ruhi Sarikaya
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

Resolving Referring Expressions in Conversational Dialogs for Natural User Interfaces
Asli Celikyilmaz | Zhaleh Feizollahi | Dilek Hakkani-Tur | Ruhi Sarikaya
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Semi-Supervised Semantic Tagging of Conversational Understanding using Markov Topic Regression
Asli Celikyilmaz | Dilek Hakkani-Tur | Gokhan Tur | Ruhi Sarikaya
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2009

Tied-Mixture Language Modeling in Continuous Space
Ruhi Sarikaya | Mohamed Afify | Brian Kingsbury
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2007

Joint Morphological-Lexical Language Modeling for Machine Translation
Ruhi Sarikaya | Yonggang Deng
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

2006

IBM MASTOR SYSTEM: Multilingual Automatic Speech-to-Speech Translator
Yuqing Gao | Bowen Zhou | Ruhi Sarikaya | Mohamed Afify | Hong-Kwang Kuo | Wei-zhong Zhu | Yonggang Deng | Charles Prosser | Wei Zhang | Laurent Besacier
Proceedings of the First International Workshop on Medical Speech Translation

Maximum Entropy Based Restoration of Arabic Diacritics
Imed Zitouni | Jeffrey S. Sorensen | Ruhi Sarikaya
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2004

A Comparison of Rule–Based and Statistical Methods for Semantic Language Modeling and Confidence Measurement
Ruhi Srikaya | Yuqing Gao | Michael Picheny
Proceedings of HLT-NAACL 2004: Short Papers

Co-authors

Mohamed Afify 2

Yonggang Deng 2

Zhaleh Feizollahi 2

Dilek Hakkani-Tur 2

Joo-Kyung Kim 2

Sidharth Mudgal 2

Sunghyun Park 2

Alexandre Rochette 2

Vipul Agarwal 1

Khushboo Aggarwal 1

Tasos Anastasakos 1

Laurent Besacier 1

Senthilkumar Chandramohan 1

Paul A. Crook 1

Eric Fosler-Lussier 1

Roman Holenstein 1

Brian Kingsbury 1

Elizabeth Krawczyk 1

Anjishnu Kumar 1

Hong-Kwang Kuo 1

Spyros Matsoukas 1

Michael Picheny 1

Charles Prosser 1

Vasiliy Radostev 1

Nikhil Ramesh 1

Jean-Phillipe Robichaud 1

Benjamin Snyder 1

Jeffrey Sorensen 1

Logan Stromberg 1

Wei-zhong Zhu 1
