P. K. Srijith


2024

TL-CL: Task And Language Incremental Continual Learning
Shrey Satapara | P. K. Srijith
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

This paper introduces and investigates the problem of Task and Language Incremental Continual Learning (TLCL), wherein a multilingual model is systematically updated to accommodate new tasks in previously learned languages or new languages for established tasks. This significant yet previously unexplored area holds substantial practical relevance as it mirrors the dynamic requirements of real-world applications. We benchmark a representative set of continual learning (CL) algorithms for TLCL. Furthermore, we propose Task and Language-Specific Adapters (TLSA), an adapter-based parameter-efficient fine-tuning strategy. TLSA facilitates cross-lingual and cross-task transfer and outperforms other parameter-efficient fine-tuning techniques. Crucially, TLSA reduces the parameter growth caused by storing adapters from polynomial complexity, as in parameter isolation-based adapter tuning, to linear complexity. We conducted experiments on several NLP tasks arising across several languages. We observed that TLSA outperforms all other parameter-efficient approaches without requiring access to historical data for replay.
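To make the linear-versus-polynomial contrast in the abstract concrete, here is a minimal counting sketch. It rests on an assumption the abstract does not spell out: that parameter isolation stores one adapter per (task, language) pair, while a factored scheme stores one adapter per task plus one per language. The function names are hypothetical and this is not the paper's implementation.

```python
def adapters_per_pair(num_tasks: int, num_languages: int) -> int:
    """Assumed isolation scheme: one adapter for every (task, language) pair."""
    return num_tasks * num_languages


def adapters_factored(num_tasks: int, num_languages: int) -> int:
    """Assumed factored scheme: one adapter per task plus one per language."""
    return num_tasks + num_languages


# Compare how the stored-adapter count grows as tasks and languages are added.
for n in (2, 5, 10):
    print(n, adapters_per_pair(n, n), adapters_factored(n, n))
```

Under these assumptions the per-pair count grows with the product of tasks and languages, while the factored count grows with their sum.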

2022

Bi-Directional Recurrent Neural Ordinary Differential Equations for Social Media Text Classification
Maunika Tamire | Srinivas Anumasa | P. K. Srijith
Proceedings of the 2nd Workshop on Deriving Insights from User-Generated Text

Classification of posts in social media such as Twitter is difficult due to the noisy and short nature of texts. Sequence classification models based on recurrent neural networks (RNN) are popular for classifying posts that are sequential in nature. RNNs assume the hidden representation dynamics to evolve in a discrete manner and do not consider the exact time of the posting. In this work, we propose to use recurrent neural ordinary differential equations (RNODE) for social media post classification which consider the time of posting and allow the computation of hidden representation to evolve in a time-sensitive continuous manner. In addition, we propose a novel model, Bi-directional RNODE (Bi-RNODE), which can consider the information flow in both the forward and backward directions of posting times to predict the post label. Our experiments demonstrate that RNODE and Bi-RNODE are effective for the problem of stance classification of rumours in social media.
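The idea of letting the hidden state evolve continuously between posts can be illustrated with a toy sketch. It is not the paper's model: a fixed decay function stands in for the learned dynamics, a plain `tanh` stands in for the recurrent update, and a simple Euler integrator replaces a proper ODE solver. All names (`evolve`, `update`, `encode`) are hypothetical.

```python
import math


def evolve(h, dt, steps=10, decay=0.1):
    """Euler-integrate the toy dynamics dh/dt = -decay * h over a time gap dt."""
    step = dt / steps
    for _ in range(steps):
        h = [x + step * (-decay * x) for x in h]
    return h


def update(h, x):
    """Toy discrete update applied when a post arrives."""
    return [math.tanh(hi + xi) for hi, xi in zip(h, x)]


def encode(posts, dim=2):
    """posts: list of (timestamp, feature_vector) pairs in time order."""
    h = [0.0] * dim
    prev_t = None
    for t, x in posts:
        if prev_t is not None:
            # The hidden state changes according to the elapsed time,
            # not just the position of the post in the sequence.
            h = evolve(h, t - prev_t)
        h = update(h, x)
        prev_t = t
    return h


close = encode([(0.0, [0.5, 0.5]), (1.0, [0.5, 0.5])])
far = encode([(0.0, [0.5, 0.5]), (10.0, [0.5, 0.5])])
print(close, far)
```

The two sequences carry identical post features and differ only in the gap between posts, yet they end in different hidden states; a standard RNN would treat them identically.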

2020

Evaluation of Deep Gaussian Processes for Text Classification
P. Jayashree | P. K. Srijith
Proceedings of the Twelfth Language Resources and Evaluation Conference

With the tremendous success of deep learning models on computer vision tasks, there are various emerging works on the Natural Language Processing (NLP) task of Text Classification using parametric models. However, a parametric model constrains the expressiveness of the function and demands enormous empirical effort to come up with a robust model architecture. Also, the huge number of parameters involved in the model causes over-fitting when dealing with small datasets. Deep Gaussian Processes (DGP) offer a Bayesian non-parametric modelling framework with strong function compositionality, and help in overcoming these limitations. In this paper, we propose DGP models for the task of Text Classification and make an empirical comparison of the performance of shallow and Deep Gaussian Process models. Extensive experimentation is performed on benchmark Text Classification datasets such as TREC (Text REtrieval Conference), SST (Stanford Sentiment Treebank), MR (Movie Reviews), and R8 (Reuters-8), which demonstrates the effectiveness of DGP models.

2016

Studying the Temporal Dynamics of Word Co-occurrences: An Application to Event Detection
Daniel Preoţiuc-Pietro | P. K. Srijith | Mark Hepple | Trevor Cohn
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Streaming media provides a number of unique challenges for computational linguistics. This paper studies the temporal variation in word co-occurrence statistics, with application to event detection. We develop a spectral clustering approach to find groups of mutually informative terms occurring in discrete time frames. Experiments on large datasets of tweets show that these groups identify key real world events as they occur in time, despite no explicit supervision. The performance of our method rivals state-of-the-art methods for event detection on F-score, obtaining higher recall at the expense of precision.

Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter
Michal Lukasik | P. K. Srijith | Duy Vu | Kalina Bontcheva | Arkaitz Zubiaga | Trevor Cohn
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

Modeling Tweet Arrival Times using Log-Gaussian Cox Processes
Michal Lukasik | P. K. Srijith | Trevor Cohn | Kalina Bontcheva
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing