Proceedings of the Workshop on Joint NLP Modelling for Conversational AI @ ICON 2020
Praveen Kumar G S
|
Siddhartha Mukherjee
|
Ranjan Samal
Neighbor Contextual Information Learners for Joint Intent and Slot Prediction
Bharatram Natarajan
|
Gaurav Mathur
|
Sameer Jain
Intent Identification and Slot Identification are two important tasks for Natural Language Understanding (NLU). Exploration in this area has gained significance using networks like RNN, LSTM and GRU. However, models containing the above modules are sequential in nature, which consumes a lot of resources, such as memory, to train the model in the cloud itself. With the advent of many voice assistants delivering offline solutions for many applications, there is a need to find replacements for such sequential networks. Exploration of self-attention and CNN modules has gained pace in recent times. Here we explore CNN-based models like Trellis and modify the architecture to make it bi-directional with fusion techniques. In addition, we propose a CNN with Self-Attention network called Neighbor Contextual Information Projector using Multi-Head Attention (NCIPMA) architecture. These architectures beat the state of the art on open-source datasets like ATIS and SNIPS.
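The multi-head self-attention mechanism at the core of architectures like NCIPMA can be sketched in a few lines; the following is a minimal illustration (the function name and the randomly initialised projection weights are hypothetical, not the authors' implementation):

```python
import numpy as np

def multi_head_self_attention(x, num_heads=2, seed=0):
    """Toy multi-head self-attention over a sequence of token vectors.

    x: (seq_len, d_model) array of token representations.
    Projection weights are randomly initialised here purely for
    illustration; a real model would learn them during training.
    """
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(num_heads):
        # Per-head query/key/value projections.
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) * 0.1
                      for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(d_head)              # (seq_len, seq_len)
        # Row-wise softmax: each token attends over all tokens.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outputs.append(weights @ v)
    # Concatenate the heads back to d_model, as in standard multi-head attention.
    return np.concatenate(outputs, axis=-1)             # (seq_len, d_model)
```

Because every token attends to every other token in one matrix multiplication, such a module is parallelisable, unlike the step-by-step recurrence of RNN/LSTM/GRU layers the abstract contrasts it with.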
Unified Multi Intent Order and Slot Prediction using Selective Learning Propagation
Bharatram Natarajan
|
Priyank Chhipa
|
Kritika Yadav
|
Divya Verma Gogoi
Natural Language Understanding (NLU) involves two important tasks, namely Intent Determination (ID) and Slot Filling (SF). With recent advancements in Intent Determination and Slot Filling, exploration of handling multiple intents in a single utterance is increasing, to make NLU more conversation-based rather than command-execution-based. Many have proven this task with huge multi-intent training data. In addition, a lot of research has addressed the multi-intent problem only. The multi-intent problem also poses the challenge of determining the order of execution of the intents found. Hence, we propose a unified architecture to address multi-intent detection, associated slots detection and the order of execution of found intents using a low proportion of multi-intent corpus in the training data. This architecture consists of a Multi Word Importance relation propagator using Multi-Head GRU and an Importance learner propagator module using self-attention. This architecture has beaten the state of the art by 2.58% on the MultiIntentData dataset.
EmpLite: A Lightweight Sequence Labeling Model for Emphasis Selection of Short Texts
Vibhav Agarwal
|
Sourav Ghosh
|
Kranti Ch
|
Bharath Challa
|
Sonal Kumari
|
Harshavardhana
|
Barath Raj Kandur Raja
Word emphasis in textual content aims at conveying the desired intention by changing the size, color, typeface, style (bold, italic, etc.), and other typographical features. The emphasized words are extremely helpful in drawing the readers’ attention to specific information that the authors wish to emphasize. However, performing such emphasis using a soft keyboard for social media interactions is time-consuming and has an associated learning curve. In this paper, we propose a novel approach to automate emphasis word detection on short written texts. To the best of our knowledge, this work presents the first lightweight deep learning approach for smartphone deployment of emphasis selection. Experimental results show that our approach achieves comparable accuracy at a much lower model size than existing models. Our best lightweight model has a memory footprint of 2.82 MB with a matching score of 0.716 on the SemEval-2020 public benchmark dataset.
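Emphasis selection as sequence labeling reduces, at inference time, to scoring each token and marking the highest-scoring ones. A minimal sketch of that final selection step (the function name, the top-k policy, and the example scores are hypothetical, standing in for a trained model's per-token probabilities):

```python
def select_emphasis(tokens, scores, top_k=2):
    """Mark the top_k tokens with the highest predicted emphasis
    scores, the last step of a sequence-labelling emphasiser.

    tokens: list of word strings; scores: parallel list of floats.
    Returns (token, emphasized?) pairs in the original order.
    """
    # Rank token indices by score, highest first.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:top_k])
    return [(tok, i in chosen) for i, tok in enumerate(tokens)]
```

For example, with scores from a hypothetical model, `select_emphasis(["free", "shipping", "today", "only"], [0.9, 0.2, 0.1, 0.8])` would emphasize "free" and "only".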
Named Entity Popularity Determination using Ensemble Learning
Vikram Karthikeyan
|
B Shrikara Varna
|
Amogha Hegde
|
Govind Satwani
|
Shambhavi B R
|
Jayarekha P
|
Ranjan Samal
Determining the popularity of a Named Entity after completion of the Named Entity Recognition (NER) task finds many applications. This work studies Named Entities of the Music and Movie domains and solves the problem considering 11 relevant features. Decision Tree and Random Forest approaches were applied to the dataset, and the latter algorithm resulted in acceptable accuracy.
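The ensemble principle behind the Random Forest result, many weak tree classifiers voting on each entity's popularity, can be illustrated with a minimal sketch (single-feature "stumps" stand in for full trees, and bootstrap sampling and random feature subsets are omitted for brevity; the feature values and helper names are hypothetical):

```python
def train_stump(data, feature_idx):
    """Fit a one-feature threshold classifier (a 'stump'), the
    simplest tree a forest could use; the threshold here is just
    the feature's mean over the training data.

    data: list of (feature_vector, binary_label) pairs.
    """
    threshold = sum(x[feature_idx] for x, _ in data) / len(data)
    # Label predicted above the threshold = majority label there.
    above = [y for x, y in data if x[feature_idx] > threshold]
    label_above = max(set(above), key=above.count) if above else 1
    return feature_idx, threshold, label_above

def forest_predict(stumps, x):
    """Majority vote across the ensemble, as in a Random Forest."""
    votes = [label if x[f] > t else 1 - label for f, t, label in stumps]
    return max(set(votes), key=votes.count)
```

In the paper's setting each tree would instead split on several of the 11 popularity features, but the aggregation step, averaging many independent votes to reduce variance, is the same idea.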
Optimized Web-Crawling of Conversational Data from Social Media and Context-Based Filtering
Annapurna P Patil
|
Rajarajeswari Subramanian
|
Gaurav Karkal
|
Keerthana Purushotham
|
Jugal Wadhwa
|
K Dhanush Reddy
|
Meer Sawood
Building chatbots requires a large amount of conversational data. In this paper, a web crawler is designed to fetch multi-turn dialogues from websites such as Twitter, YouTube and Reddit in the form of a JavaScript Object Notation (JSON) file. Tools like the Twitter Application Programming Interface (API), the LXML library, and the JSON library are used to crawl Twitter, YouTube and Reddit to collect conversational chat data. The data obtained in raw form cannot be used directly, as it carries only text along with metadata such as author name and time that provide more information on the scraped chat data. The collected data has to be formatted for the intended use case, and the JSON library of Python allows us to format it easily. The scraped dialogues are further filtered based on the context of a search keyword, without introducing bias and with flexible strictness of classification.
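The "flexible strictness" filtering step described above can be sketched as a threshold on the fraction of search-keyword terms a dialogue must contain (the dialogue JSON schema, function name, and threshold semantics here are assumptions for illustration, not the authors' exact implementation):

```python
def filter_dialogues(dialogues, keyword_terms, strictness=0.5):
    """Keep dialogues whose concatenated turn text matches at least
    a `strictness` fraction of the search-keyword terms.

    dialogues: list of dicts like {"turns": [{"text": ...}, ...]},
    as might be loaded from the crawler's JSON output.
    strictness: 0.0 (keep anything) .. 1.0 (require every term).
    """
    kept = []
    for d in dialogues:
        text = " ".join(turn["text"].lower() for turn in d["turns"])
        hits = sum(1 for term in keyword_terms if term.lower() in text)
        if hits / len(keyword_terms) >= strictness:
            kept.append(d)
    return kept
```

Raising `strictness` toward 1.0 trades recall for precision without hand-picking dialogues, which matches the abstract's goal of tunable filtering that avoids introducing bias; the surviving dialogues can be written back out with `json.dump`.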
A character representation enhanced on-device Intent Classification
Sudeep Deepak Shivnikar
|
Himanshu Arora
|
Harichandana B S S
Intent classification is an important task in natural language understanding systems. Existing approaches have achieved perfect scores on the benchmark datasets. However, they are not suitable for deployment on low-resource devices like mobiles, tablets, etc. due to their massive model size. Therefore, in this paper, we present a novel lightweight architecture for intent classification that can run efficiently on-device. We use character features to enrich the word representation. Our experiments prove that our proposed model outperforms existing approaches and achieves state-of-the-art results on benchmark datasets. We also report that our model has a tiny memory footprint of ~5 MB and a low inference time of ~2 milliseconds, which proves its efficiency in a resource-constrained environment.
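One lightweight way to enrich a word representation with character features, in the spirit of the abstract, is to hash character n-grams into a small fixed-size vector and concatenate it with the word embedding. This sketch is an assumption about the general technique, not the paper's architecture; the function names, n-gram size, and dimensions are hypothetical:

```python
def char_ngram_features(word, n=3, dim=16):
    """Hash the character n-grams of a word into a fixed-size count
    vector, a cheap stand-in for a learned character encoder.
    Boundary markers < and > let the hash distinguish prefixes
    and suffixes from word-internal n-grams.
    """
    padded = f"<{word}>"
    vec = [0.0] * dim
    for i in range(len(padded) - n + 1):
        # Python's built-in hash is stable within one process.
        vec[hash(padded[i:i + n]) % dim] += 1.0
    return vec

def enrich(word_embedding, word):
    """Concatenate word-level and character-level features into a
    single input vector for the classifier."""
    return list(word_embedding) + char_ngram_features(word)
```

Because the character vector has a fixed small dimension regardless of vocabulary size, this kind of feature adds robustness to misspellings and rare words at negligible memory cost, which is the trade-off an on-device model needs.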