Elevating Code-mixed Text Handling through Auditory Information of Words

With the growing popularity of code-mixed data, there is an increasing need for better handling of this type of data, which poses a number of challenges, such as dealing with spelling variations, multiple languages, different scripts, and a lack of resources. Current language models face difficulty in effectively handling code-mixed data as they primarily focus on the semantic representation of words and ignore the auditory phonetic features. This leads to difficulties in handling spelling variations in code-mixed text. In this paper, we propose an effective approach for creating language models that handle code-mixed textual data using auditory information of words from SOUNDEX. Our approach includes a pre-training step based on masked language modelling, which incorporates SOUNDEX representations (SAMLM), and a new method of providing input data to the pre-trained model. Through experimentation on various code-mixed datasets (of different languages) for sentiment, offensive, and aggression classification tasks, we establish that our novel language modeling approach (SAMLM) results in improved robustness towards adversarial attacks on code-mixed classification tasks. Additionally, our SAMLM-based approach also results in better classification results over the popular baselines for code-mixed tasks. We use the explainability technique SHAP (SHapley Additive exPlanations) to explain how the auditory features incorporated through SAMLM assist the model in handling code-mixed text effectively and increase robustness against adversarial attacks \footnote{Source code has been made available on \url{https://github.com/20118/DefenseWithPhonetics}, \url{https://www.iitp.ac.in/~ai-nlp-ml/resources.html\#Phonetics}}.


Introduction
The proliferation of code-mixed content on social media platforms among multilingual communities around the globe has been widely observed in recent years. It has been established that handling code-mixed content for information retrieval or classification poses a unique set of challenges. These challenges become even more prominent when a language is written in a different script during code-mixing. Since there are no formal spelling standards for a word in a different script, there can be large variations in spellings (e.g., the Hindi word for 'yes' can be written as 'haan', 'haa', 'ha', etc.). These spelling variations depend on many sociocultural factors, such as dialect, accent, and region (Crystal, 1987). It has been noted that a significant portion of the code-mixed content present on social media platforms is Romanized, which presents a challenge for processing and analysis because no standardized Romanization method is followed. This lack of a standard leads to many complexities and is one of the major roadblocks in training a reliable and robust code-mixed NLP system (Chittaranjan et al., 2014; Vyas et al., 2014). Managing such variations within text data is typically achieved through pre-processing techniques, such as data augmentation and normalization (Kusampudi et al., 2021), which necessitate the use of human-annotated dictionaries and can entail a significant investment in manual annotation effort. It has been observed that traditional techniques for processing and analyzing code-mixed content may prove ineffective when the spelling of a word differs from those present in the corpus or dictionary (Das et al., 2022).
Although transformer-based pre-trained models (Devlin et al., 2018; Liu et al., 2019b) have proven largely effective for most tasks in Natural Language Processing (NLP) (Mamta et al., 2022; Sun et al., 2019), it has been shown that even such models are not robust enough to handle small perturbations in spelling (Das et al., 2022). Such perturbations have been used to perform adversarial attacks even on transformer-based language models. An adversarial attack entails making small, human-imperceptible perturbations to the input to mislead the model. The first study in this direction proposed three adversarial attacks based on phonetic perturbations to test the limits of a code-mixed text classifier, and found that the BERT (Devlin et al., 2018) model was vulnerable to such phonetic perturbations. Van Orden (1987) found that phonetically similar spelling variations of a word are often imperceptible to humans. For example, the words acha (meaning 'okay'), acchha, and achha sound similar when spoken. These properties of words are known as the SMS property (similar sound, similar meaning, different spellings) (Le et al., 2022).
In this paper, we focus on incorporating the auditory phonetic (AP) features of words along with their semantic features in language models. We hypothesize that a model trained by utilizing these features would be agnostic to subtle spelling variations. Such variations are often the Achilles' heel of deep learning systems and are exploited during adversarial attacks. Incorporating these features would also lead to better and more robust classifiers for code-mixed input.
To obtain the AP features, we utilize the SOUNDEX algorithm (Stephenson, 1980). This algorithm encodes the SMS property of words. In this encoding, the words acha (okay), achha, and acchha all receive the same encoding (A200).
To embed these phonetic properties, we propose two novel language modeling approaches, SOUNDEX Masked Language Modelling (SMLM) and SOUNDEX Aligned Masked Language Modelling (SAMLM), that are able to map between the semantic and auditory properties of words in a text. We use these approaches to pre-train BERT and RoBERTa models. We then fine-tune our pre-trained models on downstream classification tasks based on code-mixed Hinglish (Hindi+English) and Benglish (Bengali+English) datasets. We perform phonetic perturbation-based attacks following Das et al. (2022) and find that our SMLM and SAMLM pre-trained models are more robust to such adversarial attacks: both show a smaller drop in performance after the attack than the base BERT and RoBERTa models. Additionally, we also observe an improvement in classification scores on the downstream code-mixed text classification tasks in both languages.
However, these models lack transparency, which makes it difficult to understand their actual decision process. Hence, we leverage model explainability to analyze the decision process of our models by extracting the terms responsible for the final prediction. For this purpose, the explainability technique SHAP (SHapley Additive exPlanations) (Lundberg and Lee, 2017) is used. To the best of our knowledge, this is the first attempt to utilize AP properties to enhance the robustness of models dealing with code-mixed datasets. The key contributions of this work are as follows:

Related Work
Transformer-based pre-trained models have achieved remarkable success in a wide range of NLP tasks (Li et al., 2019; Raffel et al., 2020; Mamta and Ekbal, 2023a). However, several studies have shed light on the vulnerabilities of these models (Sun et al., 2020). Jin et al. (2020) propose a black-box algorithm to attack the BERT model with the help of closest synonyms. However, this can lead to unnatural sentences because a synonym may not fit the context of the sentence. To overcome this limitation, the authors in (Garg and Ramakrishnan, 2020; Li et al., 2020; Mondal, 2021; Mamta and Ekbal, 2022) proposed using a masked language model (BERT or RoBERTa) for replacements or insertions.
There are numerous studies that enhance adversarial robustness using data augmentation, adversarial training (Morris et al., 2020), etc. Data augmentation requires manual human effort, and adversarial training requires re-training models on adversarial data, which is costly. However, all these attempts target the high-resource English language, with the exception of Mamta and Ekbal (2022).
The increasing phenomenon of code-mixing on social media platforms has also motivated researchers to analyze the adversarial robustness of code-mixed models. The authors in (Das et al., 2022) exposed the vulnerability of code-mixed classifiers by performing an adversarial attack based on sub-word perturbations, character repetition, and word language change. However, there has been no attempt to enhance the adversarial robustness of code-mixed text classifiers against these perturbations. This motivated us to develop a robust model to handle adversarial perturbations in code-mixed text.
Researchers have analyzed the behaviour of pre-trained language models (PMLM) for different languages and attempted to enhance their performance on downstream tasks. For example, Hande et al. (2021) conducted experiments on Tamil, Kannada, and Malayalam scripts and observed that multilingual models perform better than monolingual models. Mamta and Ekbal (2023b) proposed a multilingual framework to fine-tune BERT in a shared-private fashion to transfer knowledge between code-mixed and English languages. Rathnayake et al. (2022) perform adapter-based fine-tuning of PMLMs for code-mixed text classification. However, their focus is not on handling phonetic perturbation-based adversarial attacks. There are a few attempts to enrich the representations of pre-trained models like BERT in the speech domain. For example, Sundararaman et al. (2021) proposed a BERT-style language model, referred to as PhonemeBERT, that jointly models phoneme sequences and Automatic Speech Recognition (ASR) errors to learn phonetic-aware representations that are robust to ASR errors. They introduced noise into speech (e.g., noise from doors opening, aircraft, etc.) and handled it using phoneme sequences. However, our task differs from the above in the following aspects: (i) our focus is to enhance the adversarial robustness of code-mixed classifiers against adversarial attacks; (ii) our proposed approach is tuned to handle textual perturbations in code-mixed data rather than perturbations in speech signals.

Threat Model
Our target models are BERT- and RoBERTa-based code-mixed text classifiers due to their huge success in many NLP tasks (Liu et al., 2019a; Xu et al., 2019). An adversary attempts to mislead the target models by generating adversarial samples that cause wrong classification decisions. Adversary's goal: Given an input sentence S consisting of n tokens w_1, w_2, w_3, ..., w_n, with ground-truth label y, and a target model M such that M(S) = y, the goal of the adversary is to perform an untargeted attack, i.e., to find an adversarial sample S_adv causing M to misclassify, i.e., M(S_adv) ≠ y. Adversaries attack the model using phonetic perturbations, in line with the prior work of Das et al. (2022). Design goals: Based on the aforementioned adversary model, our proposed framework (SMLM and SAMLM) must meet the following robustness and accuracy requirements.
• Robustness: SMLM and SAMLM should be robust to adversarial perturbations. They should correctly classify the adversarial samples generated by the adversary.
• Accuracy: SMLM and SAMLM should handle the spelling variations in real code-mixed datasets. As a result, accuracy on actual code-mixed test sets should increase.

Methodology
Our objective is to equip the pre-trained models to increase their robustness against adversarial attacks and to handle phonetic spelling variations in code-mixed datasets. The detailed flow of our proposed approach is shown in Figure 1. There are three main components, viz., pre-training, fine-tuning, and model explainability. First, we pre-train the models (BERT and RoBERTa) to incorporate auditory features, followed by task-specific fine-tuning. Finally, the model explainability component explains the decision process of our proposed approach and illustrates its effectiveness: it analyzes qualitatively how adversarial attacks and phonetic spelling variations are handled by our proposed models.
SOUNDEX Algorithm: To encode the sound of a word, we utilize the case-insensitive SOUNDEX algorithm (Stephenson, 1980). It indexes words based on their sound rather than their spelling. To assign a sound encoding to a given word, SOUNDEX first retains the initial character and removes all vowels. It then maps the remaining characters one by one to digits with the help of predefined rules. In this manner, SOUNDEX assigns the same encoding (A200) to the different variations acha (okay), achha, and acchha. However, in code-mixed text, the same Romanized form might correspond to words with different meanings in two or more languages. For example, the Hindi word yar (friend) and the English word year share the same SOUNDEX code (Y600), but the two words have different meanings. Our proposed approach takes care of this limitation of SOUNDEX.
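As a concrete illustration, the rules above can be sketched in a few lines of Python. This is a simplified American Soundex variant written for this example; the paper's exact implementation may differ:

```python
# Standard American Soundex digit map: vowels (and y) are dropped,
# h/w are transparent, and runs of the same digit collapse to one.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit

def soundex(word: str) -> str:
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return "0000"
    out, prev = [letters[0].upper()], CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = CODES.get(c)
        if digit:
            if digit != prev:      # collapse runs of the same code
                out.append(digit)
            prev = digit
        elif c not in "hw":        # vowels reset the run; h/w are transparent
            prev = ""
    return ("".join(out) + "000")[:4]  # pad/truncate to 4 characters

print(soundex("acha"), soundex("achha"), soundex("acchha"))  # A200 A200 A200
print(soundex("yar"), soundex("year"))                       # Y600 Y600
```

Note how the code collapses all three spellings of acha to A200, while also exposing the yar/year collision discussed above.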

Pre-training
SOUNDEX Masked Language Modelling (SMLM): A common denominator between the spelling variations of a word is their shared AP property. Encoding this auditory property in the model increases the model's robustness and helps in better classification of code-mixed text. To incorporate this property, we use SOUNDEX encodings in our language model along with the usual contextual word encodings. The SOUNDEX sequence A = {s_1, s_2, ..., s_n} for the sentence S = {t_1, t_2, ..., t_n} (t_i is the WordPiece token obtained by passing the sentence through the model tokenizer) is obtained, and a joint input sequence IP = [t_1, t_2, ..., t_n, [SEP], s_1, s_2, ..., s_n] is formed. We follow the masked-language-modeling approach proposed by Devlin et al. (2018) on this sequence. In order to train a deep bidirectional representation, we simply mask some percentage of the input tokens at random and then predict those masked tokens at the output layer of the model. The masked tokens can come from either subsequence S or A. When a token from S is masked, the model predicts the word attending to both the contextual information in S and the auditory information in A. In this manner, the model learns to predict the semantically and auditorily correct word. When a token from the subsequence A is masked, the model learns to predict the auditory SOUNDEX encoding of the respective word in the input sentence. In this way, SMLM can handle the limitation of SOUNDEX.
In all of our experiments, we mask 15% of the WordPiece tokens in each sequence at random. The final loss L at the output layer is given in Equation 1:

L = -(1/N) Σ_{i=1}^{N} log p(x_i | x_1, x_2, ..., x_{i-1}, x_{i+1}, ..., x_N)    (1)

Here, N is the total number of masked tokens in the input sequence (IP in our case), x_i is the i-th masked token, and p(x_i | x_1, x_2, ..., x_{i-1}, x_{i+1}, ..., x_N) is the probability of the i-th token conditioned on all the other tokens in the sequence.
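The joint-sequence masking above can be sketched as follows. This is a deliberate simplification using whole-word tokens and string SOUNDEX codes; a real setup would operate on the model's WordPiece vocabulary and tensor inputs:

```python
import random

def build_smlm_input(tokens, soundex_codes, mask_prob=0.15, seed=0):
    """Sketch of the SMLM joint sequence: word tokens, then [SEP], then the
    SOUNDEX codes, with roughly mask_prob of the positions replaced by [MASK].
    Returns the masked sequence and, per position, the original token the
    model must predict (None for unmasked positions, which are not scored)."""
    joint = tokens + ["[SEP]"] + soundex_codes
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in joint:
        if tok != "[SEP]" and rng.random() < mask_prob:
            masked.append("[MASK]")
            labels.append(tok)      # prediction target at this position
        else:
            masked.append(tok)
            labels.append(None)     # position not scored in the MLM loss
    return masked, labels
```

A masked position in the word half of the sequence can attend to its SOUNDEX code in the second half, and vice versa, which is exactly the cross-signal the pre-training step relies on.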
SOUNDEX Aligned Masked Language Modelling (SAMLM): Although SMLM incorporates auditory properties along with the semantic characteristics of words, the two might not always align, because the text sequence S and the auditory sequence A are appended one after the other. Within each sequence, t_i and s_i can be split into multiple tokens during WordPiece tokenization, making this alignment even more difficult. For better alignment between word and SOUNDEX tokens, we propose SOUNDEX Aligned Masked Language Modelling (SAMLM). In this method, instead of appending one sequence after the other, we build a new input sequence by interleaving the two: IP_1 = {t_1, s_1, t_2, s_2, ..., t_n, s_n}. This input sequence takes care of the alignment of auditory tokens with the word tokens, which ensures more robustness in the model against adversarial attacks and natural spelling variations in code-mixed text. In addition, SAMLM's semantic alignment can address the limitation of SOUNDEX more effectively.
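A minimal sketch of the interleaving, assuming one SOUNDEX code per whole-word token (real WordPiece tokenization would split both sides further):

```python
def build_samlm_input(tokens, soundex_codes):
    """Interleave each word token with its SOUNDEX code, so the auditory
    code sits directly next to the word it describes (SAMLM input order)."""
    assert len(tokens) == len(soundex_codes), "one code per token"
    interleaved = []
    for t, s in zip(tokens, soundex_codes):
        interleaved.extend([t, s])
    return interleaved

print(build_samlm_input(["acha", "movie"], ["A200", "M100"]))
# ['acha', 'A200', 'movie', 'M100']
```

Compared with appending the whole SOUNDEX sequence after a [SEP], this ordering keeps each word/code pair adjacent, which is what the alignment argument above relies on.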

Fine-tuning
Once the model (BERT or RoBERTa) is pretrained using our proposed approaches (SMLM and SAMLM), the model is fine-tuned for the downstream classification tasks.
For models trained with the SMLM approach, we create the input sequence IP_smlm = {[CLS], t_1, t_2, ..., t_n, [SEP], s_1, s_2, ..., s_n}. Similarly, for models trained with the SAMLM approach, the input sequence is IP_samlm = {[CLS], t_1, s_1, t_2, s_2, ..., t_n, s_n}. This input sequence is passed to the model, and the [CLS] representation from the pre-final layer of the model is fed into an output layer for the classification tasks.

Model Explainability
The model explainability component is introduced to understand how auditory features help the model improve its robustness and accuracy. We use the Shapley algorithm to determine the relevance of each word in a given sentence with respect to the target model (BERT or RoBERTa). It calculates a relevance score (known as the Shapley value) for each word based on possible coalitions of words for a particular prediction (Lundberg and Lee, 2017).
We create an explicit word masker that tokenizes the sentence into word fragments, which is then used to mask words in SHAP (here, masking refers to hiding a particular word from the sentence). The input sentence, along with the designed masker, is passed to SHAP, which generates various masked combinations of the sentence. These masked sentence fragments are then passed to the model tokenizer. We further concatenate the SOUNDEX encodings to the masked combinations for better prediction scores, as shown in Figure 1. This concatenation helps Shapley compute the relevance scores of words based on both semantic and auditory features. Both model tokenizers (BERT and RoBERTa) convert the words to subwords, generate input, segment, and mask embeddings for each subword unit, and produce the final representation by summing all three embeddings (Devlin et al., 2018). Finally, this combined representation for each masked version is passed to the target model to obtain the output probabilities, which are in turn returned to SHAP.
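For intuition, the exact Shapley computation that SHAP approximates can be written directly for short sentences. The `score` function is a stand-in for the model's output probability on a coalition of visible words; a toy lexicon scorer is used below, not the actual classifier:

```python
from itertools import combinations
from math import factorial

def shapley_values(words, score):
    """Exact Shapley value of each word: its weighted average marginal
    contribution to `score` over all coalitions of the remaining words.
    SHAP approximates this sum; exact enumeration is exponential in the
    number of words. Assumes the words in the list are distinct."""
    n = len(words)
    values = []
    for w in words:
        others = [x for x in words if x != w]
        phi = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_w = score(set(coalition) | {w})
                without_w = score(set(coalition))
                phi += weight * (with_w - without_w)
        values.append(phi)
    return values
```

With a scorer that fires only on the word "flop", the entire relevance mass lands on that word, mirroring how the red/blue SHAP highlights attribute a prediction to individual tokens.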

Experimental Setup and Results
We use BERT-base and RoBERTa-base as target models for each task. To assess our proposed approach, we conduct extensive experiments on code-mixed Hinglish and Benglish datasets. For Hinglish, we conduct experiments on two benchmark datasets related to offensive content (Mathur et al., 2018) and sentiment analysis (Joshi et al., 2016). For Benglish, we conduct experiments on aggression analysis data (Bhattacharya et al., 2020). For the pre-training task, we use a total of 33,014 Hinglish sentences and 6,149 Benglish sentences.

Baselines
Vanilla classifiers (VC): We fine-tune the vanilla BERT and RoBERTa (henceforth referred to as VCBERT and VCRoBERTa) pre-trained models on the downstream tasks with only the word sequences as input.
Vanilla Masked Language Modelling pre-trained classifiers (VMLM): We pre-train BERT and RoBERTa on real code-mixed Hinglish/Benglish datasets (we henceforth refer to these pre-trained models as VMLMBERT and VMLMRoBERTa). After pre-training, each model is fine-tuned on its respective language datasets for the downstream tasks. During fine-tuning, only word sequences are considered as input.
PhoneMLM classifiers: We pre-train BERT and RoBERTa on word and phoneme sequences. Phoneme sequences are appended at the end of the word sequence, separated by the '[SEP]' token, following Sundararaman et al. (2021). Next, each model is fine-tuned on the downstream task using both word and phoneme sequences as input. Phoneme sequences are generated using the Phonemizer tool.
SMLM Classifiers: BERT and RoBERTa models are pre-trained on words and the corresponding SOUNDEX vectors.Each model is then fine-tuned on the downstream classification task.
SAMLM Classifiers: BERT and RoBERTa models are pre-trained on words and the corresponding SOUNDEX vectors using the SAMLM strategy.Each model is then fine-tuned on the downstream classification task.

Experimental Results
We define the following two setups for the evaluation of our proposed approaches: (i) robustness evaluation on adversarial test sets; (ii) performance evaluation on the original test sets. We use accuracy and F1 scores to evaluate the performance on the original test sets. For adversarial robustness evaluation, we use the following metrics:
• Before-attack accuracy (BA) and after-attack accuracy (AA): BA is calculated on the original test sets and AA is calculated on the adversarial test sets.
• Before-attack F1 (BF1) and after-attack F1 (AF1): BF1 is calculated on the original test sets and AF1 on the adversarial test sets.
• Perturbation ratio (PR): The ratio of the number of perturbed words in a sentence to the total number of words in the sentence.
• Percentage drop in accuracy (PDA): PDA is calculated as (BA − AA) / BA.
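As a small worked example of the last two metrics (with hypothetical numbers, not results from the paper):

```python
def percentage_drop_in_accuracy(ba, aa):
    """PDA: relative accuracy drop caused by the attack, (BA - AA) / BA."""
    return (ba - aa) / ba

def perturbation_ratio(original_tokens, adversarial_tokens):
    """PR: fraction of word positions changed by the attack."""
    changed = sum(o != a for o, a in zip(original_tokens, adversarial_tokens))
    return changed / len(original_tokens)

# An attack that drops accuracy from 80 (BA) to 60 (AA) gives PDA = 0.25;
# perturbing 1 of 4 words gives PR = 0.25.
print(percentage_drop_in_accuracy(80, 60))            # 0.25
print(perturbation_ratio("ye movie flop hai".split(),
                         "ye moovee flop hai".split()))   # 0.25
```

Note that PDA is a relative drop: the same 20-point absolute loss counts as a larger PDA for a weaker starting model.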

Evaluation on Adversarial Test Sets
We calculate the AA and AF1 which correspond to the accuracy and F1 scores calculated on the adversarial test sets.
Generating Adversarial Attack Samples: To test the effectiveness of our proposed approach in improving the adversarial robustness of pre-trained models, we execute the black-box attack following Das et al. (2022) on the BERT and RoBERTa models. The attack is performed using sub-word perturbations. It makes use of a pre-existing dictionary of character groups (unigrams, bigrams, and trigrams) that can be replaced by phonetically similar character groups. To apply these perturbations, we first identify the important tokens using the leave-one-out method, and then replace the important tokens with the corresponding other character groups from the dictionary. These steps are repeated until the attack is successful. The Bengali words in our Benglish dataset consist of a mix of Romanized and Bengali-script words. Since no dictionary of Bengali-script character groups is available for adversarial attacks, we could not perform attacks on the Bengali code-mixed dataset.
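The leave-one-out importance step can be sketched as follows; `predict_proba` stands in for the victim classifier's probability for the true class (a toy scorer is used in the example, not a trained model):

```python
def token_importance(tokens, predict_proba, label):
    """Leave-one-out importance: how much the model's confidence in `label`
    drops when each token is removed. Returns token indices sorted from most
    to least important, i.e. the order in which an attacker would perturb."""
    base = predict_proba(tokens, label)
    scores = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]   # sentence with token i hidden
        scores.append(base - predict_proba(reduced, label))
    return sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
```

This is also why leave-one-out requires one model query per token: each importance score is a separate forward pass on the reduced sentence.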
Evaluation Results: We define the following two setups: (i) generate attack samples by attacking the VCBERT and VCRoBERTa models and evaluate the performance of all the other models on them; (ii) attack individual models by generating model-specific adversarial samples. It is interesting to note that even though PhoneMLM is better than the original pre-trained models, it is not always better than VMLM, where the model is simply pre-trained on the code-mixed dataset (cf. Table 1, Offensive task). In contrast, both our proposed pre-training steps, SMLM and SAMLM, prove to be more robust than all the other baselines on all the tasks across the two code-mixed languages. Setup 2 also demonstrates that our proposed SMLM and SAMLM are more resistant to adversarial attacks than all the other models, as illustrated by AA and AF1. These results establish that leveraging SOUNDEX encoding increases the robustness of the BERT and RoBERTa models against adversarial attacks. We observe that the gains in AA are smaller in setup 1 than in setup 2 for our proposed approaches (except for the offensive task). This is because, in setup 1, the attack is executed on the VC models according to the token importance of the VC models. In some cases, a VC model may focus on different tokens than the other models (in neutral instances), in which case the perturbations do not greatly affect the output of the other models.

Performance Evaluation on the Original Test Sets
We evaluate the effectiveness of our proposed approaches on the original test sets of the Hinglish and Benglish languages. Results for Hinglish are presented in Table 2 (BA and BF1). Our proposed pre-training approaches (SMLM and SAMLM) result in improvements on the classification tasks across the two code-mixed languages. For the Hinglish sentiment and offensive classification tasks, the SAMLM pre-trained BERT model gives the best scores. Interestingly, on the Benglish aggression classification task (cf. Table 3), our SMLM pre-training results in better classification. This may be because our Benglish code-mixed data contains Bengali-script words along with Romanized Bengali words, and the SOUNDEX algorithm is unable to produce sound encodings for such words. Since SAMLM interleaves words and sound encodings, sound encodings are randomly missing from the sequence, which negatively affects the alignment. SMLM, on the other hand, is not severely affected by missing sound encodings, as it does not explicitly align the word and sound encoding sequences. A paired t-test validates that the performance gain over the baselines is significant with 95% confidence (p-value < 0.05).
We observe that the gains in BA and BF1 are incremental, whereas the gains in AA and AF1 are larger. This is because BA and BF1 are calculated on the original test sets, which contain only a small number of spelling variations, so the gain is incremental. The adversarial test sets, in contrast, contain many more spelling variations. The larger gains in AA and AF1 illustrate that our proposed approaches can handle these phonetic perturbations better than the other baselines. More experiments are presented in Appendix B.

Qualitative Analysis
This section analyzes the actual decision process of the proposed framework for the classification tasks by extracting the terms responsible for predicting the final output class. We explain the behaviour of the different BERT-base models on the Hinglish sentiment dataset.

Explaining Adversarial Robustness
In this section, we explain how the auditory features help the model improve its robustness.
Figure 2 shows an example from the Hinglish sentiment dataset where the predictions of all the other models are affected by the adversarial attack, but our model is robust to it. Tokens in red signify the terms responsible for the final label prediction (positive SHAP scores). In contrast, words in blue negatively influence the final prediction (negative SHAP scores). A more intense colour signifies a greater influence of the term on the final prediction. In Figure 2, the actual label of the sentence is negative. Applying adversarial perturbation to the original example results in a successful attack against the VCBERT, VMLM, and PhoneMLM models. However, SOUNDEX encoding helps SMLM and SAMLM defend against this adversarial attack. Figure 2 reveals that, in the original example, the words musalman (muslim), bad (after), movie, flop, etc., contribute positively toward the negative sentiment prediction, while the words bhai (brother), frnz (friends), etc., contribute negatively. The adversarial attack on the VCBERT model, applying a perturbation to the word movie (moovee), shifts the focus from the positively contributing words to other words, resulting in a misclassification to the neutral class. On the other hand, SOUNDEX encoding helps the model resist the adversarial attack by assigning the same SOUNDEX encoding (M100) to movie and moovee. The identical SOUNDEX encoding forces the model to treat both spelling variations equally. This shows that our proposed SMLM and SAMLM are more robust to such adversarial attacks.

Explaining Text Classification
We discuss how the addition of auditory features and our pre-training mechanisms helps the classifiers improve their performance. We show a few examples where (i) the VCBERT model misclassifies, but all the other models produce the correct classes (Figure 3); and (ii) VCBERT, PhoneMLM BERT, and VMLMBERT misclassify, but our proposed models classify correctly (Figure 4). In Figure 3, although VCBERT focuses on the word thuuuuu (spit), it is not able to interpret it due to the repetition of the character u. VCBERT misclassifies because its focus falls on the other words film and trailer. In contrast, all the other models are able to capture these spelling variations and classify correctly. Figure 4 illustrates the case where all the other models misclassify but our proposed SMLM and SAMLM classify correctly. Although the focus of the VCBERT, VMLMBERT, and PhoneMLM BERT models is on nyc (a spelling variation of nice), these models are not able to identify it, and as a result the example is misclassified into the neutral class. Our proposed approaches, SMLM and SAMLM, classify correctly by assigning the same encoding (N200) to nice and nyc. This illustrates the effectiveness of SOUNDEX encoding and of our proposed SMLM and SAMLM pre-training in capturing spelling variations more effectively than the baselines. A more detailed analysis is presented in Appendix C.

Error Analysis
To explain the limitations of our proposed framework, we show samples misclassified by the SMLM and SAMLM models in Table 4. The samples are taken from the Hinglish sentiment classification task (BERT-based models). In example 1, the word wait is written as W8. Here the SOUNDEX algorithm encodes it as W000 (numbers are not captured by the SOUNDEX algorithm). Hence, both models randomly predict positive sentiment. Example 2 has implicit positive sentiment, which both the SMLM and SAMLM models are unable to understand, resulting in misclassifications. The VCBERT, VMLMBERT, and PhoneMLMBERT models also misclassify such samples. In example 3, the SMLM and SAMLM models predict positive sentiment because both models focus on the words bhai (brother) and trust (as revealed by SHAP). The presence of these words confuses both models, which is the reason for the misclassification.
Figure 2: Qualitative analysis of adversarial attack samples on the different BERT models.

Conclusion
In this paper, we propose two novel pre-training steps, SMLM (SOUNDEX Masked Language Modelling) and SAMLM (SOUNDEX Aligned Masked Language Modelling), to incorporate auditory phonetic (AP) features into the popular classification models BERT and RoBERTa. Our approach effectively handles spelling perturbations, a common form of attack in code-mixed languages like Hinglish and Benglish. We perform phonetic-based attacks on models trained using our technique and find that the performance decrease is significantly smaller than for multiple baselines. Additionally, incorporating the AP features leads to improved classification scores on different tasks in both Hinglish and Benglish compared to models trained only on semantic features. In summary, the novel pre-training steps of SMLM and SAMLM provide an effective way to incorporate AP features into NLP models, leading to improved robustness and performance on code-mixed text classification tasks.
In future work, we plan to extend our approach to other code-mixed languages and evaluate its performance on more NLP tasks.We believe that our approach can have a significant impact on the robustness of NLP models, especially in the context of code-mixed languages.

Limitations
This study, like most studies, has some limitations that could be addressed in future research. Our approach does not fix the issue of implicit sentiment in sentences, which is also present in the corresponding baseline models. SOUNDEX does not encode numeric digits, resulting in the same representation for different words containing such digits; for such words, our approach would not give any boost in performance over the baselines. We have discussed such examples in Section 6.3. In addition, our proposed approach cannot handle code-mixed languages written in their original script. These limitations could be addressed in the future by augmenting more data of an implicit nature in a semi-supervised way and through better encoding of auditory features.
The traditional approach to building robust classifiers involves finding the importance of every word in a sentence and then applying perturbations to the important words until the attack is successful. Suppose there are n words in the sentence; this approach then requires n queries to the trained model to calculate the importance of each word, and further queries to generate the adversarial samples. In the worst case, n operations are required on the actual example (perturbing each word of the example to execute a successful attack), which again requires n queries to the trained model. This process is computationally expensive, requiring on the order of M × N × N computations (where M is the number of instances in the training step and N is the average token length of each instance). The existing model is then further fine-tuned on these adversarial samples to make it robust. Our approach avoids this computationally expensive process by introducing a small pre-training step, as discussed in Section 4. Both our approaches, SMLM and SAMLM, do not require pre-training a model from scratch; they only require a small pre-training step (before the final fine-tuning) on the existing pre-trained language models, utilizing very few instances (33,014 Hinglish and 6,149 Benglish in our case). As a result, with SMLM and SAMLM, our approach does not require re-training (adversarial training) of the classifier on adversarial samples.

A.2 Datasets
To assess our proposed approach, we conduct extensive experiments on code-mixed Hinglish and Benglish language datasets. For Hinglish, we conduct experiments on two benchmark datasets related to offensive and sentiment analysis. For Benglish, we conduct experiments on aggression analysis data. Details of the datasets are described below:
Hinglish Sentiment Analysis Dataset (Joshi et al., 2016): This dataset contains posts from some public Facebook pages popular in India. The dataset is annotated with three sentiment classes, viz., positive, negative, and neutral. It contains a total of 3,879 instances.
Hinglish Offensive Tweet (HOT) Dataset (Mathur et al., 2018): The HOT dataset contains tweets crawled using the Twitter Streaming API by selecting tweets having more than three Hinglish words. It is manually annotated with three classes, viz., non-offensive, abusive, and hate-inducing. This dataset contains a total of 3,189 tweets.
Benglish Aggression Analysis Dataset (Bhattacharya et al., 2020): This dataset is collected from YouTube comments and contains comments written in both Bengali and Roman scripts. It contains 5,971 comments, annotated with three classes of aggression, viz., overtly aggressive, covertly aggressive, and non-aggressive.
All the datasets are divided into three splits: train, validation, and test. The detailed statistics of all the dataset splits are shown in Table 5.

Pre-training Datasets:
• Hinglish: We pre-train the models on a total of 33,014 Hinglish sentences. We utilize the publicly available code-mixed datasets from

B More Experiments
To demonstrate the effectiveness of passing SOUNDEX vectors along with the textual content, we perform experiments for setup 2 (defined in Section 5.2), which involves performance evaluation on the original test sets. Since the vanilla pre-trained BERT and RoBERTa models do not incorporate any SOUNDEX information, fine-tuning these models with only SOUNDEX vectors would be unfair. Therefore, we experiment with SMLM and SAMLM, which have a SOUNDEX component in pre-training. We pass only the SOUNDEX vectors to both models during task fine-tuning for the Hinglish and Benglish languages. Evaluation results for the Hinglish and Benglish tasks are shown in Table 6. We observe that using only SOUNDEX vectors performs worse than our proposed approach, where we pass the SOUNDEX vector along with the semantic features. In this case, SOUNDEX assigns the same encoding vector (Y600) to the Hindi word yar (friend) and the English word year. Such cases add to the model's confusion, which could be the possible reason for the inferior performance. In our proposed approach, this limitation of SOUNDEX is handled by providing the word tokens along with the SOUNDEX tokens at the input.
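The collision discussed above can be reproduced with a simplified Soundex sketch (this implementation omits the standard h/w separator rule, and the function is ours for illustration, not part of the released code):

```python
def soundex(word: str) -> str:
    """Simplified American Soundex: keep the first letter, map remaining
    consonants to digit classes, drop vowels (which reset the previous
    code), skip repeated adjacent codes, pad/truncate to four characters."""
    codes = {}
    for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")]:
        for ch in letters:
            codes[ch] = digit
    word = word.lower()
    result = word[0].upper()
    prev = codes.get(word[0], "")
    for ch in word[1:]:
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit
    return (result + "000")[:4]

# The Hindi word "yar" and the English word "year" collide on Y600, and
# the spelling variants "mai" / "mee" (Section C.1) share M000.
print(soundex("yar"), soundex("year"))   # Y600 Y600
print(soundex("mai"), soundex("mee"))    # M000 M000
```

Because the two words receive identical codes, passing the word tokens together with their SOUNDEX tokens lets the model keep the distinguishing lexical signal while still seeing the shared auditory one.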

B.1 Evaluating Multilingual Models
We also perform experiments to assess the robustness of multilingual models. We perform experiments with multilingual BERT (mBERT) and IndicBERT for sentiment classification on the Hinglish language. mBERT and IndicBERT obtain accuracies of 65.31% and 49%, respectively, on the Hinglish sentiment task. We further perform detailed experiments to assess the robustness of mBERT and IndicBERT against adversarial attacks. The adversarial attack is performed using subword perturbations as described in Section 5.2.1.

B.1.1 Evaluation Results on Adversarial Test Sets
We define two setups (similar to Section 5.2.1): (i) generate attack samples by attacking vanilla mBERT and VCIndicBERT (vanilla IndicBERT) and evaluate the performance of all other models; (ii) attack individual models by generating different adversarial samples for each model. Results for setup 1 and setup 2 are depicted in Tables 7 and 8, respectively. Similar phenomena have been observed in the case of the mBERT and IndicBERT multilingual models, mirroring the observations made for the BERT-base and RoBERTa-base models. We observe that the mBERT and IndicBERT models are also vulnerable to these adversarial attacks.

In general, this approach can be applied to any language where the Romanization of a native script leads to spelling variations. Hindi and Bengali, when written in Romanized code-mixed form, produce many such spelling variations. Similarly, Punjabi belongs to the same language family, and the Romanized code-mixed form of Punjabi also induces spelling variations in the data. We perform additional experiments with Punjabi-English code-mixed language to demonstrate the generalizability of our proposed approach. We use the publicly available dataset for the sentiment task to evaluate our model on robustness and accuracy metrics (explained in Section 3) (Yadav et al., 2020). Experimental results for Punjabi-English corresponding to setup 2 are presented in Table 9. We observe that auditory features help the Punjabi-English language pair to improve robustness and accuracy, similar to the other language pairs.

C Qualitative Analysis

C.1 Explaining Adversarial Robustness
In this section, we explain how the auditory features help the model in improving its robustness.
Figure 5 shows examples from the Hinglish sentiment dataset where the predictions of the VC models are affected by the adversarial attack. In example 1 (Figure 5), replacing mai (I) with mee causes the vanilla BERT model to misclassify, whereas all other models are robust. Figure 5 explains the decision process of all the models. Tokens in red signify the terms responsible for the final label prediction (positive SHAP scores), whereas the words in blue negatively influence the final prediction (negative SHAP scores). A more intense colour signifies a greater influence of the term on the final prediction. Figure 5 reveals that, for predicting the neutral sentiment of actual example 1, original BERT focuses more on the words mai (I) and phle (before), while the words Mumbai, bhut (very) and saka (did) make a negative impact on the neutral class classification. Changing the word mai to mee (adversarial example) shifts the focus of original BERT to other words like Mumbai, bhut (very), aap (you), etc. This shift of focus to negatively contributing words increases the confusion of the BERT model, which is the reason for the misclassification. However, MLM, PhoneMLM, SMLM and SAMLM help the BERT model keep its focus on positively contributing words. Here, SMLM and SAMLM assign the same encoding vector (M000) to mai and mee, which helps the model defend against the adversarial attack.
Figure 6 shows an example of a case where the adversary executes a successful attack against all the models except SMLM and SAMLM. Here, the main focus of the VCBERT model is on owesome (a variation of awesome) and look. After changing the word look to looook, the focus of VCBERT and VMLMBERT shifts to bhaijan (brother), which results in misclassification to the neutral class. In the case of PhoneMLM, the model's focus is on owsome and bhaijan (light red). However, the word looook

Figure 1 :
Figure 1: Process flow of our proposed methodology. The pre-training of BERT or RoBERTa is done using SMLM or SAMLM, which is then fine-tuned on the downstream classification task.
Figure 3: Qualitative analysis for Hinglish text classification using BERT based models

Figure 5 :
Figure 5: Qualitative Analysis for adversarial attack samples on the different BERT models.

Table 1 :
Results of adversarial attack for setup 1

(ii) attack individual models by generating different adversarial samples for each model. Results for setup 1 for the sentiment and offensive tasks are depicted in Table 1. We observe that VCBERT and VCRoBERTa are less robust to the phonetic perturbation-based adversarial attack, resulting in a large drop in accuracy and F1 scores. In contrast, VLMBERT and VLMRoBERTa are found to be more robust against adversarial attacks. This is expected, since VCBERT and VCRoBERTa are not pre-trained on code-mixed datasets. It is also observed that PhoneMLM-based pre-training (of both BERT and RoBERTa) is more robust to these adversarial attacks than the original pre-training.

Table 2 :
Results of adversarial attack for setup 2. Here, PR: perturbation ratio (higher the better), PDA: percentage drop in accuracy (lower the better)

Table 3 :
Results on the original test set (non-adversarial) for Benglish

Table 5 :
Data Statistics

Table 6 :
Results on the original (non-adversarial) test set for Hinglish and Benglish for BERT-based models

Table 8 :
Results of adversarial attack (generating adversarial samples against each individual model). Here, PR: perturbation ratio (higher the better), PDA: percentage drop in accuracy (lower the better)