Xiaobing Sun


Unraveling Feature Extraction Mechanisms in Neural Networks
Xiaobing Sun | Jiaxi Li | Wei Lu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

The underlying mechanism of neural networks in capturing precise knowledge has been the subject of consistent research efforts. In this work, we propose a theoretical approach based on Neural Tangent Kernels (NTKs) to investigate such mechanisms. Specifically, considering the infinite network width, we hypothesize the learning dynamics of target models may intuitively unravel the features they acquire from training data, deepening our insights into their internal mechanisms. We apply our approach to several fundamental models and reveal how these models leverage statistical features during gradient descent and how they are integrated into final decisions. We also discovered that the choice of activation function can affect feature extraction. For instance, the use of the ReLU activation function could potentially introduce a bias in features, providing a plausible explanation for its replacement with alternative functions in recent pre-trained language models. Additionally, we find that while self-attention and CNN models may exhibit limitations in learning n-grams, multiplication-based models seem to excel in this area. We verify these theoretical findings through experiments and find that they can be applied to analyze language modeling tasks, which can be regarded as a special variant of classification. Our work may offer insights into the roles and capacities of fundamental modules within deep neural networks including large language models.


Implicit n-grams Induced by Recurrence
Xiaobing Sun | Wei Lu
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Although self-attention based models such as Transformers have achieved remarkable successes on natural language processing (NLP) tasks, recent studies reveal that they have limitations on modeling sequential transformations (Hahn, 2020), which may prompt re-examinations of recurrent neural networks (RNNs) that demonstrated impressive results on handling sequential data. Despite many prior attempts to interpret RNNs, their internal mechanisms have not been fully understood, and the question on how exactly they capture sequential features remains largely unclear. In this work, we present a study that shows there actually exist some explainable components that reside within the hidden states, which are reminiscent of the classical n-grams features. We evaluated such extracted explainable features from trained RNNs on downstream sentiment analysis tasks and found they could be used to model interesting linguistic phenomena such as negation and intensification. Furthermore, we examined the efficacy of using such n-gram components alone as encoders on tasks such as sentiment analysis and language modeling, revealing they could be playing important roles in contributing to the overall performance of RNNs. We hope our findings could add interpretability to RNN architectures, and also provide inspirations for proposing new architectures for sequential data.


Understanding Attention for Text Classification
Xiaobing Sun | Wei Lu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Attention has been proven successful in many natural language processing (NLP) tasks. Recently, many researchers started to investigate the interpretability of attention on NLP tasks. Many existing approaches focused on examining whether the local attention weights could reflect the importance of input representations. In this work, we present a study on understanding the internal mechanism of attention by looking into the gradient update process, checking its behavior when approaching a local minimum during training. We propose to analyze for each word token the following two quantities: its polarity score and its attention score, where the latter is a global assessment on the token’s significance. We discuss conditions under which the attention mechanism may become more (or less) interpretable, and show how the interplay between the two quantities can contribute towards model performance.