SpanPredict: Extraction of Predictive Document Spans with Neural Attention
Vivek Subramanian | Matthew Engelhard | Sam Berchuck | Liqun Chen | Ricardo Henao | Lawrence Carin
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
In many natural language processing applications, identifying predictive text can be as important as the predictions themselves. When predicting medical diagnoses, for example, identifying predictive content in clinical notes not only enhances interpretability, but also allows unknown, descriptive (i.e., text-based) risk factors to be identified. We here formalize this problem as predictive extraction and address it using a simple mechanism based on linear attention. Our method preserves differentiability, allowing scalable inference via stochastic gradient descent. Further, the model decomposes predictions into a sum of contributions of distinct text spans. Importantly, we require only document labels, not ground-truth spans. Results show that our model identifies semantically-cohesive spans and assigns them scores that agree with human ratings, while preserving classification performance.
Methods for Numeracy-Preserving Word Embeddings
Dhanasekar Sundararaman | Shijing Si | Vivek Subramanian | Guoyin Wang | Devamanyu Hazarika | Lawrence Carin
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Word embedding models are typically able to capture the semantics of words via the distributional hypothesis, but fail to capture the numerical properties of numbers that appear in the text. This leads to problems with numerical reasoning involving tasks such as question answering. We propose a new methodology to assign and learn embeddings for numbers. Our approach creates Deterministic, Independent-of-Corpus Embeddings (the model is referred to as DICE) for numbers, such that their cosine similarity reflects the actual distance on the number line. DICE outperforms a wide range of pre-trained word embedding models across multiple examples of two tasks: (i) evaluating the ability to capture numeration and magnitude; and (ii) to perform list maximum, decoding, and addition. We further explore the utility of these embeddings in downstream tasks, by initializing numbers with our approach for the task of magnitude prediction. We also introduce a regularization approach to learn model-based embeddings of numbers in a contextual setting.
- Lawrence Carin 2
- Dhanasekar Sundararaman 1
- Shijing Si 1
- Guoyin Wang 1
- Devamanyu Hazarika 1
- show all...