Hadi Amiri

2023

pdf bib abs
Ling-CL: Understanding NLP Models through Linguistic Curricula
Mohamed Elgaar | Hadi Amiri
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We employ a characterization of linguistic complexity from psycholinguistic and language acquisition research to develop data-driven curricula to understand the underlying linguistic knowledge that models learn to address NLP tasks. The novelty of our approach is in the development of linguistic curricula derived from data, existing knowledge about linguistic complexity, and model behavior during training. Through the evaluation of several benchmark NLP datasets, our curriculum learning approaches identify sets of linguistic metrics (indices) that inform the challenges and reasoning required to address each task. Our work will inform future research in all NLP areas, allowing linguistic complexity to be considered early in the research and development process. In addition, our work prompts an examination of gold standards and fair evaluation in NLP.

pdf bib abs
Complexity-Guided Curriculum Learning for Text Graphs
Nidhi Vakil | Hadi Amiri
Findings of the Association for Computational Linguistics: EMNLP 2023

Curriculum learning provides a systematic approach to training. It refines training progressively, tailors training to task requirements, and improves generalization through exposure to diverse examples. We present a curriculum learning approach that builds on existing knowledge about text and graph complexity formalisms for training with text graph data. The core part of our approach is a novel data scheduler, which employs “spaced repetition” and complexity formalisms to guide the training process. We demonstrate the effectiveness of the proposed approach on several text graph tasks and graph neural network architectures. The proposed model gains more and uses less data; consistently prefers text over graph complexity indices throughout training, while the best curricula derived from text and graph complexity indices are equally effective; and it learns transferable curricula across GNN models and datasets. In addition, we find that both node-level (local) and graph-level (global) graph complexity indices, as well as shallow and traditional text complexity indices play a crucial role in effective curriculum learning.

pdf bib abs
HuCurl: Human-induced Curriculum Discovery
Mohamed Elgaar | Hadi Amiri
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We introduce the problem of curriculum discovery and describe a curriculum learning framework capable of discovering effective curricula in a curriculum space based on prior knowledge about sample difficulty. Using annotation entropy and loss as measures of difficulty, we show that (i): the top-performing discovered curricula for a given model and dataset are often non-monotonic as apposed to monotonic curricula in existing literature, (ii): the prevailing easy-to-hard or hard-to-easy transition curricula are often at the risk of underperforming, and (iii): the curricula discovered for smaller datasets and models perform well on larger datasets and models respectively. The proposed framework encompasses some of the existing curriculum learning approaches and can discover curricula that outperform them across several NLP tasks.

pdf bib abs
Curriculum Learning for Graph Neural Networks: A Multiview Competence-based Approach
Nidhi Vakil | Hadi Amiri
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

A curriculum is a planned sequence of learning materials and an effective one can make learning efficient and effective for both humans and machines. Recent studies developed effective data-driven curriculum learning approaches for training graph neural networks in language applications. However, existing curriculum learning approaches often employ a single criterion of difficulty in their training paradigms. In this paper, we propose a new perspective on curriculum learning by introducing a novel approach that builds on graph complexity formalisms (as difficulty criteria) and model competence during training. The model consists of a scheduling scheme which derives effective curricula by accounting for different views of sample difficulty and model competence during training. The proposed solution advances existing research in curriculum learning for graph neural networks with the ability to incorporate a fine-grained spectrum of graph difficulty criteria in their training paradigms. Experimental results on real-world link prediction and node classification tasks illustrate the effectiveness of the proposed approach.

2022

pdf bib abs
Generic and Trend-aware Curriculum Learning for Relation Extraction
Nidhi Vakil | Hadi Amiri
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We present a generic and trend-aware curriculum learning approach that effectively integrates textual and structural information in text graphs for relation extraction between entities, which we consider as node pairs in graphs. The proposed model extends existing curriculum learning approaches by incorporating sample-level loss trends to better discriminate easier from harder samples and schedule them for training. The model results in a robust estimation of sample difficulty and shows sizable improvement over the state-of-the-art approaches across several datasets.

2021

pdf bib abs
Embedding Time Differences in Context-sensitive Neural Networks for Learning Time to Event
Nazanin Dehghani | Hassan Hajipoor | Hadi Amiri
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

We propose an effective context-sensitive neural model for time to event (TTE) prediction task, which aims to predict the amount of time to/from the occurrence of given events in streaming content. We investigate this problem in the context of a multi-task learning framework, which we enrich with time difference embeddings. In addition, we develop a multi-genre dataset of English events about soccer competitions and academy awards ceremonies, and their relevant tweets obtained from Twitter. Our model is 1.4 and 3.3 hours more accurate than the current state-of-the-art model in estimating TTE on English and Dutch tweets respectively. We examine different aspects of our model to illustrate its source of improvement.

pdf bib abs
Attentive Multiview Text Representation for Differential Diagnosis
Hadi Amiri | Mitra Mohtarami | Isaac Kohane
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

We present a text representation approach that can combine different views (representations) of the same input through effective data fusion and attention strategies for ranking purposes. We apply our model to the problem of differential diagnosis, which aims to find the most probable diseases that match with clinical descriptions of patients, using data from the Undiagnosed Diseases Network. Our model outperforms several ranking approaches (including a commercially-supported system) by effectively prioritizing and combining representations obtained from traditional and recent text representation techniques. We elaborate on several aspects of our model and shed light on its improved performance.

2019

pdf bib abs
Neural Self-Training through Spaced Repetition
Hadi Amiri
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Self-training is a semi-supervised learning approach for utilizing unlabeled data to create better learners. The efficacy of self-training algorithms depends on their data sampling techniques. The majority of current sampling techniques are based on predetermined policies which may not effectively explore the data space or improve model generalizability. In this work, we tackle the above challenges by introducing a new data sampling technique based on spaced repetition that dynamically samples informative and diverse unlabeled instances with respect to individual learner and instance characteristics. The proposed model is specifically effective in the context of neural models which can suffer from overfitting and high-variance gradients when trained with small amount of labeled data. Our model outperforms current semi-supervised learning approaches developed for neural networks on publicly-available datasets.

pdf bib abs
Serial Recall Effects in Neural Language Modeling
Hassan Hajipoor | Hadi Amiri | Maseud Rahgozar | Farhad Oroumchian
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Serial recall experiments study the ability of humans to recall words in the order in which they occurred. The following serial recall effects are generally investigated in studies with humans: word length and frequency, primacy and recency, semantic confusion, repetition, and transposition effects. In this research, we investigate neural language models in the context of these serial recall effects. Our work provides a framework to better understand and analyze neural language models and opens a new window to develop accurate language models.

pdf bib abs
Vector of Locally Aggregated Embeddings for Text Representation
Hadi Amiri | Mitra Mohtarami
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We present Vector of Locally Aggregated Embeddings (VLAE) for effective and, ultimately, lossless representation of textual content. Our model encodes each input text by effectively identifying and integrating the representations of its semantically-relevant parts. The proposed model generates high quality representation of textual content and improves the classification performance of current state-of-the-art deep averaging networks across several text classification tasks.

2018

pdf bib abs
Spotting Spurious Data with Neural Networks
Hadi Amiri | Timothy Miller | Guergana Savova
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Automatic identification of spurious instances (those with potentially wrong labels in datasets) can improve the quality of existing language resources, especially when annotations are obtained through crowdsourcing or automatically generated based on coded rankings. In this paper, we present effective approaches inspired by queueing theory and psychology of learning to automatically identify spurious instances in datasets. Our approaches discriminate instances based on their “difficulty to learn,” determined by a downstream learner. Our methods can be applied to any dataset assuming the existence of a neural network model for the target task of the dataset. Our best approach outperforms competing state-of-the-art baselines and has a MAP of 0.85 and 0.22 in identifying spurious instances in synthetic and carefully-crowdsourced real-world datasets respectively.

pdf bib abs
Self-training improves Recurrent Neural Networks performance for Temporal Relation Extraction
Chen Lin | Timothy Miller | Dmitriy Dligach | Hadi Amiri | Steven Bethard | Guergana Savova
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

Neural network models are oftentimes restricted by limited labeled instances and resort to advanced architectures and features for cutting edge performance. We propose to build a recurrent neural network with multiple semantically heterogeneous embeddings within a self-training framework. Our framework makes use of labeled, unlabeled, and social media data, operates on basic features, and is scalable and generalizable. With this method, we establish the state-of-the-art result for both in- and cross-domain for a clinical temporal relation extraction task.

2017

pdf bib abs
Unsupervised Domain Adaptation for Clinical Negation Detection
Timothy Miller | Steven Bethard | Hadi Amiri | Guergana Savova
BioNLP 2017

Detecting negated concepts in clinical texts is an important part of NLP information extraction systems. However, generalizability of negation systems is lacking, as cross-domain experiments suffer dramatic performance losses. We examine the performance of multiple unsupervised domain adaptation algorithms on clinical negation detection, finding only modest gains that fall well short of in-domain performance.

pdf bib abs
Repeat before Forgetting: Spaced Repetition for Efficient and Effective Training of Neural Networks
Hadi Amiri | Timothy Miller | Guergana Savova
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present a novel approach for training artificial neural networks. Our approach is inspired by broad evidence in psychology that shows human learners can learn efficiently and effectively by increasing intervals of time between subsequent reviews of previously learned materials (spaced repetition). We investigate the analogy between training neural models and findings in psychology about human memory model and develop an efficient and effective algorithm to train neural models. The core part of our algorithm is a cognitively-motivated scheduler according to which training instances and their “reviews” are spaced over time. Our algorithm uses only 34-50% of data per epoch, is 2.9-4.8 times faster than standard training, and outperforms competing state-of-the-art baselines. Our code is available at scholar.harvard.edu/hadi/RbF/.