2024
STAGE: Simplified Text-Attributed Graph Embeddings using Pre-trained LLMs
Aaron Zolnai-Lucas | Jack Boylan | Chris Hokamp | Parsa Ghaffari
Proceedings of the 1st Workshop on Knowledge Graphs and Large Language Models (KaLLM 2024)
We present STAGE, a straightforward yet effective method for enhancing node features in Graph Neural Network (GNN) models that encode Text-Attributed Graphs (TAGs). Our approach leverages Large Language Models (LLMs) to generate embeddings for textual attributes. STAGE achieves competitive results on various node classification benchmarks while remaining simpler to implement than current state-of-the-art (SoTA) techniques. We show that utilizing pre-trained LLMs as embedding generators provides robust features for ensemble GNN training, enabling pipelines that are simpler than current SoTA approaches, which require multiple expensive training and prompting stages. We also implement diffusion-pattern GNNs in an effort to make this pipeline scalable to graphs beyond academic benchmarks.
2023
News Signals: An NLP Library for Text and Time Series
Chris Hokamp | Demian Ghalandari | Parsa Ghaffari
Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)
We present an open-source Python library for building and using datasets where inputs are clusters of textual data, and outputs are sequences of real values representing one or more time series signals. The news-signals library supports diverse data science and NLP problem settings related to the prediction of time series behaviour using textual data feeds. For example, in the news domain, inputs are document clusters corresponding to daily news articles about a particular entity, and targets are explicitly associated real-valued time series: the volume of news about a particular person or company, or the number of pageviews of specific Wikimedia pages. Despite many industry and research use cases for this class of problem settings, to the best of our knowledge, News Signals is the only open-source library designed specifically to facilitate data science and research settings with natural language inputs and time series targets. In addition to the core codebase for building and interacting with datasets, we also conduct a suite of experiments using several popular Machine Learning libraries to establish baselines for time series anomaly prediction using textual inputs.
2018
360° Stance Detection
Sebastian Ruder | John Glover | Afshin Mehrabani | Parsa Ghaffari
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
The proliferation of fake news and filter bubbles makes it increasingly difficult to form an unbiased, balanced opinion towards a topic. To ameliorate this, we propose 360° Stance Detection, a tool that aggregates news with multiple perspectives on a topic. It presents them on a spectrum ranging from support to opposition, enabling the user to base their opinion on multiple pieces of diverse evidence.
2016
INSIGHT-1 at SemEval-2016 Task 4: Convolutional Neural Networks for Sentiment Classification and Quantification
Sebastian Ruder | Parsa Ghaffari | John G. Breslin
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)
INSIGHT-1 at SemEval-2016 Task 5: Deep Learning for Multilingual Aspect-based Sentiment Analysis
Sebastian Ruder | Parsa Ghaffari | John G. Breslin
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)
A Hierarchical Model of Reviews for Aspect-based Sentiment Analysis
Sebastian Ruder | Parsa Ghaffari | John G. Breslin
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
Towards a continuous modeling of natural language domains
Sebastian Ruder | Parsa Ghaffari | John G. Breslin
Proceedings of the Workshop on Uphill Battles in Language Processing: Scaling Early Achievements to Robust Methods