Shraey Bhatia


2022

Automatic Explanation Generation For Climate Science Claims
Rui Xing | Shraey Bhatia | Timothy Baldwin | Jey Han Lau
Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association

Climate change is an existential threat to humanity, and the proliferation of unsubstantiated claims relating to climate science is manipulating public perception, motivating the need for fact-checking in this domain. In this work, we draw on recent work that uses retrieval-augmented generation for veracity prediction and explanation generation, framing explanation generation as a query-focused multi-document summarization task. We adapt PRIMERA to the climate science domain by adding additional global attention on claims. Through automatic evaluation and qualitative analysis, we demonstrate that our method is effective at generating explanations.
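A minimal sketch of the claim-conditioned generation step, using the public PRIMERA checkpoint on Hugging Face: the claim is prepended to the retrieved evidence documents, and global attention is placed on the claim tokens in addition to PRIMERA's standard <doc-sep> separators. The checkpoint name and <doc-sep> handling follow the public release; the exact claim-attention masking is an assumption based on the abstract, not the paper's implementation.

```python
import torch
from transformers import AutoTokenizer, LEDForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("allenai/PRIMERA")
model = LEDForConditionalGeneration.from_pretrained("allenai/PRIMERA")

claim = "Global warming stopped in 1998."
evidence_docs = ["Retrieved evidence document 1 ...", "Retrieved evidence document 2 ..."]

# PRIMERA joins documents with <doc-sep>; here the claim is prepended as the query.
source = claim + " <doc-sep> " + " <doc-sep> ".join(evidence_docs)
inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=4096)

# Global attention on <s> and <doc-sep> tokens (standard PRIMERA), plus the
# claim tokens (assumed reading of "additional global attention on claims").
global_mask = torch.zeros_like(inputs["input_ids"])
global_mask[:, 0] = 1
doc_sep_id = tokenizer.convert_tokens_to_ids("<doc-sep>")
global_mask[inputs["input_ids"] == doc_sep_id] = 1
claim_len = len(tokenizer(claim)["input_ids"])  # approximate span of the claim
global_mask[:, :claim_len] = 1

explanation_ids = model.generate(
    inputs["input_ids"],
    global_attention_mask=global_mask,
    max_length=256,
    num_beams=4,
)
print(tokenizer.decode(explanation_ids[0], skip_special_tokens=True))
```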

2021

Automatic Classification of Neutralization Techniques in the Narrative of Climate Change Scepticism
Shraey Bhatia | Jey Han Lau | Timothy Baldwin
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Neutralisation techniques, e.g. denial of responsibility and denial of victim, are used in the narrative of climate change scepticism to justify lack of action or to promote an alternative view. We first draw on social science to introduce the problem to the NLP community and present the granularity of the coding schema, then collect manual annotations of neutralisation techniques in text relating to climate change, and experiment with supervised and semi-supervised BERT-based models.
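As an illustration of the supervised variant, the sketch below runs a BERT sequence-classification head over neutralisation-technique labels. The label set is an illustrative subset rather than the paper's coding schema, and the head is randomly initialised, so it would need fine-tuning on the annotations described above before its predictions mean anything.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative label subset; the paper's coding schema is finer-grained.
LABELS = ["denial_of_responsibility", "denial_of_victim", "no_technique"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

text = "Our emissions are irrelevant; other countries pollute far more than we do."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
# The classification head is untrained here: fine-tune on the annotated data
# before trusting this prediction.
print(LABELS[logits.argmax(dim=-1).item()])
```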

2018

Topic Intrusion for Automatic Topic Model Evaluation
Shraey Bhatia | Jey Han Lau | Timothy Baldwin
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Topic coherence is increasingly being used to evaluate topic models and filter topics for end-user applications. Topic coherence measures how well topic words relate to each other, but offers little insight on the utility of the topics in describing the documents. In this paper, we explore the topic intrusion task — the task of guessing an outlier topic given a document and a few topics — and propose a method to automate it. We improve upon the state-of-the-art substantially, demonstrating its viability as an alternative method for topic model evaluation.
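To make the automation concrete, here is a simple stand-in judge: each candidate topic is scored against the document, and the least related one is flagged as the intruder. The mean word-embedding similarity used for scoring is an illustrative assumption, not the feature set of the paper.

```python
import numpy as np

def topic_doc_score(topic_words, doc_words, emb):
    """Mean cosine similarity between topic words and document words.
    emb maps a word to an np.ndarray (e.g. pretrained word2vec vectors)."""
    sims = [emb[t] @ emb[d] / (np.linalg.norm(emb[t]) * np.linalg.norm(emb[d]))
            for t in topic_words for d in doc_words
            if t in emb and d in emb]
    return float(np.mean(sims)) if sims else 0.0

def guess_intruder(candidate_topics, doc_words, emb):
    """Flag the topic least related to the document as the intruder."""
    scores = [topic_doc_score(words, doc_words, emb) for words in candidate_topics]
    return int(np.argmin(scores))
```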

2017

An Automatic Approach for Document-level Topic Model Evaluation
Shraey Bhatia | Jey Han Lau | Timothy Baldwin
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Topic models jointly learn topics and document-level topic distributions. Extrinsic evaluation of topic models tends to focus exclusively on topic-level evaluation, e.g. by assessing the coherence of topics. We demonstrate that there can be large discrepancies between topic-level and document-level model quality, and that basing model evaluation on topic-level analysis alone can be highly misleading. We propose a method for automatically predicting topic model quality based on analysis of document-level topic allocations, and provide empirical evidence for its robustness.
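The sketch below illustrates one way document-level allocations can feed an evaluation: for each document, check how well its dominant topic's top words actually appear in the document, and average across the corpus. This overlap score is a simple stand-in for the predictive features used in the paper.

```python
def document_level_quality(doc_topic_dists, topic_top_words, docs_tokens):
    """doc_topic_dists: per-document topic distributions (lists of floats);
    topic_top_words: top-N words for each topic;
    docs_tokens: tokenised documents."""
    scores = []
    for dist, tokens in zip(doc_topic_dists, docs_tokens):
        dominant = max(range(len(dist)), key=dist.__getitem__)
        top_words = set(topic_top_words[dominant])
        # Fraction of the dominant topic's top words present in the document.
        scores.append(len(top_words & set(tokens)) / len(top_words))
    return sum(scores) / len(scores)
```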

2016

Automatic Labelling of Topics with Neural Embeddings
Shraey Bhatia | Jey Han Lau | Timothy Baldwin
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Topics generated by topic models are typically represented as lists of terms. To reduce the cognitive overhead of interpreting these topics for end-users, we propose labelling a topic with a succinct phrase that summarises its theme or idea. Using Wikipedia document titles as label candidates, we compute neural embeddings for documents and words to select the most relevant labels for topics. Compared to a state-of-the-art topic labelling system, our methodology is simpler, more efficient, and finds better topic labels.
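A minimal sketch of the ranking step: candidate labels (e.g. Wikipedia titles) are scored by cosine similarity against the averaged topic-word vector. The paper combines document embeddings (for titles) and word embeddings; this sketch simplifies to a single shared vector space passed in as a plain dict, which is an assumption for illustration.

```python
import numpy as np

def rank_labels(topic_words, label_candidates, emb):
    """Rank candidate labels by cosine similarity to the averaged topic-word
    vector. emb maps a word or title to an np.ndarray; words and titles are
    assumed to share one vector space in this simplification."""
    topic_vec = np.mean([emb[w] for w in topic_words if w in emb], axis=0)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    scored = [(c, cos(topic_vec, emb[c])) for c in label_candidates if c in emb]
    return sorted(scored, key=lambda pair: -pair[1])

# e.g. rank_labels(["emission", "carbon", "warming"],
#                  ["Climate_change", "Football"], emb)[:3]
```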