Peyman Najafirad
2022
Mitigating Data Shift of Biomedical Research Articles for Information Retrieval and Semantic Indexing
Nima Ebadi | Anthony Rios | Peyman Najafirad
Proceedings of the Third Workshop on Scholarly Document Processing
Researchers have explored novel methods for both semantic indexing and information retrieval of biomedical research articles. However, most solutions treat each task independently, even though the two tasks are related. For instance, semantic indexes are generally used to filter results from an information retrieval system. Hence, one task can potentially improve the performance of models trained for the other. This study therefore proposes a unified retriever-ranker-based model to tackle both information retrieval (IR) and semantic indexing (SI). In particular, the proposed model can adapt to rapid shifts in scientific research. Our results show that the model effectively leverages task similarity to improve robustness to dataset shift. For SI, the Micro-F1 score increases by 8% and the LCA-F score improves by 5%. For IR, MAP increases by 5% on average.
2020
COVID-19 Surveillance through Twitter using Self-Supervised and Few Shot Learning
Brandon Lwowski | Peyman Najafirad
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Public health surveillance and virus tracking via social media can be a useful digital tool for contact tracing and preventing the spread of the virus. Nowadays, large volumes of COVID-19 tweets can be processed quickly in real time to offer information to researchers. Nonetheless, due to the absence of labeled data for COVID-19, preliminary supervised classifiers and semi-supervised self-labeled methods do not handle non-spherical data with adequate accuracy. Because seasonal influenza and the novel coronavirus share many symptoms, we propose using few-shot learning to fine-tune a semi-supervised model built on unlabeled COVID-19 data and a previously labeled influenza dataset, which can provide insights into COVID-19 that have not been investigated. The experimental results show the efficacy of the proposed model, which identifies COVID-19-related discussion in recently collected tweets with an accuracy of 86%.
Hate and Toxic Speech Detection in the Context of Covid-19 Pandemic using XAI: Ongoing Applied Research
David Hardage | Peyman Najafirad
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
As social distancing, self-quarantines, and travel restrictions have shifted many pandemic conversations to social media, the spread of hate speech has shifted there as well. While recent machine learning solutions for automated hate and offensive speech identification on Twitter are available, there are issues with their interpretability. We propose a novel use of learned feature importance that improves upon the performance of prior state-of-the-art text classification techniques while producing more easily interpretable decisions. We also discuss both technical and practical challenges that remain for this task.