Laura Biester


pdf bib
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
Oana Ignat | Zhijing Jin | Artem Abzaliev | Laura Biester | Santiago Castro | Naihao Deng | Xinyi Gao | Aylin Ece Gunal | Jacky He | Ashkan Kazemi | Muhammad Khalifa | Namho Koh | Andrew Lee | Siyang Liu | Do June Min | Shinka Mori | Joan C. Nwatu | Veronica Perez-Rosas | Siqi Shen | Zekun Wang | Winston Wu | Rada Mihalcea
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Recent progress in large language models (LLMs) has enabled the deployment of many generative NLP applications. At the same time, it has also led to a misleading public discourse that “it’s all been solved.” Not surprisingly, this has, in turn, made many NLP researchers – especially those at the beginning of their careers – worry about what NLP research area they should focus on. Has it all been solved, or what remaining questions can we work on regardless of LLMs? To address this question, this paper compiles NLP research directions rich for exploration. We identify fourteen different research areas encompassing 45 research directions that require new research and are not directly solvable by LLMs. While we identify many research areas, many others exist; we do not cover areas currently addressed by LLMs, but where LLMs lag behind in performance or those focused on LLM development. We welcome suggestions for other research directions to include:


pdf bib
Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI)
Laura Biester | Dorottya Demszky | Zhijing Jin | Mrinmaya Sachan | Joel Tetreault | Steven Wilson | Lu Xiao | Jieyu Zhao
Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI)

pdf bib
Analyzing the Effects of Annotator Gender across NLP Tasks
Laura Biester | Vanita Sharma | Ashkan Kazemi | Naihao Deng | Steven Wilson | Rada Mihalcea
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022

Recent studies have shown that for subjective annotation tasks, the demographics, lived experiences, and identity of annotators can have a large impact on how items are labeled. We expand on this work, hypothesizing that gender may correlate with differences in annotations for a number of NLP benchmarks, including those that are fairly subjective (e.g., affect in text) and those that are typically considered to be objective (e.g., natural language inference). We develop a robust framework to test for differences in annotation across genders for four benchmark datasets. While our results largely show a lack of statistically significant differences in annotation by males and females for these tasks, the framework can be used to analyze differences in annotation between various other demographic groups in future work. Finally, we note that most datasets are collected without annotator demographics and released only in aggregate form; we call on the community to consider annotator demographics as data is collected, and to release dis-aggregated data to allow for further work analyzing variability among annotators.


pdf bib
Building Location Embeddings from Physical Trajectories and Textual Representations
Laura Biester | Carmen Banea | Rada Mihalcea
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Word embedding methods have become the de-facto way to represent words, having been successfully applied to a wide array of natural language processing tasks. In this paper, we explore the hypothesis that embedding methods can also be effectively used to represent spatial locations. Using a new dataset consisting of the location trajectories of 729 students over a seven month period and text data related to those locations, we implement several strategies to create location embeddings, which we then use to create embeddings of the sequences of locations a student has visited. To identify the surface level properties captured in the representations, we propose a number of probing tasks such as the presence of a specific location in a sequence or the type of activities that take place at a location. We then leverage the representations we generated and employ them in more complex downstream tasks ranging from predicting a student’s area of study to a student’s depression level, showing the effectiveness of these location embeddings.

pdf bib
Quantifying the Effects of COVID-19 on Mental Health Support Forums
Laura Biester | Katie Matton | Janarthanan Rajendran | Emily Mower Provost | Rada Mihalcea
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

The COVID-19 pandemic, like many of the disease outbreaks that have preceded it, is likely to have a profound effect on mental health. Understanding its impact can inform strategies for mitigating negative consequences. In this work, we seek to better understand the effects of COVID-19 on mental health by examining discussions within mental health support communities on Reddit. First, we quantify the rate at which COVID-19 is discussed in each community, or subreddit, in order to understand levels of pandemic-related discussion. Next, we examine the volume of activity in order to determine whether the number of people discussing mental health has risen. Finally, we analyze how COVID-19 has influenced language use and topics of discussion within each subreddit.