Sijia Liu


MERCY: Multiple Response Ranking Concurrently in Realistic Open-Domain Conversational Systems
Sarik Ghazarian | Behnam Hedayatnia | Di Jin | Sijia Liu | Nanyun Peng | Yang Liu | Dilek Hakkani-Tur
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Automatic Evaluation (AE) and Response Selection (RS) models assign quality scores to various candidate responses and rank them in conversational setups. Prior response ranking research compares various models’ performance on synthetically generated test sets. In this work, we investigate the performance of model-based reference-free AE and RS models on our constructed response ranking datasets that mirror real-case scenarios of ranking candidates during inference time. Metrics’ unsatisfying performance can be interpreted as their low generalizability over more pragmatic conversational domains such as human-chatbot dialogs. To alleviate this issue, we propose a novel RS model called MERCY that simulates human behavior in selecting the best candidate by taking into account distinct candidates concurrently and learns to rank them. In addition, MERCY leverages natural language feedback as another component to help the ranking task by explaining why each candidate response is relevant/irrelevant to the dialog context. This feedback is generated by prompting large language models in a few-shot setup. Our experiments show the better performance of MERCY over baselines for the response ranking task in our curated realistic datasets.

DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines
Prakhar Gupta | Yang Liu | Di Jin | Behnam Hedayatnia | Spandana Gella | Sijia Liu | Patrick Lange | Julia Hirschberg | Dilek Hakkani-Tur
Findings of the Association for Computational Linguistics: EMNLP 2023

Dialogue models are able to generate coherent and fluent responses, but they can still be challenging to control and may produce non-engaging, unsafe results. This unpredictability diminishes user trust and can hinder the use of the models in the real world. To address this, we introduce DialGuide, a novel framework for controlling dialogue model behavior using natural language rules, or guidelines. These guidelines provide information about the context they are applicable to and what should be included in the response, allowing the models to generate responses that are more closely aligned with the developer’s expectations and intent. We evaluate DialGuide on three tasks in open-domain dialogue response generation: guideline selection, response generation, and response entailment verification. Our dataset contains 10,737 positive and 15,467 negative dialogue context-response-guideline triplets across two domains - chit-chat and safety. We provide baseline models for the tasks and benchmark their performance. We also demonstrate that DialGuide is effective in the dialogue safety domain, producing safe and engaging responses that follow developer guidelines.

PersLEARN: Research Training through the Lens of Perspective Cultivation
Yu-Zhe Shi | Shiqian Li | Xinyi Niu | Qiao Xu | Jiawen Liu | Yifan Xu | Shiyu Gu | Bingru He | Xinyang Li | Xinyu Zhao | Zijian Zhao | Yidong Lyu | Zhen Li | Sijia Liu | Lin Qiu | Jinhao Ji | Lecheng Ruan | Yuxi Ma | Wenjuan Han | Yixin Zhu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

Scientific research is inherently shaped by its authors’ perspectives, influenced by various factors such as their personality, community, or society. Junior researchers often face challenges in identifying the perspectives reflected in the existing literature and struggle to develop their own viewpoints. In response to this issue, we introduce PersLEARN, a tool designed to facilitate the cultivation of scientific perspectives, starting from a basic seed idea and progressing to a well-articulated framework. By interacting with a prompt-based model, researchers can develop their perspectives explicitly. Our human study reveals that scientific perspectives developed by students using PersLEARN exhibit a superior level of logical coherence and depth compared to those that did not. Furthermore, our pipeline outperforms baseline approaches across multiple domains of literature from various perspectives. These results suggest that PersLEARN could help foster a greater appreciation of diversity in scientific perspectives as an essential component of research training.


Improving Bot Response Contradiction Detection via Utterance Rewriting
Di Jin | Sijia Liu | Yang Liu | Dilek Hakkani-Tur
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

Though chatbots based on large neural models can often produce fluent responses in open domain conversations, one salient error type is contradiction or inconsistency with the preceding conversation turns. Previous work has treated contradiction detection in bot responses as a task similar to natural language inference, e.g., detecting the contradiction between a pair of bot utterances. However, utterances in conversations may contain co-references or ellipsis, and using these utterances as is may not always be sufficient for identifying contradictions. This work aims to improve contradiction detection by rewriting all bot utterances to restore co-references and ellipsis. We curated a new dataset for utterance rewriting and built a rewriting model on it. We empirically demonstrate that this model can produce satisfactory rewrites to make bot utterances more complete. Furthermore, using rewritten utterances improves contradiction detection performance significantly, e.g., the AUPR and joint accuracy scores (detecting contradiction along with evidence) increase by 6.5% and 4.5% (absolute increase), respectively.

A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction
Yong Xie | Dakuo Wang | Pin-Yu Chen | Jinjun Xiong | Sijia Liu | Oluwasanmi Koyejo
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

More and more investors and machine learning models rely on social media (e.g., Twitter and Reddit) to gather information and predict movements of stock prices. Although text-based models are known to be vulnerable to adversarial attacks, whether stock prediction models have similar vulnerability given necessary constraints is underexplored. In this paper, we experiment with a variety of adversarial attack configurations to fool three stock prediction victim models. We address the task of adversarial generation by solving combinatorial optimization problems with semantics and budget constraints. Our results show that the proposed attack method can achieve consistent success rates and cause significant monetary loss in trading simulation by simply concatenating a perturbed but semantically similar tweet.


Attention Neural Model for Temporal Relation Extraction
Sijia Liu | Liwei Wang | Vipin Chaudhary | Hongfang Liu
Proceedings of the 2nd Clinical Natural Language Processing Workshop

Neural network models have shown promise in the temporal relation extraction task. In this paper, we present an attention-based neural network model to extract the containment relations within sentences from clinical narratives. The attention mechanism used on top of a GRU model outperforms the existing state-of-the-art neural network models on the THYME corpus in intra-sentence temporal relation extraction.


MayoNLP at SemEval 2017 Task 10: Word Embedding Distance Pattern for Keyphrase Classification in Scientific Publications
Sijia Liu | Feichen Shen | Vipin Chaudhary | Hongfang Liu
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper, we present MayoNLP’s results from the participation in the ScienceIE shared task at SemEval 2017. We focused on the keyphrase classification task (Subtask B). We explored semantic similarities and patterns of keyphrases in scientific publications using pre-trained word embedding models. Word Embedding Distance Pattern, which uses the head noun word embedding to generate distance patterns based on labeled keyphrases, is proposed as an incremental feature set to enhance the conventional Named Entity Recognition feature sets. A support vector machine is used as the supervised classifier for keyphrase classification. Our system achieved an overall F1 score of 0.67 for keyphrase classification and 0.64 for keyphrase classification and relation detection.