Lis Pereira

2025

AMR-RE: Abstract Meaning Representations for Retrieval-Based In-Context Learning in Relation Extraction
Peitao Han | Lis Pereira | Fei Cheng | Wan Jou She | Eiji Aramaki
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)

Existing in-context learning (ICL) methods for relation extraction (RE) often prioritize language similarity over structural similarity, which may result in overlooking entity relationships. We propose an AMR-enhanced retrieval-based ICL method for RE to address this issue. Our model retrieves in-context examples based on semantic structure similarity between task inputs and training samples. We conducted experiments in the supervised setting on four standard English RE datasets. The results show that our method achieves state-of-the-art performance on three datasets and competitive results on the fourth. Furthermore, our method outperforms baselines by a large margin across all datasets in the more demanding unsupervised setting.

EmplifAI: a Fine-grained Dataset for Japanese Empathetic Medical Dialogues in 28 Emotion Labels
Wan Jou She | Lis Pereira | Fei Cheng | Sakiko Yahata | Panote Siriaraya | Eiji Aramaki
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

This paper introduces EmplifAI, a Japanese empathetic dialogue dataset designed to support patients coping with chronic medical conditions. These patients often experience a wide range of positive and negative emotions (e.g., hope and despair) that shift across different stages of disease management. EmplifAI addresses this complexity by providing situation-based dialogues grounded in 28 fine-grained emotion categories, adapted and validated from the GoEmotions taxonomy. The dataset includes 280 medically contextualized situations and 4,125 two-turn dialogues, collected through crowdsourcing and expert review. To evaluate emotional alignment in empathetic dialogues, we assessed model predictions on situation–dialogue pairs using BERTScore across multiple large language models (LLMs), achieving F1 scores of ≤ 0.83. Fine-tuning a baseline Japanese LLM (LLM-jp-3.1-13b-instruct4) with EmplifAI resulted in notable improvements in fluency, general empathy, and emotion-specific empathy. Furthermore, we compared the scores assigned by LLM-as-a-Judge and human raters on dialogues generated by multiple LLMs to validate our evaluation pipeline and discuss the insights and potential risks derived from the correlation analysis.

2024

Prior Knowledge-Guided Adversarial Training
Lis Pereira | Fei Cheng | Wan Jou She | Masayuki Asahara | Ichiro Kobayashi
Proceedings of the 9th Workshop on Representation Learning for NLP (RepL4NLP-2024)

We introduce a simple yet effective Prior Knowledge-Guided ADVersarial Training (PKG-ADV) algorithm to improve adversarial training for natural language understanding. Our method simply utilizes task-specific label distribution to guide the training process. By prioritizing the use of prior knowledge of labels, we aim to generate more informative adversarial perturbations. We apply our model to several challenging temporal reasoning tasks. Our method enables a more reliable and controllable data training process than relying on randomized adversarial perturbation. Albeit simple, our method achieved significant improvements in these tasks. To facilitate further research, we will release the code and models.

QA-based Event Start-Points Ordering for Clinical Temporal Relation Annotation
Seiji Shimizu | Lis Pereira | Shuntaro Yada | Eiji Aramaki
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Temporal relation annotation in the clinical domain is crucial yet challenging due to its workload and the medical expertise required. In this paper, we propose a novel annotation method that integrates event start-points ordering and question-answering (QA) as the annotation format. By focusing only on two points on a timeline, start-points ordering reduces ambiguity and simplifies the relation set to be considered during annotation. QA as annotation recasts temporal relation annotation into a reading comprehension task, allowing annotators to use natural language instead of the formalisms commonly adopted in temporal relation annotation. Based on our method, most of the relations in a document are inferable from a significantly smaller number of explicitly annotated relations, showing the efficiency of our proposed method. Using these inferred relations, we develop a temporal relation classification model that achieves a 0.72 F1 score. Also, by decomposing the annotation process into QA generation and QA validation, our method enables collaboration among medical experts and non-experts. We obtained high inter-annotator agreement (IAA) scores, which indicate the positive prospect of such collaboration in the annotation process. Our annotated corpus, annotation tool, and trained model are publicly available: https://github.com/seiji-shimizu/qa-start-ordering.

2022

OCHADAI at SemEval-2022 Task 2: Adversarial Training for Multilingual Idiomaticity Detection
Lis Pereira | Ichiro Kobayashi
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

We propose a multilingual adversarial training model for determining whether a sentence contains an idiomatic expression. Given that a key challenge with this task is the limited size of annotated data, our model relies on pre-trained contextual representations from different multilingual state-of-the-art transformer-based language models (i.e., multilingual BERT and XLM-RoBERTa), and on adversarial training, a training method for further enhancing model generalization and robustness. Without relying on any human-crafted features, knowledge bases, or additional datasets other than the target datasets, our model achieved competitive results and ranked 6th in the SubTask A (zero-shot) setting and 15th in the SubTask A (one-shot) setting.

2021

Posterior Differential Regularization with f-divergence for Improving Model Robustness
Hao Cheng | Xiaodong Liu | Lis Pereira | Yaoliang Yu | Jianfeng Gao
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We address the problem of enhancing model robustness through regularization. Specifically, we focus on methods that regularize the model posterior difference between clean and noisy inputs. Theoretically, we provide a connection of two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework. Additionally, we generalize the posterior differential regularization to the family of f-divergences and characterize the overall framework in terms of the Jacobian matrix. Empirically, we compare those regularizations and standard BERT training on a diverse set of tasks to provide a comprehensive profile of their effect on model generalization. For both fully supervised and semi-supervised settings, we show that regularizing the posterior difference with f-divergence can result in well-improved model robustness. In particular, with a proper f-divergence, a BERT-base model can achieve comparable generalization as its BERT-large counterpart for in-domain, adversarial and domain shift scenarios, indicating the great potential of the proposed framework for enhancing NLP model robustness.

OCHADAI at SMM4H-2021 Task 5: Classifying self-reporting tweets on potential cases of COVID-19 by ensembling pre-trained language models
Ying Luo | Lis Pereira | Ichiro Kobayashi
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task

Since the outbreak of coronavirus at the end of 2019, there have been numerous studies on coronavirus in the NLP arena. Meanwhile, Twitter has been a valuable source of news and a public medium for the conveyance of information and personal expression. This paper describes the system developed by the Ochadai team for the Social Media Mining for Health Applications (SMM4H) 2021 Task 5, which aims to automatically distinguish English tweets that self-report potential cases of COVID-19 from those that do not. We proposed a model ensemble that leverages pre-trained representations from COVID-Twitter-BERT (Müller et al., 2020), RoBERTa (Liu et al., 2019), and Twitter-RoBERTa (Glazkova et al., 2021). Our model obtained F1-scores of 76% on the test set in the evaluation phase, and 77.5% in the post-evaluation phase.

Targeted Adversarial Training for Natural Language Understanding
Lis Pereira | Xiaodong Liu | Hao Cheng | Hoifung Poon | Jianfeng Gao | Ichiro Kobayashi
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We present a simple yet effective Targeted Adversarial Training (TAT) algorithm to improve adversarial training for natural language understanding. The key idea is to introspect current mistakes and prioritize adversarial training steps to where the model errs the most. Experiments show that TAT can significantly improve accuracy over standard adversarial training on GLUE and attain new state-of-the-art zero-shot results on XNLI. Our code will be released upon acceptance of the paper.

ALICE++: Adversarial Training for Robust and Effective Temporal Reasoning
Lis Pereira | Fei Cheng | Masayuki Asahara | Ichiro Kobayashi
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

2020

Adversarial Training for Commonsense Inference
Lis Pereira | Xiaodong Liu | Fei Cheng | Masayuki Asahara | Ichiro Kobayashi
Proceedings of the 5th Workshop on Representation Learning for NLP

We apply small perturbations to word embeddings and minimize the resultant adversarial risk to regularize the model. We exploit a novel combination of two different approaches to estimate these perturbations: 1) using the true label and 2) using the model prediction. Without relying on any human-crafted features, knowledge bases, or additional datasets other than the target datasets, our model boosts the fine-tuning performance of RoBERTa, achieving competitive results on multiple reading comprehension datasets that require commonsense inference.

2017

Lexical Simplification with the Deep Structured Similarity Model
Lis Pereira | Xiaodong Liu | John Lee
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

We explore the application of a Deep Structured Similarity Model (DSSM) to ranking in lexical simplification. Our results show that the DSSM can effectively capture fine-grained features to perform semantic matching when ranking substitution candidates, outperforming the state-of-the-art on two standard datasets used for the task.

2015

Collocation Assistant for Learners of Japanese as a Second Language
Lis Pereira | Yuji Matsumoto
Proceedings of the 2nd Workshop on Natural Language Processing Techniques for Educational Applications

2014

Identifying collocations using cross-lingual association measures
Lis Pereira | Elga Strafella | Kevin Duh | Yuji Matsumoto
Proceedings of the 10th Workshop on Multiword Expressions (MWE)

Collocation or Free Combination? — Applying Machine Translation Techniques to identify collocations in Japanese
Lis Pereira | Elga Strafella | Yuji Matsumoto
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This work presents an initial investigation into how to distinguish collocations from free combinations. The assumption is that, while free combinations can be literally translated, the overall meaning of a collocation is different from the sum of the translations of its parts. Based on that, we verify whether a machine translation system can help us make such a distinction. Results show that it improves the precision compared with standard methods of collocation identification through statistical association measures.