Tu Nguyen


2025

How Persuasive Is Your Context?
Tu Nguyen | Kevin Du | Alexander Miserlis Hoyle | Ryan Cotterell
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Two central capabilities of language models (LMs) are: (i) drawing on prior knowledge about entities, which allows them to answer queries such as What’s the official language of Austria?, and (ii) adapting to new information provided in context, e.g., Pretend the official language of Austria is Tagalog., which is prepended to the question. In this article, we introduce the targeted persuasion score (TPS), designed to quantify how persuasive a given context is to an LM, where persuasion is operationalized as the ability of the context to alter the LM’s answer to the question. In contrast to evaluating persuasiveness only through a model’s most likely answer, TPS provides a more fine-grained view of model behavior. Based on the Wasserstein distance, TPS measures how much a context shifts a model’s original answer distribution toward a target distribution. Empirically, through a series of experiments, we show that TPS captures a more nuanced notion of persuasiveness than previously proposed metrics.
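A minimal sketch of the idea described in the abstract, not the paper's exact definition: here TPS is assumed (hypothetically) to be the normalized reduction in Wasserstein distance from the model's answer distribution to a target distribution once the persuasive context is prepended. With a 0/1 ground metric over discrete answers, the 1-Wasserstein distance coincides with total variation distance, which keeps the example self-contained. All names and the toy numbers are illustrative.

import numpy as np

def wasserstein_discrete(p: np.ndarray, q: np.ndarray) -> float:
    """1-Wasserstein distance under the discrete (0/1) ground metric,
    which coincides with total variation distance."""
    return 0.5 * float(np.abs(p - q).sum())

def targeted_persuasion_score(p_prior: np.ndarray,
                              p_context: np.ndarray,
                              p_target: np.ndarray) -> float:
    """How far does the context move the answer distribution toward the
    target, relative to how far it started? 1.0 = fully persuaded,
    0.0 = unmoved, negative = pushed away from the target.
    (Illustrative definition, not necessarily the paper's.)"""
    d_before = wasserstein_discrete(p_prior, p_target)
    d_after = wasserstein_discrete(p_context, p_target)
    if d_before == 0.0:  # already at the target distribution
        return 0.0
    return (d_before - d_after) / d_before

# Toy example over three candidate answers: ["German", "Tagalog", "Other"]
p_prior   = np.array([0.90, 0.02, 0.08])   # p(answer | question)
p_context = np.array([0.30, 0.60, 0.10])   # p(answer | context, question)
p_target  = np.array([0.00, 1.00, 0.00])   # point mass on the asserted answer
print(targeted_persuasion_score(p_prior, p_context, p_target))  # ~0.59

The point of the normalization is that a context which fully flips the answer distribution onto the target scores 1.0 regardless of how confident the model was beforehand, while measuring only the most likely answer would miss partial shifts like the one above.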

2024

Noise Contrastive Estimation-based Matching Framework for Low-Resource Security Attack Pattern Recognition
Tu Nguyen | Nedim Šrndić | Alexander Neth
Findings of the Association for Computational Linguistics: EACL 2024

Tactics, Techniques and Procedures (TTP) mapping is an important and difficult task in the application of cyber threat intelligence (CTI) extraction for threat reports. TTPs are typically expressed in semantic forms within security knowledge bases like MITRE ATT&CK, serving as high-level textual descriptions of sophisticated attack patterns. Conversely, attacks in CTI threat reports are detailed in a combination of natural and technical language forms, presenting a significant challenge even for security experts to establish correlations or mappings with the corresponding TTPs. Conventional learning approaches often target the TTP mapping problem in the classical multiclass/multilabel classification setting. This setting hinders the learning capabilities of the model due to the large number of classes (i.e., TTPs), the inevitable skewness of the label distribution, and the complex hierarchical structure of the label space. In this work, we approach the problem in a different learning paradigm, such that the assignment of a text to a TTP label is essentially decided by the direct semantic similarity between the two, thus reducing the complexity of competing solely over the large labeling space. To that end, we propose a neural matching architecture that incorporates a sampling-based learn-to-compare mechanism, facilitating the learning process of the matching model despite constrained resources.
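A minimal PyTorch sketch of a sampling-based learn-to-compare objective in the spirit of the abstract, not the paper's actual architecture: the encoder, component names, and loss form are assumptions. Each report text is compared against its gold TTP description plus K randomly sampled negative TTP descriptions, and a contrastive (noise-contrastive-style) loss trains the model to rank the gold pair highest, so the model never has to compete over the full label space at once.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MatchingModel(nn.Module):
    def __init__(self, vocab_size: int = 30522, dim: int = 256):
        super().__init__()
        # Stand-in encoder; a pretrained text encoder would be used in practice.
        self.embed = nn.EmbeddingBag(vocab_size, dim)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.embed(token_ids), dim=-1)

    def forward(self, text_ids, ttp_ids):
        # text_ids: (B, L)      report-text tokens
        # ttp_ids:  (B, 1+K, L) gold TTP description + K sampled negatives
        B, C, L = ttp_ids.shape
        text_vec = self.encode(text_ids)                        # (B, D)
        ttp_vec = self.encode(ttp_ids.view(B * C, L)).view(B, C, -1)
        # Similarity between the report text and each candidate TTP description.
        return torch.einsum("bd,bcd->bc", text_vec, ttp_vec)    # (B, 1+K)

model = MatchingModel()
text_ids = torch.randint(0, 30522, (4, 64))        # 4 reports (toy data)
ttp_ids = torch.randint(0, 30522, (4, 1 + 8, 32))  # gold + 8 negatives each
logits = model(text_ids, ttp_ids)
# Gold candidate sits at index 0; the contrastive loss pushes it above the negatives.
loss = F.cross_entropy(logits, torch.zeros(4, dtype=torch.long))
loss.backward()

Because the loss only ever compares a text against a small sampled candidate set, the training signal is less sensitive to the size and skew of the full TTP label space than a flat multiclass classifier would be, which is the kind of benefit the abstract attributes to the matching formulation.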