Phuong Nguyen


2023

StructSP: Efficient Fine-tuning of Task-Oriented Dialog System by Using Structure-aware Boosting and Grammar Constraints
Truong Do | Phuong Nguyen | Minh Nguyen
Findings of the Association for Computational Linguistics: ACL 2023

We have investigated methods utilizing hierarchical structure information representation in the semantic parsing task and have devised a method that reinforces the semantic awareness of a pre-trained language model via a two-step fine-tuning mechanism: hierarchical structure information strengthening and a final specific task. The model used is better than existing ones at learning the contextual representations of utterances embedded within its hierarchical semantic structure and thereby improves system performance. In addition, we created a mechanism using inductive grammar to dynamically prune the unpromising directions in the semantic structure parsing process. Finally, through experiments on the TOP and TOPv2 (low-resource setting) datasets, we achieved state-of-the-art (SOTA) performance, confirming the effectiveness of our proposed model. Our code will be published when this paper is accepted.
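The grammar-based pruning idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that the grammar is induced as the set of child labels observed under each parent label in training trees, and that pruning means discarding candidate labels the grammar never saw under the current parent. The tuple-based tree encoding, the function names `induce_grammar` and `prune`, and the candidate scores are all hypothetical.

```python
from collections import defaultdict

def induce_grammar(trees):
    """Collect, for each parent label, the child labels seen in training trees.

    Each tree node is a (label, children) tuple; this encoding is an assumption.
    """
    allowed = defaultdict(set)

    def visit(node):
        label, children = node
        for child in children:
            allowed[label].add(child[0])
            visit(child)

    for tree in trees:
        visit(tree)
    return allowed

def prune(parent, scored_candidates, allowed):
    """Keep only the candidate labels that the induced grammar permits under `parent`."""
    return {lab: s for lab, s in scored_candidates.items() if lab in allowed[parent]}

# One toy training tree in a TOP-like intent/slot nesting (illustrative only).
trees = [
    ("IN:GET_DIRECTIONS", [
        ("SL:DESTINATION", [
            ("IN:GET_EVENT", [("SL:CATEGORY_EVENT", [])]),
        ]),
    ]),
]
grammar = induce_grammar(trees)

# Hypothetical decoder scores for the next label under a given parent.
candidates = {"SL:DESTINATION": 0.6, "SL:CATEGORY_EVENT": 0.3, "IN:GET_EVENT": 0.1}
kept = prune("IN:GET_DIRECTIONS", candidates, grammar)
```

In this sketch the decoder would only consider `SL:DESTINATION` under `IN:GET_DIRECTIONS`, since the other two labels were never observed as its children.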

2022

Complex Word Identification in Vietnamese: Towards Vietnamese Text Simplification
Phuong Nguyen | David Kauchak
Proceedings of the Workshop on Multilingual Information Access (MIA)

Text Simplification has been an extensively researched problem in English, but has not been investigated in Vietnamese. We focus on the Vietnamese-specific Complex Word Identification task, often the first step in Lexical Simplification (Shardlow, 2013). We examine three different Vietnamese datasets constructed for other Natural Language Processing tasks and show that, as in other languages, frequency is a strong signal in determining whether a word is complex, with a mean accuracy of 86.87%. Across the datasets, we find that the 10% most frequent words in a corpus can be labelled as simple, and the rest as complex, though this is more variable for smaller corpora. We also examine how human annotators perform at this task. Given the subjective nature of the task, there is a fair amount of variability in which words are seen as difficult, though majority results are more consistent.
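The frequency heuristic described in the abstract (the 10% most frequent words labelled simple, the rest complex) can be sketched as follows. This is an illustrative reading, not the authors' code: the function name `label_by_frequency`, the alphabetical tie-breaking, the floor of one simple word, and the toy token list are all assumptions, and the sketch presumes the text has already been word-segmented.

```python
from collections import Counter

def label_by_frequency(tokens, simple_fraction=0.10):
    """Label each distinct word 'simple' or 'complex' by its frequency rank."""
    counts = Counter(tokens)
    # Rank by descending count; ties are broken alphabetically for determinism.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    # Treat at least one word as simple so very small vocabularies still work.
    cutoff = max(1, int(len(ranked) * simple_fraction))
    simple = set(ranked[:cutoff])
    return {w: ("simple" if w in simple else "complex") for w in ranked}

# Toy pre-segmented corpus, invented for illustration (10 distinct words).
tokens = ("tôi " * 5 + "là " * 4 + "sinh_viên " * 3 + "học " * 2
          + "a b c d e f").split()
labels = label_by_frequency(tokens)
```

With ten distinct words and a 10% threshold, only the single most frequent word ends up labelled simple. On a real corpus the threshold would apply to a much larger vocabulary.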

2021

CovRelex: A COVID-19 Retrieval System with Relation Extraction
Vu Tran | Van-Hien Tran | Phuong Nguyen | Chau Nguyen | Ken Satoh | Yuji Matsumoto | Minh Nguyen
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

This paper presents CovRelex, a scientific paper retrieval system targeting entities and relations via relation extraction on COVID-19 scientific papers. This work aims to build a system that helps users efficiently acquire knowledge from the large and rapidly growing body of COVID-19 scientific papers. Our system can be accessed via https://www.jaist.ac.jp/is/labs/nguyen-lab/systems/covrelex/.