Bang Nguyen
2024
Reference-based Metrics Disprove Themselves in Question Generation
Bang Nguyen | Mengxia Yu | Yun Huang | Meng Jiang
Findings of the Association for Computational Linguistics: EMNLP 2024
Reference-based metrics such as BLEU and BERTScore are widely used to evaluate question generation (QG). In this study, on QG benchmarks such as SQuAD and HotpotQA, we find that using human-written references cannot guarantee the effectiveness of reference-based metrics. Most QG benchmarks have only one reference; we replicate the annotation process and collect another reference. A good metric is expected to grade a human-validated question no worse than generated questions. However, the results of reference-based metrics on our newly collected reference disprove the metrics themselves. We propose a reference-free metric consisting of multi-dimensional criteria such as naturalness, answerability, and complexity, utilizing large language models. These criteria are not constrained to the syntax or semantics of a single reference question, and the metric does not require a diverse set of references. Experiments reveal that our metric accurately distinguishes between high-quality questions and flawed ones, and achieves state-of-the-art alignment with human judgment.
2023
Embedding Mental Health Discourse for Community Recommendation
Hy Dang | Bang Nguyen | Noah Ziems | Meng Jiang
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)
Our paper investigates the use of discourse embedding techniques to develop a community recommendation system that focuses on mental health support groups on social media. Social media platforms provide a means for users to anonymously connect with communities that cater to their specific interests. However, with the vast number of online communities available, users may face difficulties in identifying relevant groups to address their mental health concerns. To address this challenge, we explore the integration of discourse information from various subreddit communities using embedding techniques to develop an effective recommendation system. Our approach involves the use of content-based and collaborative filtering techniques to enhance the performance of the recommendation system. Our findings indicate that the proposed approach outperforms the use of each technique separately and provides interpretability in the recommendation process.
2020
Introducing a Large-Scale Dataset for Vietnamese POS Tagging on Conversational Texts
Oanh Tran | Tu Pham | Vu Dang | Bang Nguyen
Proceedings of the Twelfth Language Resources and Evaluation Conference
This paper introduces a large-scale human-labeled dataset for the Vietnamese POS tagging task on conversational texts. To this end, we propose a new tagging scheme (with 36 POS tags) consisting of exclusive tags for special phenomena of conversational words, develop the annotation guideline, and manually annotate 16.310K sentences using this guideline. Based on this corpus, a series of state-of-the-art tagging methods was evaluated to estimate their performance. Experimental results showed that the Conditional Random Fields model using both automatically learnt features from deep neural networks and handcrafted features yielded the best performance. This model achieved 93.36% accuracy, which is 1.6% and 2.7% higher than the model using either handcrafted features or automatically-learnt features, respectively. This result is also slightly higher, by 0.94% in accuracy, than that of a fine-tuned BERT model. The performance measured on each POS tag is also very high, with >90% F1 score for 20 POS tags and >80% F1 score for 11 POS tags. This work provides the public dataset and preliminary results for follow-up research in this direction.
Co-authors
- Meng Jiang 2
- Mengxia Yu 1
- Yun Huang 1
- Oanh Tran 1
- Tu Pham 1