Junzhe Zhao


2024

NCL_NLP at SemEval-2024 Task 7: CoT-NumHG: A CoT-Based SFT Training Strategy with Large Language Models for Number-Focused Headline Generation
Junzhe Zhao | Yingxi Wang | Huizhi Liang | Nicolay Rusnachenko
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

Headline Generation is an essential task in Natural Language Processing (NLP), where models often exhibit limited ability to accurately interpret numerals, leading to inaccuracies in generated headlines. This paper introduces CoT-NumHG, a training strategy leveraging the Chain of Thought (CoT) paradigm for Supervised Fine-Tuning (SFT) of large language models. The approach aims to improve numeral perception, interpretability, accuracy, and the generation of structured outputs. It was presented in SemEval-2024 Task 7 (task 3), Numeral-Aware Headline Generation (English), a challenge divided into two subtasks. The first subtask focuses on numerical reasoning, requiring models to precisely calculate and fill in the missing numbers in news headlines, while the second subtask targets the generation of complete headlines. Although the same training strategy is used for both subtasks, this study primarily explores the first subtask as a demonstration of the strategy. In this competition, our CoT-NumHG-Mistral-7B model attained an accuracy of 94%, underscoring the effectiveness of the proposed strategy.

2023

Legal_try at SemEval-2023 Task 6: Voting Heterogeneous Models for Entities identification in Legal Documents
Junzhe Zhao | Yingxi Wang | Nicolay Rusnachenko | Huizhi Liang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Named Entity Recognition (NER) is a subtask of Natural Language Processing (NLP) that involves identifying and categorizing named entities. The resulting annotation makes unstructured natural texts usable for other NLP tasks, including information retrieval, question answering, and machine translation. NER is also essential in the legal domain as an initial stage in extracting relevant entities. However, legal texts contain domain-specific named entities, such as applicants, defendants, courts, statutes, and articles, which makes standard named entity recognizers incompatible with legal documents. This paper proposes an approach that combines the results of multiple models via a voting mechanism for unique entity identification in legal texts. The work focuses on extracting legal named entities, and the specific assignment (task B) is to create a legal NER system for unique entity annotation in legal documents. The results of our experiments and the system implementation are published at https://github.com/SuperEDG/Legal_Project.

2006

Discovering Relations among Named Entities by Detecting Community Structure
Tingting He | Junzhe Zhao | Jing Li
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation