Jian-Tao Huang

Also published as: Jian-tao Huang


2024

NumHG: A Dataset for Number-Focused Headline Generation
Jian-Tao Huang | Chung-Chi Chen | Hen-Hsen Huang | Hsin-Hsi Chen
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Headline generation, a key task in abstractive summarization, aims to condense a full-length article into a single succinct line of text. Notably, while contemporary encoder-decoder models excel on the ROUGE metric, they often falter at generating the numerals in headlines precisely. We identify the lack of datasets with fine-grained annotations for accurate numeral generation as a major roadblock. To address this, we introduce a new dataset, NumHG, providing over 27,000 annotated numeral-rich news articles for detailed investigation. Further, we evaluate five well-performing models from previous headline-generation tasks via human evaluation of numerical accuracy, reasonableness, and readability. Our study reveals a need for improvement in numerical accuracy, demonstrating the potential of NumHG to drive progress in number-focused headline generation and to stimulate further discussion of numeral-focused text generation.
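The evaluation hinges on numerical accuracy: whether the numerals in a generated headline match the source article. Below is a minimal sketch of such a check; the regex-based extraction and the names `extract_numerals` and `numeral_accuracy` are illustrative assumptions, not the paper's protocol, which relies on human judges.

```python
import re

def extract_numerals(text: str) -> list[str]:
    # Capture integers and decimals, including comma-grouped forms like "27,000".
    # A deliberately simple pattern; NumHG's evaluation uses human judgment.
    return re.findall(r"\d+(?:,\d{3})*(?:\.\d+)?", text)

def numeral_accuracy(generated: str, reference: str) -> float:
    """Fraction of reference numerals reproduced exactly in the generated headline."""
    gold = extract_numerals(reference)
    if not gold:
        return 1.0  # nothing numeric to get wrong
    pred = extract_numerals(generated)
    hits = sum(1 for g in gold if g in pred)
    return hits / len(gold)

# Hypothetical example: the model rounds the figure, so the numeral check fails
# even though a ROUGE-style overlap score would barely notice.
print(numeral_accuracy(
    generated="Startup raises $12 million in Series A",
    reference="Startup raises $12.5 million in Series A",
))  # 0.0 -- "12.5" was not reproduced exactly
```

This illustrates the paper's central point: surface-overlap metrics such as ROUGE reward the near-miss headline, while a numeral-focused check does not.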

SemEval-2024 Task 7: Numeral-Aware Language Understanding and Generation
Chung-chi Chen | Jian-tao Huang | Hen-hsen Huang | Hiroya Takamura | Hsin-hsi Chen
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

Numbers appear frequently in both everyday narratives and professional documents, such as clinical notes, scientific papers, financial documents, and legal court orders. The ability to understand and generate numbers is thus an essential aspect of evaluating large language models. To this end, we propose a collection of datasets in SemEval-2024 Task 7: NumEval. The collection encompasses several tasks focused on numeral-aware instances, including number prediction, natural language inference, question answering, reading comprehension, reasoning, and headline generation. This paper offers an overview of the datasets and presents the results of all NumEval subtasks. Additionally, we summarize participants' methods and conduct an error analysis. To the best of our knowledge, NumEval is one of the first SemEval tasks to employ peer evaluation. We share our observations on this aspect and offer suggestions for future SemEval tasks.
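For flavor, here is what a numeral-aware instance in the number-prediction style might look like, together with a trivial copying baseline. The field names and the `<NUM>` mask token are hypothetical illustrations, not the actual NumEval data format.

```python
import re
import random

# Hypothetical numeral-aware number-prediction instance: the model must choose
# the numeral that fills the mask. Field names and <NUM> are illustrative only.
instance = {
    "text": "The company hired <NUM> engineers, doubling its team of 40.",
    "options": ["20", "40", "80", "160"],
    "answer": "80",
}

def copy_baseline(example: dict) -> str:
    """Naive baseline: guess an option that already appears in the context,
    falling back to a random option. Real systems need numerical reasoning
    ('doubling ... 40' implies 80), which surface copying lacks."""
    in_context = set(re.findall(r"\d+(?:\.\d+)?", example["text"]))
    candidates = [o for o in example["options"] if o in in_context]
    return random.choice(candidates or example["options"])

print(copy_baseline(instance))  # "40" -- copying fails; reasoning is required
```

The gap between such baselines and the gold answer is exactly what numeral-aware benchmarks of this kind are designed to expose.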