Aurora Gensale


2025

Automatic and early detection of foodborne hazards is crucial for preventing outbreaks. Existing AI-based solutions often struggle with the complexity and noise of food recall reports and overlook the dependency between product and hazard labels. To address these challenges, we introduce a methodology for classifying reports on food-related incidents. Our approach combines LLM-based information extraction, which reduces report variability, with a two-stage classification pipeline. The first model assigns coarse-grained labels, narrowing the space of eligible fine-grained labels for the second model. This sequential process captures the hierarchical dependencies between products and hazards and their respective categories. Additionally, we design each model with two classification heads, one for products and one for hazards, to exploit the inherent relations between food products and their associated hazards. We validate our approach on two multi-label classification sub-tasks. Experimental results demonstrate its effectiveness, with improvements of +30% and +40% in classification performance over the baseline.
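The coarse-to-fine narrowing described above can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the `HIERARCHY` mapping, the score dictionaries, and the helper functions are hypothetical, and plain score dictionaries stand in for the outputs of the two learned models. Only a single head (hazards) is shown.

```python
# Hypothetical label hierarchy: coarse category -> eligible fine-grained labels.
# These names are illustrative only and are not taken from the paper's data.
HIERARCHY = {
    "biological": ["salmonella", "listeria monocytogenes"],
    "allergens": ["milk", "peanuts"],
}


def argmax_label(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


def two_stage_predict(coarse_scores, fine_scores, hierarchy):
    """Pick a coarse label, then the best fine label among its children."""
    # Stage 1: the coarse-grained prediction.
    coarse = argmax_label(coarse_scores)
    # Stage 2: keep only the fine-grained labels eligible under that prediction.
    eligible = hierarchy[coarse]
    restricted = {label: fine_scores[label] for label in eligible}
    fine = argmax_label(restricted)
    return coarse, fine


# Assumed scores standing in for the outputs of the two models.
coarse_scores = {"biological": 0.7, "allergens": 0.3}
fine_scores = {
    "salmonella": 0.2,
    "listeria monocytogenes": 0.35,
    "milk": 0.4,
    "peanuts": 0.05,
}

print(two_stage_predict(coarse_scores, fine_scores, HIERARCHY))
```

In this toy example the unrestricted fine-grained argmax would be "milk", which is inconsistent with the coarse prediction "biological"; restricting the candidate set yields a fine-grained label that respects the hierarchy.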