Marco Zocca


2025

This paper presents lessons learned from implementing machine translation systems at a global medical technology company. We describe system challenges, legal and security considerations, and the critical role of human-in-the-loop validation for quality assurance and responsible deployment. In addition, based on an experiment involving over 11,000 ranked translations, we report reviewer preferences for outputs from small and large language models under various prompting configurations, using a domain-specific dataset spanning five language pairs.

2023

By grounding natural language inference in code (and vice versa), researchers aim to create programming assistants that explain their work, are “coachable”, and can surface any gaps in their reasoning. Can we automatically deduce interesting properties of programs from their syntax and common-sense annotations alone, without resorting to static analysis? How much of program logic and behaviour can be captured in natural language? To stimulate research in this direction and to attempt to answer these questions, we propose HTL, a dataset and protocol for annotating programs with natural language predicates at a finer granularity than code comments and without relying on internal compiler representations. The dataset is available at https://doi.org/10.5281/zenodo.7893113.