Jesús-Andrés Ferrer

Also published as: Jesús Andrés-Ferrer, Jesus Andres-Ferrer

2023

We present our work on building large-scale sequence-to-sequence models for generating clinical notes from patient-doctor conversations. This is formulated as an abstractive summarization task, for which we use an encoder-decoder transformer model with a pointer-generator. We discuss various modeling enhancements to this baseline model, which include using subword and multiword tokenization schemes, prefixing the targets with a chain-of-clinical-facts, and training with a contrastive loss that is defined over various candidate summaries. We also use flash attention during training and query chunked attention during inference to be able to process long input and output sequences and to improve computational efficiency. Experiments are conducted on a dataset containing about 900K encounters from around 1800 healthcare providers covering 27 specialties. The results are broken down into primary care and non-primary care specialties. Consistent accuracy improvements are observed across both of these categories.

2012

Does more data always yield better translations?
Guillem Gascó | Martha-Alicia Rocha | Germán Sanchis-Trilles | Jesús Andrés-Ferrer | Francisco Casacuberta
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2010

This paper presents the submissions of the PRHLT group for the evaluation campaign of the International Workshop on Spoken Language Translation. We focus on the development of reliable translation systems between syntactically different languages (DIALOG task) and on the efficient training of SMT models in resource-rich scenarios (TALK task).
