Alicia Picazo-Izquierdo


2026

Translation Memory (TM) systems are core components of commercial computer-aided translation (CAT) tools. However, traditional fuzzy matching methods often fail to retrieve semantically relevant content when surface similarity is low. We introduce SmartMatch, an open-source interactive demo and evaluation toolkit for TM retrieval that connects modern sentence encoders (including LLM-derived representations) and strong lexical/fuzzy baselines with a vector database, and exposes the end-to-end retrieval pipeline through a web-based UI for qualitative inspection and preference logging. The demo allows users to (i) enter a query segment, (ii) switch retrieval backends and embedding models, (iii) inspect top-k retrieved matches with similarity scores and qualitative cues, and (iv) observe end-to-end latency in real time. We provide a reproducible benchmark on multilingual TM data, reporting retrieval quality using reference-based MT metrics (COMET, BERTScore, METEOR, chrF) together with coverage and latency/throughput trade-offs relevant to real-time CAT workflows. On DGT-TM, encoder-based retrieval achieves full coverage (100%) with millisecond-level latency (p50/p90 6–20 ms) and attains the strongest semantic-quality scores on the shared query set (e.g., BERTScore up to 0.91 at k=10), while BM25 remains a strong lightweight lexical baseline with very low latency. SmartMatch targets CAT researchers and tool builders and bridges recent advances in sentence encoders with the real-time constraints of translation memory retrieval.
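The lexical baseline mentioned above can be illustrated with a minimal from-scratch Okapi BM25 ranker over TM segments. This is a toy sketch, not the SmartMatch implementation; the example segments, query, and parameter choices (k1=1.5, b=0.75) are invented for illustration:

```python
import math
import re

def tokenize(text):
    """Lowercase a segment and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def bm25_rank(query, segments, k1=1.5, b=0.75):
    """Rank TM segments against a query with Okapi BM25 scoring.

    Returns segment indices sorted from best to worst match.
    """
    docs = [tokenize(s) for s in segments]
    avgdl = sum(len(d) for d in docs) / len(docs)
    n = len(docs)
    # Document frequency of each term, used for IDF.
    df = {}
    for d in docs:
        for t in set(d):
            df[t] = df.get(t, 0) + 1
    scores = []
    for i, d in enumerate(docs):
        score = 0.0
        for t in tokenize(query):
            f = d.count(t)  # term frequency in this segment
            if f == 0:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            # Length-normalized BM25 term contribution.
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(d) / avgdl))
        scores.append((score, i))
    return [i for _, i in sorted(scores, reverse=True)]

# Invented toy translation memory (source segments only).
tm = [
    "The committee approved the annual report.",
    "Click the button to save your changes.",
    "The annual report was approved by the board.",
]
print(bm25_rank("the annual report", tm))  # best lexical match first
```

As the abstract notes, such a purely lexical scorer is fast and strong on surface overlap, but it cannot retrieve paraphrases with little word overlap; that gap is what the encoder-based backends target.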

2025

The growing use of generative AI in academic writing raises urgent questions about authorship and the integrity of scientific communication. This study addresses the detection of AI-generated scientific abstracts by constructing a temporally anchored dataset of paired abstracts: each pair consists of a human-written abstract from work published before 2021 and a synthetic counterpart generated with GPT-4.1. We evaluate three approaches to authorship classification: zero-shot large language models (LLMs), fine-tuned encoder-based transformers, and traditional machine learning classifiers. Results show that LLMs perform near chance, while a LoRA-fine-tuned DistilBERT and a PassiveAggressive classifier achieve near-perfect performance. These findings suggest that shallow lexical and stylistic patterns still differentiate human and AI writing, and that supervised learning is key to capturing these signals.
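The PassiveAggressive result rests on a simple online linear model: on each misranked example the classic passive-aggressive rule makes the smallest weight update that restores a margin of 1. A minimal sketch over bag-of-words counts follows; the toy "human" and "AI" abstracts and their separating vocabulary are invented for illustration, not drawn from the paper's dataset:

```python
import re
from collections import Counter

def bow(text):
    """Bag-of-words counts over lowercase word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

class PassiveAggressive:
    """Online passive-aggressive binary classifier (labels +1 / -1)."""

    def __init__(self):
        self.w = {}  # sparse weight vector: token -> weight

    def _dot(self, x):
        return sum(self.w.get(t, 0.0) * c for t, c in x.items())

    def fit(self, texts, labels, epochs=3):
        for _ in range(epochs):
            for text, y in zip(texts, labels):
                x = bow(text)
                loss = max(0.0, 1.0 - y * self._dot(x))
                if loss > 0.0:
                    # Aggressive step: smallest update achieving margin 1.
                    tau = loss / sum(c * c for c in x.values())
                    for t, c in x.items():
                        self.w[t] = self.w.get(t, 0.0) + tau * y * c
        return self

    def predict(self, text):
        return 1 if self._dot(bow(text)) >= 0 else -1

# Invented toy data: +1 = human-written style, -1 = AI-generated style.
train = [
    ("results suggest a surprising anomaly", 1),
    ("data reveal a surprising anomaly", 1),
    ("this paper delves into comprehensive insights", -1),
    ("we delve into comprehensive frameworks", -1),
]
clf = PassiveAggressive().fit([t for t, _ in train], [y for _, y in train])
print(clf.predict("a surprising anomaly appears"))  # 1 on this toy data
```

The linearity of the model is the point: if such a classifier separates human from AI abstracts near-perfectly, shallow lexical weights alone carry the signal, consistent with the abstract's conclusion.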