Nikolay Karpov


2026

This paper describes the NVIDIA NeMo team’s submission to the IWSLT 2026 Simultaneous Speech Translation (SimulST) tracks. We use a cascaded architecture combining a dual-mode Unified ASR Transducer model with a multilingual Large Language Model (LLM). The ASR is trained to deliver stable transcriptions across a wide range of latencies, providing a reliable foundation for high-quality LLM translation. We participate in the English–German, English–Italian, and English–Chinese tasks, in both standard and contextualized settings, as well as the Czech–English standard track, covering both low- and high-latency scenarios. We further analyze how ASR and LLM design choices affect the system’s overall latency and translation quality.

2017

In many areas, such as social science, politics, or market research, people need to deal with datasets that shift over time. Distribution drift commonly appears in sentiment analysis, where the proportions of instances in each class change over time. In this case, the task is to correctly estimate the proportion of each sentiment expressed in a set of documents (the quantification task). Our study analyzes the effectiveness of combining quantification techniques with a deep learning architecture. All techniques are evaluated on the SemEval-2017 Task 4 dataset, and the source code mentioned in this paper is written in Python and available online. The results of applying the quantification techniques are discussed.

2016