José Ochoa-Luna

Also published as: Jose Ochoa-Luna


2020

Paraphrase Generation via Adversarial Penalizations
Gerson Vizcarra | Jose Ochoa-Luna
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)

Paraphrase generation is an important problem in Natural Language Processing that has recently been addressed with neural network-based approaches. This paper presents an adversarial framework for paraphrase generation in English. Unlike previous methods, we use the discriminator output as a penalization instead of using policy gradients, and we propose a global discriminator to avoid the Monte-Carlo search. In addition, this work uses and compares different settings of input representation. We compare our methods to several baselines on the Quora question pairs dataset. The results show that our framework is competitive with previous benchmarks.

Palomino-Ochoa at SemEval-2020 Task 9: Robust System Based on Transformer for Code-Mixed Sentiment Classification
Daniel Palomino | José Ochoa-Luna
Proceedings of the Fourteenth Workshop on Semantic Evaluation

We present a transfer learning system for sentiment classification of code-mixed Spanish-English text. Our proposal embeds the state-of-the-art language model BERT within a ULMFiT transfer learning pipeline. This combination allows us to predict the polarity of code-mixed (English-Spanish) tweets. Among 29 submitted systems, our approach (referred to as dplominop) is ranked 4th on the Sentimix Spanglish test set of SemEval 2020 Task 9. Our system yields a weighted-F1 score of 0.755, which can be easily reproduced: the source code and implementation details are made available.

A Corpus for Outbreak Detection of Diseases Prevalent in Latin America
Antonella Dellanzo | Viviana Cotik | Jose Ochoa-Luna
Proceedings of the 24th Conference on Computational Natural Language Learning

In this paper we present an annotated corpus that can be used for training and testing algorithms to automatically extract information about disease outbreaks from news and health reports. We also propose initial approaches to extract information from it. The corpus has been constructed with two main tasks in mind. The first is to extract entities about outbreaks, such as disease, host, and location, among others. The second is to retrieve relations among entities, for instance, that fifteen cases of a given disease were reported in a given geographic location. Overall, our goal is to offer resources and tools for automated analysis that support early detection of disease outbreaks and thereby reduce their spread.