Alberto Díaz

Also published as: Alberto Diaz


2024

Automated Extraction of Prosodic Structure from Unannotated Sign Language Video
Antonio F. G. Sevilla | José María Lahoz-Bengoechea | Alberto Diaz
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

As in oral phonology, prosody is an important carrier of linguistic information in sign languages. One of the most prominent ways this reveals itself is in the time structure of signs: their rhythm and intensity of articulation. To observe these effects empirically, the velocity of the hands can be computed throughout the execution of a sign. In this article, we propose a method for extracting this information from unlabeled videos of sign language, exploiting CoTracker, a recent advancement in computer vision which can track every point in a video without the need for any calibration or fine-tuning. The dominant hand is identified via clustering of the computed point velocities, and its dynamic profile is plotted to make apparent the prosodic structure of signing. We apply our method to different datasets and sign languages, and perform a preliminary visual exploration of results. This exploration supports the usefulness of our methodology for linguistic analysis, though issues to be tackled remain, such as bi-manual signs and a formal and numerical evaluation of accuracy. Nonetheless, the absence of any preprocessing requirements may make it useful for other researchers and datasets.

2016

Improving Information Extraction from Wikipedia Texts using Basic English
Teresa Rodríguez-Ferreira | Adrián Rabadán | Raquel Hervás | Alberto Díaz
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The aim of this paper is to study the effect that the use of Basic English versus common English has on information extraction from online resources. The amount of online information available to the public grows exponentially, and is potentially an excellent resource for information extraction. The problem is that this information often comes in an unstructured format, such as plain text. In order to retrieve knowledge from this type of text, it must first be analysed to find the relevant details, and the nature of the language used can greatly impact the quality of the extracted information. In this paper, we compare triplets that represent definitions or properties of concepts obtained from three online collaborative resources (English Wikipedia, Simple English Wikipedia and Simple English Wiktionary) and study the differences in the results when Basic English is used instead of common English. The results show that resources written in Basic English produce fewer triplets, but of higher quality.

2013

NIL_UCM: Extracting Drug-Drug interactions from text through combination of sequence and tree kernels
Behrouz Bokharaeian | Alberto Díaz
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

UCM-I: A Rule-based Syntactic Approach for Resolving the Scope of Negation
Jorge Carrillo de Albornoz | Laura Plaza | Alberto Díaz | Miguel Ballesteros
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

UCM-2: a Rule-Based Approach to Infer the Scope of Negation via Dependency Parsing
Miguel Ballesteros | Alberto Díaz | Virginia Francisco | Pablo Gervás | Jorge Carrillo de Albornoz | Laura Plaza
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

2010

Development and Use of an Evaluation Collection for Personalisation of Digital Newspapers
Alberto Díaz | Pablo Gervás | Antonio García | Laura Plaza
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper presents the process of development and the characteristics of an evaluation collection for a personalisation system for digital newspapers. This system selects, adapts and presents contents according to a user model that defines information needs. The collection presented here contains data that are cross-related over four different axes: a set of news items from an electronic newspaper, collected into subsets corresponding to a particular sequence of days, packaged together and cross-indexed with a set of user profiles that represent the particular evolution of interests of a set of real users over the given days, expressed in each case according to four different representation frameworks: newspaper sections, Yahoo categories, keywords, and relevance feedback over the set of news items for the previous day. This information provides a minimum starting material over which one can evaluate for a given system how it addresses the first two observations - adapting to different users and adapting to particular users over time - provided that the particular system implements the representation of information needs according to the four frameworks employed in the collection. This collection has been successfully used to perform several experiments to determine the effectiveness of the personalisation system presented.

Improving Summarization of Biomedical Documents Using Word Sense Disambiguation
Laura Plaza | Mark Stevenson | Alberto Díaz
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

2008

Concept-Graph Based Biomedical Automatic Summarization Using Ontologies
Laura Plaza | Alberto Díaz | Pablo Gervás
Coling 2008: Proceedings of the 3rd Textgraphs workshop on Graph-based Algorithms for Natural Language Processing