Fumiko Satoh


Content Aware Source Code Change Description Generation
Pablo Loyola | Edison Marrese-Taylor | Jorge Balazs | Yutaka Matsuo | Fumiko Satoh
Proceedings of the 11th International Conference on Natural Language Generation

We propose to study the generation of descriptions from source code changes by integrating the messages included in code commits with the intra-code documentation found in the source in the form of docstrings. Our hypothesis is that although the two types of description are not directly aligned in semantic terms (one explains a change, the other the actual functionality of the code being modified), there may be common ground that is useful for generation. To this end, we propose an architecture that uses the source code-docstring relationship to guide description generation. We discuss the results of the approach in comparison with a baseline based on a sequence-to-sequence model, using standard automatic natural language generation metrics as well as a human study, thus offering a comprehensive view of the feasibility of the approach.

Villani at SemEval-2018 Task 8: Semantic Extraction from Cybersecurity Reports using Representation Learning
Pablo Loyola | Kugamoorthy Gajananan | Yuji Watanabe | Fumiko Satoh
Proceedings of the 12th International Workshop on Semantic Evaluation

In this paper, we describe our proposal for the task of Semantic Extraction from Cybersecurity Reports. The goal is to explore whether natural language processing methods can provide relevant and actionable knowledge that contributes to a better understanding of malicious behavior. Our method consists of an attention-based Bi-LSTM, which achieved a competitive performance of 0.57 on Subtask 1. We also present ablation studies across multiple embeddings and their levels of representation, and report the strategies we used to mitigate the extreme imbalance between classes.