Italo Luis Da Silva

Also published as: Italo Luis da Silva

2025

GraphMind: Interactive Novelty Assessment System for Accelerating Scientific Discovery
Italo Luis da Silva | Hanqi Yan | Lin Gui | Yulan He
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Large Language Models (LLMs) show strong reasoning and text generation capabilities, prompting their use in scientific literature analysis, including novelty assessment. While evaluating the novelty of scientific papers is crucial for peer review, it requires extensive knowledge of related work, something not all reviewers have. Recent work on LLM-assisted scientific literature analysis supports literature comparison, but existing approaches offer limited transparency and lack mechanisms for result traceability via an information retrieval module. To address this gap, we introduce GraphMind, an easy-to-use interactive web tool designed to assist users in evaluating the novelty of scientific papers or drafted ideas. Specifically, GraphMind enables users to annotate key elements and capture the main structure of a scientific paper, explore related papers through various relationships and perspectives, and assess novelty with verifiable contextual insights. This tool integrates external APIs such as arXiv and Semantic Scholar with LLMs to support annotation, extraction, retrieval and classification of papers. This combination provides users with a rich, structured view of a scientific idea’s core contributions and its connections to existing work. GraphMind is available at https://oyarsa.github.io/graphmind, and a demonstration video is available at https://youtu.be/wKbjQpSvwJg.

2024

Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems
Italo Luis Da Silva | Hanqi Yan | Lin Gui | Yulan He
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

The inherent ambiguity of cause and effect boundaries poses a challenge in evaluating causal event extraction tasks. Traditional metrics like Exact Match and BERTScore poorly reflect model performance, so we trained evaluation models to approximate human evaluation, achieving high agreement. We used these evaluators to perform Reinforcement Learning on extraction models, aligning the extraction models with human preference and prioritising semantic understanding. We successfully applied our approach to multiple datasets, including transferring an evaluator trained on one dataset to another as a way to decrease the reliance on human-annotated data. In that vein, we also propose a weak-to-strong supervision method that uses a fraction of the annotated data to train an evaluation model while still achieving high performance in training an RL model.

Venues

EMNLP (2)
