Sergio José Rodríguez Méndez
NYCU
Also published as: Sergio J. Rodriguez Mendez, Sergio Rodríguez Méndez
2024
Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association
Tim Baldwin | Sergio José Rodríguez Méndez | Nicholas Kuo
Tiny But Mighty: A Crowdsourced Benchmark Dataset for Triple Extraction from Unstructured Text
Muhammad Salman | Armin Haller | Sergio J. Rodriguez Mendez | Usman Naseem
Proceedings of the 20th Joint ACL - ISO Workshop on Interoperable Semantic Annotation @ LREC-COLING 2024
In the context of Natural Language Processing (NLP) and Semantic Web applications, constructing Knowledge Graphs (KGs) from unstructured text plays a vital role. Several techniques have been developed for KG construction from text, but the lack of standardized datasets hinders the evaluation of triple extraction methods. The evaluation of existing KG construction approaches is based on structured data or manual investigations. To overcome this limitation, this work introduces a novel dataset specifically designed to evaluate KG construction techniques from unstructured text. Our dataset consists of a diverse collection of compound and complex sentences meticulously annotated by human annotators with potential triples (subject, verb, object). The annotations underwent further scrutiny by expert ontologists to ensure accuracy and consistency. For evaluation purposes, the proposed F-measure criterion offers a robust approach to quantify the relatedness and assess the alignment between extracted triples and the ground-truth triples, providing a valuable tool for evaluating the performance of triple extraction systems. By providing a diverse collection of high-quality triples, our proposed benchmark dataset offers a comprehensive training and evaluation set for refining the performance of state-of-the-art language models on a triple extraction task. Furthermore, this dataset encompasses various KG-related tasks, such as named entity recognition, relation extraction, and entity linking.
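The abstract's F-measure criterion scores the alignment between extracted triples and ground-truth triples. As a minimal illustration (assuming exact string matching; the paper's actual criterion also quantifies partial relatedness, which this sketch does not model):

```python
def triple_f1(predicted, gold):
    """Exact-match precision/recall/F1 between two collections of
    (subject, verb, object) triples.

    Illustrative sketch only: the benchmark's proposed criterion
    additionally measures semantic relatedness of near-miss triples,
    which a strict set intersection cannot capture.
    """
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)          # triples found in both sets
    precision = tp / len(predicted)     # fraction of extractions that are correct
    recall = tp / len(gold)             # fraction of gold triples recovered
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, extracting two triples of which one matches a single gold triple yields precision 0.5, recall 1.0, and F1 of 2/3.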
Zero- and Few-Shots Knowledge Graph Triplet Extraction with Large Language Models
Andrea Papaluca | Daniel Krefl | Sergio Rodríguez Méndez | Artem Lensky | Hanna Suominen
Proceedings of the 1st Workshop on Knowledge Graphs and Large Language Models (KaLLM 2024)
In this work, we tested the Triplet Extraction (TE) capabilities of a variety of Large Language Models (LLMs) of different sizes in the Zero- and Few-Shots settings. In detail, we proposed a pipeline that dynamically gathers contextual information from a Knowledge Base (KB), both in the form of context triplets and of (sentence, triplets) pairs as examples, and provides it to the LLM through a prompt. The additional context allowed the LLMs to be competitive with all the older fully trained baselines based on the Bidirectional Long Short-Term Memory (BiLSTM) Network architecture. We further conducted a detailed analysis of the quality of the gathered KB context, finding it to be strongly correlated with the final TE performance of the model. In contrast, the size of the model appeared to only logarithmically improve the TE capabilities of the LLMs. We release the code on GitHub for reproducibility.
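The pipeline described above provides the LLM with retrieved context triplets and (sentence, triplets) example pairs via a prompt. A hypothetical sketch of such prompt assembly (the function name, prompt wording, and formatting are assumptions; the paper's actual pipeline retrieves this context dynamically from a Knowledge Base):

```python
def build_te_prompt(sentence, examples, context_triplets):
    """Assemble a Zero-/Few-Shot triplet-extraction prompt.

    Hypothetical illustration: `examples` is a list of
    (sentence, [(s, r, o), ...]) pairs and `context_triplets` a list of
    (s, r, o) facts, both assumed to come from a KB retrieval step.
    """
    fmt = lambda ts: "; ".join(f"({s}, {r}, {o})" for s, r, o in ts)
    lines = ["Extract (subject, relation, object) triplets."]
    if context_triplets:
        # Context triplets ground the model in known KB facts.
        lines.append("Known facts: " + fmt(context_triplets))
    for ex_sentence, ex_triplets in examples:
        # Few-shot demonstrations; with no examples this is Zero-Shot.
        lines.append(f"Sentence: {ex_sentence}")
        lines.append("Triplets: " + fmt(ex_triplets))
    # The target sentence, left for the LLM to complete.
    lines.append(f"Sentence: {sentence}")
    lines.append("Triplets:")
    return "\n".join(lines)
```

The abstract's finding that retrieved-context quality correlates strongly with extraction performance suggests the retrieval step feeding `examples` and `context_triplets` matters more than model scale.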
2023
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Tuan Dung Nguyen | Yuan-Sen Ting | Ioana Ciuca | Charles O’Neill | Ze-Chang Sun | Maja Jabłońska | Sandor Kruk | Ernest Perkowski | Jack Miller | Jason Jingshi Li | Josh Peek | Kartheik Iyer | Tomasz Rozanski | Pranav Khetarpal | Sharaf Zaman | David Brodrick | Sergio J. Rodriguez Mendez | Thang Bui | Alyssa Goodman | Alberto Accomazzi | Jill Naiman | Jesse Cranney | Kevin Schawinski | Roberta Raileanu
Proceedings of the Second Workshop on Information Extraction from Scientific Publications
Co-authors
- Alberto Accomazzi 1
- Timothy Baldwin 1
- David Brodrick 1
- Thang Bui 1
- Ioana Ciuca 1
- Jesse Cranney 1
- Alyssa Goodman 1
- Armin Haller 1
- Kartheik Iyer 1
- Maja Jabłońska 1
- Pranav Khetarpal 1
- Daniel Krefl 1
- Sandor Kruk 1
- Nicholas Kuo 1
- Artem Lensky 1
- Jason Jingshi Li 1
- Jack Miller 1
- Jill Naiman 1
- Usman Naseem 1
- Tuan Dung Nguyen 1
- Charles O’Neill 1
- Andrea Papaluca 1
- Josh Peek 1
- Ernest Perkowski 1
- Roberta Raileanu 1
- Tomasz Rozanski 1
- Muhammad Salman 1
- Kevin Schawinski 1
- Ze-Chang Sun 1
- Hanna Suominen 1
- Yuan-Sen Ting 1
- Sharaf Zaman 1