Marco De Gemmis

Also published as: Marco de Gemmis


2024

Unraveling the Enigma of SPLIT in Large-Language Models: The Unforeseen Impact of System Prompts on LLMs with Dissociative Identity Disorder
Marco Polignano | Marco De Gemmis | Giovanni Semeraro
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)

Our work delves into the unexplored territory of Large-Language Models (LLMs) and their interactions with System Prompts, unveiling the previously undiscovered implications of SPLIT (System Prompt Induced Linguistic Transmutation) in commonly used state-of-the-art LLMs. Dissociative Identity Disorder, a complex and multifaceted mental health condition, is characterized by the presence of two or more distinct identities or personas within an individual, often with varying levels of awareness and control. The advent of large-language models has raised intriguing questions about the presence of such conditions in LLMs. Our research investigates the phenomenon of SPLIT, in which the System Prompt, a seemingly innocuous input, profoundly impacts the linguistic outputs of LLMs. The findings of our study reveal a striking correlation between the System Prompt and the emergence of distinct, persona-like linguistic patterns in the LLM’s responses. These patterns are not only reminiscent of the dissociative identities present in the original data but also exhibit a level of coherence and consistency that is uncommon in typical LLM outputs. As we continue to explore the capabilities of LLMs, it is imperative that we maintain a keen awareness of the potential for SPLIT and its significant implications for the development of more human-like and empathetic AI systems.

2021

Extracting Relations from Italian Wikipedia using Self-Training
Lucia Siciliani | Pierluigi Cassotti | Pierpaolo Basile | Marco de Gemmis | Pasquale Lops | Giovanni Semeraro
Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021)

Emerging Trends in Gender-Specific Occupational Titles in Italian Newspapers
Pierluigi Cassotti | Andrea Iovine | Pierpaolo Basile | Marco De Gemmis | Giovanni Semeraro
Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021)

2020

Exploiting Distributional Semantics Models for Natural Language Context-aware Justifications for Recommender Systems
Giuseppe Spillo | Cataldo Musto | Marco de Gemmis | Pasquale Lops | Giovanni Semeraro
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

Analysis of Lexical Semantic Changes in Corpora with the Diachronic Engine
Pierluigi Cassotti | Pierpaolo Basile | Marco de Gemmis | Giovanni Semeraro
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

A Deep Learning Model for the Analysis of Medical Reports in ICD-10 Clinical Coding Task
Marco Polignano | Pierpaolo Basile | Marco de Gemmis | Pasquale Lops | Giovanni Semeraro
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

2019

AlBERTo: Italian BERT Language Understanding Model for NLP Challenging Tasks Based on Tweets
Marco Polignano | Pierpaolo Basile | Marco de Gemmis | Giovanni Semeraro | Valerio Basile
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)

A Dataset of Real Dialogues for Conversational Recommender Systems
Andrea Iovine | Fedelucio Narducci | Marco de Gemmis
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)

SWAP at SemEval-2019 Task 3: Emotion detection in conversations through Tweets, CNN and LSTM deep neural networks
Marco Polignano | Marco de Gemmis | Giovanni Semeraro
Proceedings of the 13th International Workshop on Semantic Evaluation

Emotion detection from user-generated content is growing in importance in natural language processing. The approach we proposed for the EmoContext task combines a CNN and an LSTM over a concatenation of word embeddings. A stack of convolutional neural networks (CNN) captures the hierarchical hidden relations among embedding features, while a long short-term memory network (LSTM) captures information shared among the words of the sentence. Each conversation is formalized as a list of word embeddings; in particular, pre-trained GloVe and Google word embeddings were evaluated during the experimental runs. Surface lexical features were also considered, but they proved not to be useful for classification in this specific task. The final system configuration achieved a micro F1 score of 0.7089. The Python code of the system is fully available at https://github.com/marcopoli/EmoContext2019
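
For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of how a micro-averaged F1 score can be computed. It is not taken from the authors' repository. The function name `micro_f1`, the label names, and the choice to score only a subset of classes (here, the emotion labels and not a catch-all `others` class) are assumptions made for illustration.

```python
def micro_f1(gold, pred, labels):
    """Micro-averaged F1 restricted to `labels` (illustrative helper)."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p in labels and p == g:
            tp += 1              # correct prediction of a scored class
        else:
            if p in labels:
                fp += 1          # predicted a scored class wrongly
            if g in labels:
                fn += 1          # missed a scored class
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not results from the paper.
EMOTIONS = {"happy", "sad", "angry"}
gold = ["happy", "sad", "others", "angry", "others"]
pred = ["happy", "others", "sad", "angry", "others"]
score = micro_f1(gold, pred, EMOTIONS)
print(round(score, 4))
```

Micro-averaging pools true positives, false positives, and false negatives across all scored classes before computing precision and recall, so frequent classes weigh more than rare ones.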

2008

Combining Knowledge-based Methods and Supervised Learning for Effective Italian Word Sense Disambiguation
Pierpaolo Basile | Marco de Gemmis | Pasquale Lops | Giovanni Semeraro
Semantics in Text Processing. STEP 2008 Conference Proceedings

2007

UNIBA: JIGSAW algorithm for Word Sense Disambiguation
Pierpaolo Basile | Marco de Gemmis | Anna Lisa Gentile | Pasquale Lops | Giovanni Semeraro
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)