Soroush Gooran
2024
SLPL SHROOM at SemEval2024 Task 06 : A comprehensive study on models ability to detect hallucination
Pouya Fallah
|
Soroush Gooran
|
Mohammad Jafarinasab
|
Pouya Sadeghi
|
Reza Farnia
|
Amirreza Tarabkhah
|
Zeinab Sadat Taghavi
|
Hossein Sameti
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Language models, particularly generative models, are susceptible to hallucinations, generating outputs that contradict factual knowledgeor the source text. This study explores methodsfor detecting hallucinations in three SemEval2024 Task 6 tasks: Machine Translation, Definition Modeling, and Paraphrase Generation.We evaluate two methods: semantic similaritybetween the generated text and factual references, and an ensemble of language modelsthat judge each other’s outputs. Our resultsshow that semantic similarity achieves moderate accuracy and correlation scores in trial data,while the ensemble method offers insights intothe complexities of hallucination detection butfalls short of expectations. This work highlights the challenges of hallucination detectionand underscores the need for further researchin this critical area.
2023
Ebhaam at SemEval-2023 Task 1: A CLIP-Based Approach for Comparing Cross-modality and Unimodality in Visual Word Sense Disambiguation
Zeinab Taghavi
|
Parsa Haghighi Naeini
|
Mohammad Ali Sadraei Javaheri
|
Soroush Gooran
|
Ehsaneddin Asgari
|
Hamid Reza Rabiee
|
Hossein Sameti
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This paper presents an approach to tackle the task of Visual Word Sense Disambiguation (Visual-WSD), which involves determining the most appropriate image to represent a given polysemous word in one of its particular senses. The proposed approach leverages the CLIP model, prompt engineering, and text-to-image models such as GLIDE and DALL-E 2 for both image retrieval and generation. To evaluate our approach, we participated in the SemEval 2023 shared task on “Visual Word Sense Disambiguation (Visual-WSD)” using a zero-shot learning setting, where we compared the accuracy of different combinations of tools, including “Simple prompt-based” methods and “Generated prompt-based” methods for prompt engineering using completion models, and text-to-image models for changing input modality from text to image. Moreover, we explored the benefits of cross-modality evaluation between text and candidate images using CLIP. Our experimental results demonstrate that the proposed approach reaches better results than cross-modality approaches, highlighting the potential of prompt engineering and text-to-image models to improve accuracy in Visual-WSD tasks. We assessed our approach in a zero-shot learning scenario and attained an accuracy of 68.75\% in our best attempt.