2024
FactAlign: Fact-Level Hallucination Detection and Classification Through Knowledge Graph Alignment
Mohamed Rashad | Ahmed Zahran | Abanoub Amin | Amr Abdelaal | Mohamed Altantawy
Proceedings of the 4th Workshop on Trustworthy Natural Language Processing (TrustNLP 2024)
This paper proposes a novel black-box approach for fact-level hallucination detection and classification by transforming the problem into a knowledge graph alignment task. This approach allows us to classify detected hallucinations as either intrinsic or extrinsic. The paper starts by discussing the field of hallucination detection and reviewing related approaches. Then, we introduce the proposed FactAlign approach for hallucination detection and discuss how it can be used to classify hallucinations as either intrinsic or extrinsic. Experiments evaluate the proposed method against state-of-the-art methods on the hallucination detection task using the WikiBio GPT-3 hallucination dataset, and on the hallucination type classification task using the XSum hallucination annotations dataset. The experimental results show that our method achieves an F1 score of 0.889 on hallucination detection and 0.825 on hallucination type classification, without any further training, fine-tuning, or producing multiple samples of the LLM response.
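The abstract does not spell out the alignment procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that facts have already been extracted as (subject, relation, object) triples and that alignment is exact string matching. The function name `classify_triples` and the label strings are hypothetical. A generated triple counts as intrinsic when the source states a different object for the same subject and relation, and as extrinsic when the source says nothing about that subject and relation.

```python
from typing import Iterable

# Assumption: a fact is a (subject, relation, object) triple of plain strings.
Triple = tuple[str, str, str]


def classify_triples(
    source: Iterable[Triple], generated: Iterable[Triple]
) -> dict[Triple, str]:
    """Label each generated triple as supported, intrinsic, or extrinsic.

    Hypothetical simplification: alignment is exact string equality.
    """
    source_set = set(source)

    # Index the objects the source asserts for each (subject, relation) pair.
    objects_by_key: dict[tuple[str, str], set[str]] = {}
    for subj, rel, obj in source_set:
        objects_by_key.setdefault((subj, rel), set()).add(obj)

    labels: dict[Triple, str] = {}
    for triple in generated:
        subj, rel, _obj = triple
        if triple in source_set:
            labels[triple] = "supported"
        elif (subj, rel) in objects_by_key:
            # The source covers this subject and relation with another object.
            labels[triple] = "intrinsic"
        else:
            # The source is silent, so the claim cannot be verified from it.
            labels[triple] = "extrinsic"
    return labels


source_facts = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "field", "physics"),
]
generated_facts = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "born_in", "Paris"),
    ("Marie Curie", "spouse", "Pierre Curie"),
]
labels = classify_triples(source_facts, generated_facts)
```

A real system would need fuzzy entity and relation matching rather than string equality, and would have to handle relations that legitimately take several objects.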
2019
Summarizing Relationships for Interactive Concept Map Browsers
Abram Handler | Premkumar Ganeshkumar | Brendan O’Connor | Mohamed AlTantawy
Proceedings of the 2nd Workshop on New Frontiers in Summarization
Concept maps are visual summaries, structured as directed graphs: important concepts from a dataset are displayed as vertexes, and edges between vertexes show natural language descriptions of the relationships between the concepts on the map. Thus far, preliminary attempts at automatically creating concept maps have focused on building static summaries. However, in interactive settings, users will need to dynamically investigate particular relationships between pairs of concepts. For instance, a historian using a concept map browser might decide to investigate the relationship between two politicians in a news archive. We present a model which responds to such queries by returning one or more short, importance-ranked, natural language descriptions of the relationship between two requested concepts, for display in a visual interface. Our model is trained on a new public dataset, collected for this task.
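As a rough illustration of the data structure and query described above, the sketch below stores a concept map as a directed graph whose edges carry natural language descriptions, and answers a query for a pair of concepts with the top-ranked descriptions. The class `ConceptMap`, its methods, and the numeric importance scores are assumptions made for illustration. In the paper the ranking comes from a trained model, which is not reproduced here.

```python
from collections import defaultdict


class ConceptMap:
    """Directed graph: concepts are vertices, edges hold text descriptions."""

    def __init__(self) -> None:
        # Maps (source concept, target concept) to (importance, description) pairs.
        self._edges: dict[tuple[str, str], list[tuple[float, str]]] = defaultdict(list)

    def add_relationship(
        self, source: str, target: str, description: str, importance: float
    ) -> None:
        # Assumption: importance is supplied by some upstream scoring model.
        self._edges[(source, target)].append((importance, description))

    def query(self, source: str, target: str, k: int = 1) -> list[str]:
        """Return up to k descriptions for the pair, most important first."""
        candidates = self._edges.get((source, target), [])
        ranked = sorted(candidates, key=lambda pair: pair[0], reverse=True)
        return [description for _, description in ranked[:k]]


concept_map = ConceptMap()
concept_map.add_relationship("Senator A", "Senator B", "co-sponsored a bill with", 0.9)
concept_map.add_relationship("Senator A", "Senator B", "debated", 0.4)
top_descriptions = concept_map.query("Senator A", "Senator B", k=2)
```

Because the graph is directed, querying the reversed pair returns an empty list unless a relationship was added in that direction.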
2014
Using Simple NLP Tools to Trace the Globalization of the Art World
Mohamed AlTantawy | Alix Rule | Owen Rambow | Zhongyu Wang | Rupayan Basu
Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science
2013
DIRA: Dialectal Arabic Information Retrieval Assistant
Arfath Pasha | Mohammad Al-Badrashiny | Mohamed Altantawy | Nizar Habash | Manoj Pooleery | Owen Rambow | Ryan M. Roth | Mona Diab
The Companion Volume of the Proceedings of IJCNLP 2013: System Demonstrations
2011
Fast Yet Rich Morphological Analysis
Mohamed Altantawy | Nizar Habash | Owen Rambow
Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing
2010
Morphological Analysis and Generation of Arabic Nouns: A Morphemic Functional Approach
Mohamed Altantawy | Nizar Habash | Owen Rambow | Ibrahim Saleh
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
MAGEAD is a morphological analyzer and generator for Modern Standard Arabic (MSA) and its dialects. We introduced MAGEAD in previous work with an implementation of MSA and Levantine Arabic verbs. In this paper, we port that system to MSA nominals (nouns and adjectives), which are far more complex to model than verbs. Our system is a functional morphological analyzer and generator, i.e., it analyzes to and generates from a representation consisting of a lexeme and linguistic feature-value pairs, where the features are syntactically (and perhaps semantically) meaningful, rather than just morphologically. A detailed evaluation of the current implementation comparing it to a commonly used morphological analyzer shows that it has good morphological coverage with precision and recall scores in the 90s. An error analysis reveals that the majority of recall and precision errors are problems in the gold standard or a result of the discrepancy between different models of form-based/functional morphology.