2024
Impact of Syntactic Complexity on the Processes and Performance of Large Language Models-leveraged Post-editing
Longhui Zou | Michael Carl | Shaghayegh Momtaz | Mehdi Mirzapour
Proceedings of the 16th Conference of the Association for Machine Translation in the Americas (Volume 2: Presentations)
This research explores the interaction between human translators and Large Language Models (LLMs) during post-editing (PE). The study examines the impact of syntactic complexity on PE processes and performance, specifically when working with raw translation output generated by GPT-4. We selected four English source texts (STs) from previous American Translators Association (ATA) certification examinations. Each text consists of about 10 segments and 250 words. GPT-4 was employed to translate the four STs from English into Simplified Chinese. The empirical experiment simulated an authentic PE work environment using a professional computer-assisted translation (CAT) tool, Trados. The experiment involved 46 participants with different levels of translation expertise (30 student translators and 16 expert translators), who together produced 2162 post-edited segments. We implemented five syntactic complexity metrics in the context of PE for quantitative analysis.
2022
Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction
Mehdi Mirzapour | Waleed Ragheb | Mohammad Javad Saeedizade | Kevin Cousot | Helene Jacquenet | Lawrence Carbon | Mathieu Lafourcade
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Knowledge graph applications, in industry and academia, motivate substantial research on large-scale information extraction from various types of resources. Currently, most available knowledge graphs are either in English or multilingual. In this paper, we introduce RezoJDM16k, a French knowledge graph dataset based on RezoJDM. With 16k nodes, 832k triplets, and 53 relation types, RezoJDM16k can be employed in many downstream NLP tasks for the French language, such as machine translation, question answering, and recommendation systems. Moreover, we provide strong knowledge graph embedding baselines for link prediction to support future benchmarking. Compared to the state-of-the-art English knowledge graph datasets used in link prediction, RezoJDM16k shows similarly promising predictive behavior.
2017
Quantifier Scoping and Semantic Preferences
Davide Catta | Mehdi Mirzapour
Proceedings of the Computing Natural Language Inference Workshop
Finding Missing Categories in Incomplete Utterances
Mehdi Mirzapour
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. 19es REncontres jeunes Chercheurs en Informatique pour le TAL (RECITAL 2017)
This paper introduces an efficient algorithm (O(n^4)) for finding a missing category in an incomplete utterance, using unification as in the learning of categorial grammars and dynamic programming as in the Cocke–Younger–Kasami algorithm. Using the syntax/semantics interface of categorial grammar, this work can be used to derive possible semantic readings of an incomplete utterance. The paper illustrates the problem with running examples.