Hannah Bechara

Also published as: Hanna Béchara, Hanna Bechara, Hannah Béchara


2025

The Missing Cause: An Analysis of Causal Attributions in Reporting on Palestine
Paulina Garcia Corral | Hannah Bechara | Krishnamoorthy Manohara | Slava Jankin
Proceedings of the First International Workshop on Nakba Narratives as Language Resources

Missing cause bias is a specific type of bias in media reporting that consists of consistently omitting causal attributions for specific events, for example by omitting specific actors as the causes of incidents. Identifying these patterns in news outlets can help in assessing the level of bias present in media content. In this paper, we examine the prevalence of this bias in reporting on Palestine by identifying causal constructions in headlines. We compare headlines covering the Israel-Palestine conflict from three main news media outlets: CNN, the BBC, and Al Jazeera. We also collect data related to the Ukraine-Russia war and compare our findings against it to analyze editorial style within press organizations. We annotate a subset of this data and evaluate two causal language models (UniCausal and GPT-4o) on the identification and extraction of causal language in news headlines. Using the top-performing model, GPT-4o, we machine-annotate the full corpus and analyze the prevalence of missing cause bias within and across news organizations. Our findings reveal that BBC headlines tend to avoid directly attributing causality to Israel for the violence in Gaza, both in comparison with other news outlets and with the BBC's own reporting on other conflicts.
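
As a rough illustration of the machine annotation step described above, the sketch below prompts GPT-4o to decide whether a headline is causal and to extract cause and effect spans. The prompt wording, JSON output schema, and example headline are illustrative assumptions, not the authors' actual setup.

```python
# Minimal sketch (assumed prompt and schema, not the paper's pipeline):
# ask GPT-4o to label a headline as causal or not and extract cause/effect spans.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = (
    "Decide whether the news headline below expresses a causal relation. "
    'Reply with JSON: {"causal": true|false, "cause": "...", "effect": "..."}. '
    "Use empty strings for cause and effect if the headline is not causal.\n\n"
    "Headline: "
)

def annotate_headline(headline: str) -> dict:
    """Return {'causal': bool, 'cause': str, 'effect': str} for one headline."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": PROMPT + headline}],
        temperature=0,
        response_format={"type": "json_object"},  # forces valid JSON output
    )
    return json.loads(response.choices[0].message.content)

# Hypothetical headline, not drawn from the paper's corpus.
print(annotate_headline("Hospital evacuated after overnight strike"))
```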

2024

Creating and Evaluating a Multilingual Corpus of UN General Assembly Debates
Hannah Bechara | Krishnamoorthy Manohara | Slava Jankin
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)

This paper presents a multilingual aligned corpus of political debates from the United Nations (UN) General Assembly sessions between 1978 and 2021, covering five of the six official UN languages: Chinese, English, French, Russian, and Spanish. We explain the preprocessing steps we applied to the corpus. We align sentences by using word vectors to represent the meaning of each sentence numerically and then computing the Euclidean distance between them. To validate our alignment methods, we conducted an evaluation study with crowd-sourced human annotators using Scale AI, an online platform for data labelling. The final dataset consists of around 300,000 aligned sentences for En-Es, En-Fr, En-Zh, and En-Ru, and is publicly available for download.
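
The alignment idea described above (embed each sentence, then pair it with the nearest sentence on the other side by Euclidean distance) can be sketched as follows. The choice of the sentence-transformers library and this particular multilingual model is an assumption for illustration; the abstract itself only mentions word vectors.

```python
# Minimal sketch of nearest-neighbour sentence alignment by Euclidean distance.
# Library and model choice are assumptions; the paper's own embeddings may differ.
from scipy.spatial.distance import cdist
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def align(source_sents, target_sents):
    """Pair each source sentence with its nearest target sentence."""
    src_vecs = model.encode(source_sents)   # shape: (n_src, dim)
    tgt_vecs = model.encode(target_sents)   # shape: (n_tgt, dim)
    dists = cdist(src_vecs, tgt_vecs, metric="euclidean")
    nearest = dists.argmin(axis=1)          # index of closest target per source
    return [(i, int(j), float(dists[i, j])) for i, j in enumerate(nearest)]

# Toy example (not corpus data).
print(align(["The assembly adopted the resolution."],
            ["L'assemblée a adopté la résolution."]))
```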

PolitiCause: An Annotation Scheme and Corpus for Causality in Political Texts
Paulina Garcia Corral | Hanna Bechara | Ran Zhang | Slava Jankin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In this paper, we present PolitiCAUSE, a new corpus of political texts annotated for causality. We provide a detailed and robust annotation scheme covering two types of information: (1) whether a sentence contains a causal relation, and (2) the spans of text that correspond to the cause and effect components of that relation. We also provide statistics and analysis of the corpus and outline the difficulties and limitations of the task. Finally, we evaluate two transformer-based classification models on our dataset. The models achieve moderate performance, with an MCC score of 0.62. Our results show that PolitiCAUSE is a valuable resource for studying causality in texts, especially in the domain of political discourse, and that there is still room for improvement in developing more accurate and robust methods for this problem.
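
For reference, the MCC reported above can be computed directly from gold and predicted sentence labels; a toy example with scikit-learn (the label vectors are made up, not taken from the corpus):

```python
# Matthews correlation coefficient for binary causal / not-causal predictions.
# Toy labels only; 1 = sentence contains a causal relation.
from sklearn.metrics import matthews_corrcoef

gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(f"MCC = {matthews_corrcoef(gold, pred):.2f}")  # ranges from -1 to 1; 0 = chance level
```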

2016

WOLVESAAR at SemEval-2016 Task 1: Replicating the Success of Monolingual Word Alignment and Neural Embeddings for Semantic Textual Similarity
Hannah Bechara | Rohit Gupta | Liling Tan | Constantin Orăsan | Ruslan Mitkov | Josef van Genabith
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

Semantic Textual Similarity in Quality Estimation
Hanna Bechara | Carla Parra Escartin | Constantin Orasan | Lucia Specia
Proceedings of the 19th Annual Conference of the European Association for Machine Translation

2015

A Deeper Exploration of the Standard PB-SMT Approach to Text Simplification and its Evaluation
Sanja Štajner | Hannah Béchara | Horacio Saggion
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

MiniExperts: An SVM Approach for Measuring Semantic Textual Similarity
Hanna Béchara | Hernani Costa | Shiva Taslimipoor | Rohit Gupta | Constantin Orasan | Gloria Corpas Pastor | Ruslan Mitkov
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

Intelligent translation memory matching and retrieval metric exploiting linguistic technology
Rohit Gupta | Hanna Bechara | Constantin Orasan
Proceedings of Translating and the Computer 36

UoW: NLP techniques developed at the University of Wolverhampton for Semantic Similarity and Textual Entailment
Rohit Gupta | Hanna Béchara | Ismail El Maarouf | Constantin Orăsan
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2012

An Evaluation of Statistical Post-Editing Systems Applied to RBMT and SMT Systems
Hanna Béchara | Raphaël Rubino | Yifan He | Yanjun Ma | Josef van Genabith
Proceedings of COLING 2012

2011

Statistical Post-Editing for a Statistical MT System
Hanna Bechara | Yanjun Ma | Josef van Genabith
Proceedings of Machine Translation Summit XIII: Papers