Nguyen Chi Tran


2026

We present our approach to the AbjadNLP 2026 Arabic Authorship Identification shared task, achieving 4th place. Our key finding is that AraBERT-base (110M) outperforms AraBERT-large (340M) on the test set with macro F1 of 0.8449 versus 0.8096, despite lower validation scores. We handle long passages via sliding window chunking with mean pooling, and use a two-stage classification head with dual dropout for regularization. Per-class analysis reveals that translated works achieve perfect F1 while classical poets remain challenging due to shared formal structures. Our results challenge the "scale is all you need" assumption for stylometric tasks.
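The sliding-window chunking with mean pooling described above can be sketched as follows. The abstract does not state the window or stride sizes, so the 512-token window and 256-token stride below are assumptions, and the stub `encode` lambda merely stands in for a real AraBERT forward pass.

```python
import numpy as np

def sliding_windows(token_ids, window=512, stride=256):
    """Split a long token sequence into overlapping fixed-size windows."""
    if len(token_ids) <= window:
        return [token_ids]
    chunks, start = [], 0
    while start < len(token_ids):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break
        start += stride
    return chunks

def mean_pool(chunk_vectors):
    """Average per-chunk vectors into a single document representation."""
    return np.mean(np.stack(chunk_vectors), axis=0)

# Toy demo: a 1000-"token" passage; encode() is a stand-in for AraBERT.
ids = list(range(1000))
chunks = sliding_windows(ids)
encode = lambda c: np.array([np.mean(c), float(len(c))])  # hypothetical encoder
doc_vec = mean_pool([encode(c) for c in chunks])
```

The document vector would then feed the two-stage classification head; the chunk overlap keeps stylistic cues that straddle a window boundary visible to at least one chunk.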
Medical text classification is high-stakes work, yet models often falter precisely where they are needed most: on rare, critical conditions buried in the long tail of the data distribution. In the EACL 2026 ABJAD-NLP Shared Task, we confronted this challenge with a dataset of Arabic medical questions heavily skewed towards a few common topics, leaving dozens of categories with fewer than ten examples. We present HybridMed, a system that effectively tames this long tail by marrying the semantic generalization of a fine-tuned Arabic BERT model with the precise, instance-based memory of k-nearest neighbor retrieval. This complementary union allowed our system to achieve a macro-F1 score of 0.4902, demonstrating that for diverse and imbalanced medical data, the whole is indeed greater than the sum of its parts.
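One common way to marry a fine-tuned classifier with instance-based kNN memory, as HybridMed does, is to interpolate the classifier's softmax with a vote over the nearest stored training embeddings. The sketch below assumes cosine similarity, `k=3`, and an interpolation weight `alpha=0.5`; none of these values are given in the abstract.

```python
import numpy as np

def knn_probs(query, train_embs, train_labels, n_classes, k=3):
    """Class distribution from the k nearest stored training embeddings."""
    sims = train_embs @ query / (
        np.linalg.norm(train_embs, axis=1) * np.linalg.norm(query) + 1e-9)
    probs = np.zeros(n_classes)
    for i in np.argsort(-sims)[:k]:     # k most similar training instances
        probs[train_labels[i]] += 1.0
    return probs / probs.sum()

def hybrid_predict(model_probs, query, train_embs, train_labels, alpha=0.5):
    """Interpolate the fine-tuned model's softmax with the kNN vote."""
    knn = knn_probs(query, train_embs, train_labels, len(model_probs))
    return alpha * np.asarray(model_probs) + (1 - alpha) * knn

# Toy demo: two classes, three stored neighbours each.
rng = np.random.default_rng(0)
train_embs = np.vstack([rng.normal(1.0, 0.1, (3, 4)),
                        rng.normal(-1.0, 0.1, (3, 4))])
train_labels = [0, 0, 0, 1, 1, 1]
query = np.ones(4)                       # clearly in class 0's region
mixed = hybrid_predict([0.4, 0.6], query, train_embs, train_labels)
```

For tail classes with only a handful of examples, the kNN term can overturn an uncertain classifier decision, as in the demo where the neighbour vote flips the prediction to class 0.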
This paper describes the system developed by team HCMUS_The Fangs for the AbjadStyleTransfer shared task (ArabicNLP 2026), where we achieved 1st place. We present a contrastive style learning approach for zero-shot Arabic authorship style transfer. Our key discovery is that the 21 test authors, including Nobel laureate Naguib Mahfouz and literary pioneer Taha Hussein, have zero overlap with the 32,784 training authors, transforming this into a pure zero-shot challenge. This insight led us to develop a dual-encoder architecture that learns transferable style representations through contrastive objectives, rather than memorizing author-specific patterns. Our system achieves 19.77 BLEU and 55.74 chrF, outperforming retrieval-augmented generation (+18%) and multi-task learning (+31%). Counter-intuitively, we find that sophisticated architectural modifications like style injection consistently degrade performance, while simpler approaches that preserve pre-trained knowledge excel. Our analysis reveals that for famous authors, pre-trained Arabic language models already encode substantial stylistic knowledge: the key is surfacing it, not learning from scratch.
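A contrastive objective of the kind described, where same-author passage pairs are pulled together and other authors pushed apart, is typically an InfoNCE-style loss over the style embeddings. The sketch below is a minimal illustration, not the team's implementation; the temperature of 0.1 and the cosine scoring are assumptions.

```python
import numpy as np

def style_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE over style embeddings: the same-author pair is the positive."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / temperature
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # positive sits at index 0

# Same-author pair (nearly parallel) vs. other-author negatives (orthogonal).
a = np.array([1.0, 0.0, 0.0])
pos = np.array([0.9, 0.1, 0.0])
negs = [np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])]
low = style_nce_loss(a, pos, negs)
high = style_nce_loss(a, negs[0], [pos, negs[1]])  # wrong positive -> high loss
```

Because the loss depends only on similarity structure, not on author identity labels, the learned representations transfer to the 21 unseen test authors, which is exactly the zero-shot property the abstract argues for.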
The rapid advancement of large language models poses significant challenges for content authenticity, particularly in under-resourced languages where detection tools remain scarce. We present our winning system for the AbjadGenEval shared task on Arabic AI-generated text detection. Our key insight is that AI-generated text exhibits distinctive patterns across multiple linguistic levels, from local syntax to global semantics, that can be captured by learning to fuse representations from different transformer layers. We introduce a Weighted Layer Pooling mechanism that learns optimal layer combinations, combined with Attention Pooling for sequence-level context aggregation. Through systematic experimentation with 15+ approaches, we make a surprising discovery: model architecture selection dominates over sophisticated training techniques, with DeBERTa-v3 providing a +27% relative improvement over AraBERT regardless of training strategy. Our system achieves a 0.93 F1-score, securing 1st place among all participants and outperforming the runner-up by 3 absolute points.
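The two pooling mechanisms named above compose naturally: a learned softmax over layers fuses the per-layer hidden states, then learned attention over tokens collapses the fused sequence into one vector. The sketch below shows only the shapes and the math; the toy dimensions (4 layers, 6 tokens, width 8) and the randomly initialized attention query are placeholders, not the system's actual configuration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def weighted_layer_pooling(hidden_states, layer_logits):
    """Learned softmax weights over layers: (n_layers, seq, dim) -> (seq, dim)."""
    w = softmax(layer_logits)                    # one scalar weight per layer
    return np.tensordot(w, hidden_states, axes=1)

def attention_pooling(token_states, attn_query):
    """Learned attention over tokens: (seq, dim) -> (dim,) sequence vector."""
    weights = softmax(token_states @ attn_query)
    return weights @ token_states

# Toy shapes: 4 layers, 6 tokens, width 8 (real encoders: 12+ layers, 768 dims).
rng = np.random.default_rng(1)
hidden = rng.normal(size=(4, 6, 8))              # stacked layer outputs
fused = weighted_layer_pooling(hidden, np.array([0.1, 0.2, 0.3, 0.4]))
pooled = attention_pooling(fused, rng.normal(size=8))
```

In training, the layer logits and the attention query would be learned jointly with the classifier, letting the model discover which layers carry the syntax-level versus semantics-level signals the abstract describes.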