Assist. Ahmed Megahed :: Publications:

Title:
REGLAT at AbjadGenEval Shared Task: Multi-Model Ensemble Approach for Arabic AI-Generated Text Detection
Authors: Mariam Labib; Nsrin Ashraf; Ahmed M. Fetouh; Hamada Nayel
Year: 2026
Keywords: Generated Text Detection; NLP; Arabic
Journal: Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script
Volume: Not Available
Issue: Not Available
Pages: 493-496
Publisher: Association for Computational Linguistics
Local/International: International
Paper Link:
Full paper Ahmed Megahed_2026.abjadnlp-1.62.pdf
Supplementary materials Not Available
Abstract:

The rapid advancement of large language models necessitates robust methods for detecting AI-generated Arabic text. This paper presents our system for distinguishing human-written from machine-generated Arabic content. We propose a weighted ensemble combining AraBERTv2 and BERT-base-arabic, trained via 5-fold stratified cross-validation with class-balanced loss functions. Our methodology incorporates Arabic text normalization, strategic data augmentation using 16,678 samples from external scientific abstracts, and threshold optimization prioritizing recall. On the official test set, our system achieved an F1-score of 0.763, an accuracy of 0.695, a precision of 0.624, and a recall of 0.980, demonstrating strong detection of machine-generated texts with minimal false negatives at the cost of elevated false positives. Analysis reveals critical insights into precision-recall trade-offs and challenges in cross-domain generalization for Arabic AI text detection.
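The weighted ensemble and recall-oriented thresholding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the ensemble weight, the threshold grid, and the minimum-recall target are all hypothetical, and the two probability lists stand in for the outputs of the two fine-tuned models. The last helper recomputes F1 from the reported precision and recall, which agrees with the reported 0.763 up to rounding of the inputs.

```python
from typing import List, Sequence


def weighted_ensemble(probs_a: Sequence[float], probs_b: Sequence[float],
                      weight_a: float = 0.5) -> List[float]:
    """Blend per-sample 'machine-generated' probabilities from two models.

    weight_a is an assumed ensemble weight; the abstract does not state it.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same samples")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(probs_a, probs_b)]


def recall_at(probs: Sequence[float], labels: Sequence[int],
              threshold: float) -> float:
    """Recall for the positive (machine-generated) class at a threshold."""
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        return 0.0
    true_pos = sum(1 for p, y in zip(probs, labels)
                   if y == 1 and p >= threshold)
    return true_pos / positives


def pick_threshold(probs: Sequence[float], labels: Sequence[int],
                   min_recall: float = 0.95) -> float:
    """Pick the highest grid threshold that still reaches min_recall.

    A higher threshold yields fewer false positives, so this keeps
    precision as high as the recall constraint allows.
    """
    grid = [i / 100 for i in range(1, 100)]
    feasible = [t for t in grid if recall_at(probs, labels, t) >= min_recall]
    return max(feasible) if feasible else min(grid)


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy validation scores; label 1 = machine-generated, 0 = human-written.
    model_a = [0.9, 0.7, 0.5, 0.2]
    model_b = [0.8, 0.5, 0.3, 0.1]
    labels = [1, 1, 0, 0]
    blended = weighted_ensemble(model_a, model_b, weight_a=0.6)
    print(pick_threshold(blended, labels, min_recall=1.0))
    print(f1_from(0.624, 0.980))
```

Choosing the threshold on held-out validation data rather than on the test set keeps the reported test metrics honest. Lowering the threshold raises recall but admits more false positives, which matches the trade-off the abstract reports.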