
Dr. Heba Mohamed Khalil Baioumy :: Publications:

Title:
AUTHORSHIP AUTHENTICATION OF POLITICAL ARABIC ARTICLES BASED ON MODIFIED TF-IGF ALGORITHM
Authors: HEBA M KHALIL, AHMED TAHA, TAREK EL-SHISTAWY
Year: 2020
Keywords: Not Available
Journal: Journal of Theoretical and Applied Information
Volume: Not Available
Issue: Not Available
Pages: Not Available
Publisher: Not Available
Local/International: International
Paper Link:
Full paper Not Available
Supplementary materials Not Available
Abstract:

Authorship forensic analysis of political articles has recently become very important. It is the process in which a linguist attempts to identify the author of an anonymous text based on the vocabulary used and the writer's linguistic style. Most existing studies of authorship forensic analysis focus on the English language, while research concerning the Arabic language is rare. In this research, we present a new methodology that enhances authorship forensic analysis for the Arabic language. The basic idea is to extract the unique vocabulary terms that identify an author (or a political group) and to use them to recognize unknown authors. In the current work, Term Frequency-Inverse Group Frequency (TF-IGF) is proposed, which is a modification of the traditional TF-IDF method. Our approach is tested on a large political dataset to determine the performance of an authorship forensic analysis method based on vocabulary words. The experimental results show that the average accuracy for recognizing groups increased from 89.33% when using TF-IDF to 92% with the proposed TF-IGF. Further improvement is achieved when representing the vocabulary terms in their Arabic lemma form rather than their root form. The results show that the accuracy improves from 89.33% to 92%.
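
The abstract does not give the TF-IGF formula, so the following is only a minimal sketch of one plausible reading: the inverse document frequency of TF-IDF is replaced by an inverse group frequency, computed over author groups instead of individual documents. The function names (tf_igf, classify), the log(N / gf) weighting, the sum-of-weights scoring rule, and the toy English tokens are all assumptions made for illustration, not the authors' published method. In practice the tokens would be Arabic lemmas produced by a separate preprocessing step.

```python
import math
from collections import Counter


def tf_igf(groups):
    """Compute hypothetical TF-IGF weights.

    groups: dict mapping a group name to a list of tokenized documents.
    Returns a dict mapping each group to a {term: weight} dict.
    """
    n_groups = len(groups)
    # Pool all documents of a group and count its terms.
    group_counts = {
        g: Counter(term for doc in docs for term in doc)
        for g, docs in groups.items()
    }
    # Group frequency: number of groups in which a term occurs at least once.
    gf = Counter()
    for counts in group_counts.values():
        gf.update(counts.keys())

    weights = {}
    for g, counts in group_counts.items():
        total = sum(counts.values())
        # Assumed weighting: relative term frequency times log(N / gf).
        # Terms shared by every group get weight 0.
        weights[g] = {
            term: (count / total) * math.log(n_groups / gf[term])
            for term, count in counts.items()
        }
    return weights


def classify(tokens, weights):
    """Assign a text to the group whose weighted vocabulary it matches best."""
    scores = {
        g: sum(w.get(term, 0.0) for term in tokens)
        for g, w in weights.items()
    }
    return max(scores, key=scores.get)


# Toy data with placeholder tokens (assumption: real input would be Arabic lemmas).
groups = {
    "group_a": [["economy", "reform", "vote"], ["economy", "tax"]],
    "group_b": [["security", "border", "vote"]],
}
weights = tf_igf(groups)
predicted = classify(["economy", "vote"], weights)
```

In this sketch a term such as "vote", which appears in every group, receives zero weight, while a term used by only one group is weighted by how often that group uses it. This mirrors the abstract's idea of extracting vocabulary unique to an author or political group.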
