Dr. Hamada Ali Mohamed Ali Nayel :: Publications: |
| Title: | Arabic Regional Dialect Identification (ARDI) using Pair of Continuous Bag-of-Words and Data Augmentation |
| Authors: | Ahmed H AbuElAtta; Mahmoud Sobhy; Ahmed A El-Sawy; Hamada Nayel |
| Year: | 2023 |
| Keywords: | Author Profiling; NLP; Machine Learning; Arabic NLP |
| Journal: | International Journal of Advanced Computer Science and Applications |
| Volume: | 14 |
| Issue: | 11 |
| Pages: | 258-264 |
| Publisher: | The Science and Information Organization |
| Local/International: | International |
| Full paper | Hamada Ali Mohamed Ali Nayel_Arabic_Regional_Dialect_Identi.pdf |
| Supplementary materials | Not Available |
| Abstract: |
Author profiling is the process of identifying the characteristics that make up an author’s profile. This paper presents a machine learning-based author profiling model for Arabic users that treats the author’s regional dialect as a key characteristic. Several classification algorithms were implemented: decision tree, k-nearest neighbors (KNN), multilayer perceptron, random forest, and support vector machines. A pair of Continuous Bag-of-Words (CBOW) models was used for word representation. The proposed model was evaluated on a well-known data set, and a data augmentation process was applied to improve the quality of the training data. Support vector machines achieved an F1-score of 50.52%, outperforming the other models. |
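The abstract reports a single F1-score for a multi-class dialect task but does not say how the score was averaged across dialect classes. As a minimal sketch, assuming macro averaging (an assumption, not confirmed by the abstract), the metric could be computed as follows. The function name `macro_f1` and the dialect labels are hypothetical and used only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred.

    Assumption: macro averaging. The abstract does not state which
    averaging scheme produced the reported score.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    labels = set(y_true) | set(y_pred)
    if not labels:
        return 0.0
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Per-class F1 = 2TP / (2TP + FP + FN); defined as 0 when undefined.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only (not the paper's data).
gold = ["egypt", "gulf", "levant", "egypt", "gulf", "maghreb"]
pred = ["egypt", "gulf", "egypt", "egypt", "levant", "maghreb"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.4f}")
```

Macro averaging weights every dialect equally regardless of how many samples it has, so a rare dialect that is poorly classified pulls the score down as much as a frequent one. A weighted or micro average would give a different number on imbalanced data.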















