
Dr. Hamada Ali Mohamed Ali Nayel :: Publications:

Title:
Character N-gram model for toxicity prediction
Authors: Eman Shehab; Hamada Nayel; Mohamed Taha
Year: 2024
Keywords: Feature extraction; Machine learning; Molecular toxicity prediction; N-gram; Simplified molecular-input line-entry system
Journal: IAES International Journal of Artificial Intelligence (IJ-AI)
Volume: 13
Issue: 4
Pages: 4380-4387
Publisher: IAES
Local/International: International
Paper Link:
Full paper: Hamada Ali Mohamed Ali Nayel_document-1.pdf
Supplementary materials: Not Available
Abstract:

Molecular toxicity prediction is a crucial step in the drug discovery process, with a direct bearing on human health and medical outcomes. Accurately assessing a molecule’s toxicity helps weed out low-quality compounds early in the drug discovery phase, avoiding attrition later in the drug development process. Computational models have been used to automate molecular toxicity prediction. In this paper, a machine learning-based model is proposed. Character N-grams are extracted from simplified molecular-input line-entry system (SMILES) strings and represented with the TF-IDF scheme. Multiple machine learning classifiers have been implemented, including logistic regression (LR), support vector machine (SVM), random forest (RF), decision tree (DT), k-nearest neighbors (KNN), AdaBoost, multi-layer perceptron (MLP), and stochastic gradient descent (SGD) classifiers. A wide range of N-gram models has been evaluated, and the trigram model reported the best results. RF and SVM achieved 85% and 84% accuracy, respectively. Compared with state-of-the-art models, our results are acceptable given that we used minimal available resources.
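
The feature extraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `char_ngrams` and `tfidf_vectors` are hypothetical, the smoothed IDF formula is one common variant chosen here as an assumption, and the three SMILES strings are illustrative examples rather than the paper's data. The sketch splits each SMILES string into overlapping character trigrams and weights them by TF-IDF; the resulting vectors are the kind of input a classifier such as RF or SVM would then be trained on.

```python
import math
from collections import Counter


def char_ngrams(smiles, n=3):
    """Return all overlapping character n-grams of a SMILES string."""
    return [smiles[i:i + n] for i in range(len(smiles) - n + 1)]


def tfidf_vectors(smiles_list, n=3):
    """Map each SMILES string to a dict of n-gram -> TF-IDF weight.

    Assumption: term frequency is the n-gram's relative frequency in the
    molecule, and IDF uses the smoothed form log((1 + N) / (1 + df)) + 1.
    The paper does not state which TF-IDF variant was used.
    """
    counts = [Counter(char_ngrams(s, n)) for s in smiles_list]
    n_docs = len(smiles_list)
    # Document frequency: number of molecules that contain each n-gram.
    df = Counter(gram for c in counts for gram in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vec = {}
        for gram, k in c.items():
            tf = k / total
            idf = math.log((1 + n_docs) / (1 + df[gram])) + 1.0
            vec[gram] = tf * idf
        vectors.append(vec)
    return vectors


# Illustrative molecules (ethanol, acetic acid, benzene); not the paper's data.
molecules = ["CCO", "CC(=O)O", "c1ccccc1"]
vectors = tfidf_vectors(molecules, n=3)
```

Changing `n` gives the other N-gram models mentioned in the abstract (for example `n=2` for bigrams). A string shorter than `n` simply yields an empty feature dictionary.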
