Dr. Eslam Farid Elsayed Abu Shady :: Publications:

Title:
Comparative diagnostic accuracy of ChatGPT models in salivary gland disease: a multimodal vignette-based evaluation
Authors: Asmaa Abou-Bakr, Ahmed Adel Eissa, Basma Alshikh, Yousra Ahmed, Eslam Farid AbuShady, Melek Tassoker & Fatma E. A. Hassanein
Year: 2025
Keywords: Not Available
Journal: European Archives of Oto-Rhino-Laryngology
Volume: Not Available
Issue: Not Available
Pages: Not Available
Publisher: Springer Nature
Local/International: International
Paper Link:
Full paper Not Available
Supplementary materials Not Available
Abstract:

Background: This study evaluated the diagnostic accuracy and consistency of ChatGPT-4o in salivary gland disorders compared with experienced clinicians.
Methods: Eighty anonymized salivary gland cases from peer-reviewed reports were evaluated by ChatGPT-4o using standardized multimodal prompts and by three oral medicine specialists who provided Top-5 differentials. The primary outcome was diagnostic accuracy at the most likely diagnosis (Top-1), within the top three (Top-3), and within the top five (Top-5) differential diagnoses. Agreement was measured by Cohen’s kappa, with subgroup analyses by gland type, imaging, and case difficulty.
Results: ChatGPT showed perfect sensitivity (100%) at Top-3 and Top-5, and 86.67% at Top-1. Experts surpassed ChatGPT at Top-5 (77.5% vs. 67.5%, p 
