Eslam Farid Elsayed Abu Shady|Publications:Comparative diagnostic accuracy of ChatGPT models in salivary gland disease: a multimodal vignette-based evaluation

Dr. Eslam Farid Elsayed Abu Shady :: Publications:

Title:	Comparative diagnostic accuracy of ChatGPT models in salivary gland disease: a multimodal vignette-based evaluation
Authors:	Asmaa Abou-Bakr, Ahmed Adel Eissa, Basma Alshikh, Yousra Ahmed, Eslam Farid AbuShady, Melek Tassoker & Fatma E. A. Hassanein
Year:	2025
Keywords:	Not Available
Journal:	European Archives of Oto-Rhino-Laryngology
Volume:	Not Available
Issue:	Not Available
Pages:	Not Available
Publisher:	Springer Nature
Local/International:	International
Paper Link:	https://link.springer.com/article/10.1007/s00405-025-09925-5
Full paper:	Not Available
Supplementary materials:	Not Available

Abstract:

Background: This study evaluated the diagnostic accuracy and consistency of ChatGPT-4o in salivary gland disorders compared with experienced clinicians.

Methods: Eighty anonymized salivary gland cases from peer-reviewed reports were evaluated by ChatGPT-4o using standardized multimodal prompts and by three oral medicine specialists who provided Top-5 differentials. The primary outcome was diagnostic accuracy at the most likely diagnosis (Top-1), within the top three (Top-3), and within the top five (Top-5) differential diagnoses. Agreement was measured by Cohen’s kappa, with subgroup analyses by gland type, imaging, and case difficulty.

Results: ChatGPT showed perfect sensitivity (100%) at Top-3 and Top-5, and 86.67% at Top-1. Experts surpassed ChatGPT at Top-5 (77.5% vs. 67.5%, p
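
The Top-k accuracy and Cohen's kappa measures named in the Methods can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the case data, the expert flags, and the helper names `top_k_accuracy` and `cohens_kappa` are hypothetical, and applying kappa to per-case correct/incorrect flags is an assumption, since the abstract does not state which ratings were compared.

```python
from collections import Counter


def top_k_accuracy(ranked_lists, truths, k):
    """Fraction of cases whose reference diagnosis is among the first k differentials."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_lists, truths))
    return hits / len(truths)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters: (observed - expected) / (1 - expected)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy data, invented for illustration only (not from the study).
truths = ["pleomorphic adenoma", "sialolithiasis", "mucocele", "Warthin tumor"]
model_ranked = [
    ["pleomorphic adenoma", "Warthin tumor", "mucoepidermoid carcinoma"],
    ["sialadenitis", "sialolithiasis", "mucocele"],
    ["ranula", "lipoma", "mucocele"],
    ["pleomorphic adenoma", "lymphoma", "sialadenitis"],
]

top1 = top_k_accuracy(model_ranked, truths, 1)
top3 = top_k_accuracy(model_ranked, truths, 3)

# Per-case Top-1 correctness flags; the expert flags are hypothetical.
model_correct = [int(r[0] == t) for r, t in zip(model_ranked, truths)]
expert_correct = [1, 1, 0, 0]
kappa = cohens_kappa(model_correct, expert_correct)

print(f"Top-1: {top1:.2f}, Top-3: {top3:.2f}, kappa: {kappa:.2f}")
```

With this toy data, one of four cases is correct at Top-1 and three of four at Top-3, which shows why accuracy can only stay level or rise as k grows.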
