Comparable Melanoma Detection for AI vs Dermatologists, Systematic Review Shows

April 13, 2026 By Wendy LaGrego 4 min read

AI systems perform comparably to dermatologists for detecting melanoma, and the performance of dermatologists may be boosted by AI decision support, according to the results of a systematic review and meta-analysis published in JAMA Dermatology.

The study was undertaken to address a critical gap in the literature: while prior retrospective studies have suggested that AI can match or exceed dermatologist performance, prospective evidence, which is more reflective of clinical practice, has been limited. By focusing exclusively on prospective studies, the study authors, including corresponding author Titus J. Brinker, MD, of the Division of Digital Prevention, Diagnostics and Therapy Guidance, German Cancer Research Center in Heidelberg, Germany, sought to determine whether AI is ready for routine clinical use and whether it can meaningfully augment clinician performance in melanoma diagnosis.

Study Details

The analysis included 11 prospective studies comprising more than 2,500 patients and over 50 dermatologists, identified through a systematic literature review of PubMed, Google Scholar, Embase, and Web of Science for studies published between January 1, 2000, and July 9, 2025. Eligible studies evaluated adult patients with suspected melanoma using dermoscopic images, with histopathology as the reference standard. The investigators compared three diagnostic approaches: dermatologists alone, AI alone (primarily convolutional neural network–based systems), and dermatologists assisted by AI.

Data were systematically extracted and pooled for sensitivity, specificity, accuracy, and balanced accuracy. Studies were required to include at least 20 histopathologically confirmed melanoma cases and to use prospective designs, excluding retrospective datasets and nondermoscopic imaging modalities.
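For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, accuracy, and balanced accuracy are conventionally derived from confusion-matrix counts. The counts used here are purely illustrative and are not drawn from the study's data.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int):
    """Standard diagnostic accuracy metrics from confusion-matrix counts.

    tp/fn: melanomas correctly flagged / missed
    tn/fp: benign lesions correctly cleared / incorrectly flagged
    """
    sensitivity = tp / (tp + fn)                # true-positive rate
    specificity = tn / (tn + fp)                # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # overall fraction correct
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, accuracy, balanced_accuracy

# Hypothetical counts for illustration only (not from the study):
sens, spec, acc, bal = diagnostic_metrics(tp=79, fn=21, tn=75, fp=25)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} "
      f"accuracy={acc:.1%} balanced accuracy={bal:.1%}")
```

Balanced accuracy averages sensitivity and specificity, which keeps the metric meaningful when melanoma cases are far rarer than benign lesions.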

Risk of bias was assessed using QUADAS-2 and QUADAS-C tools, revealing frequent concerns related to patient selection and study design, particularly the preselection of lesions suspicious for melanoma and the use of simplified binary classification systems.

Key Results

Across studies, dermatologists achieved a pooled sensitivity of 78.6% (95% confidence interval [CI] = 67.5%–88.1%) and specificity of 75.2% (95% CI = 63.3%–84.3%). AI systems demonstrated comparable performance, with sensitivity of 80.9% (95% CI = 63.6%–94.5%) and specificity of 75.6% (95% CI = 64.5%–85.6%).

The study authors suggested that any performance differences may reflect a cautious approach among dermatologists, who are more likely to recommend a biopsy when a lesion is uncertain. AI support, they noted, could reduce the number of unnecessary biopsies performed.

In the single study evaluating AI-assisted dermatologists, performance improved further, indicating a sensitivity of 91.9% and specificity of 83.7%. Head-to-head comparisons suggested that AI may offer higher specificity with similar sensitivity, potentially reducing unnecessary biopsies. However, variability across studies was substantial, and many designs introduced bias—most notably through restricted patient populations and binary diagnostic frameworks that do not reflect real-world clinical complexity.

According to the study authors, the findings indicate that AI can achieve dermatologist-level diagnostic accuracy in prospective settings and may enhance performance when integrated into clinical workflows, but current evidence remains preliminary.

“This systematic review and meta-analysis found prospective evidence indicating that AI achieves dermatologist-level performance for dermoscopic melanoma diagnosis, with no significant differences in pooled sensitivity or specificity. This finding is encouraging for clinical translation,” the study authors concluded in their report. “Yet, the diversity of study designs, risks of bias, and the limited number of high-quality prospective datasets highlight that AI is still in the early phase of clinical validation. Larger, multicenter, and methodologically rigorous prospective studies with unselected, real-world patient populations will be essential to determine the reliability, safety, and added value of AI in routine clinical practice.”

DISCLOSURES: The study was funded by the Ministry of Health, Social Affairs and Integration Baden-Württemberg, Stuttgart, Germany. Dr. Haggenmüller reported holding a position in research and development at HEINE Optotechnik GmbH & Co outside the submitted work. Dr. Brinker reported ownership of a company that develops mobile apps (Smart Health Heidelberg GmbH, Heidelberg, Germany) and receiving honoraria from Novartis, Roche, HEINE Optotechnik, and Merck outside the submitted work. No other disclosures were reported.

ASCO AI in Oncology is published by Conexiant under a license arrangement with the American Society of Clinical Oncology, Inc. (ASCO®). The ideas and opinions expressed in ASCO AI in Oncology do not necessarily reflect those of Conexiant or ASCO. For more information, see Policies.
