Winkler 2023 — Dermatologists cooperating with a CNN: prospective clinical study
Citation
Winkler JK, Blum A, Kommoss K, Enk A, Toberer F, Rosenberger A, Haenssle HA. Assessment of Diagnostic Performance of Dermatologists Cooperating With a Convolutional Neural Network in a Prospective Clinical Study: Human With Machine. JAMA Dermatol. 2023 Jun 1;159(6):621–627. DOI: 10.1001/jamadermatol.2023.0905. PMID 37133847.
Study design and population
Prospective two-centre clinical study. 22 dermatologists evaluated 228 suspect melanocytic lesions with and without market-approved CNN support (Moleanalyzer Pro, FotoFinder). Histopathological reference available for 54.8 % of lesions.
Reported metrics
- Sensitivity: dermatologists alone 84.2 % (95 % CI 69.6–92.6) → with CNN support 100.0 % (95 % CI 90.8–100.0); p = 0.03
- Specificity: 72.1 % → 83.7 %; p < 0.001
- ROC AUC: 0.895 (95 % CI 0.836–0.954) → 0.968 (95 % CI 0.948–0.988); p = 0.005
- CNN guidance reduced unnecessary excision of benign nevi by 19.2 %
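The sensitivity intervals above can be sanity-checked with a short calculation. The sketch below uses the Wilson score interval and assumes 38 histologically confirmed melanomas, 32 of them detected without CNN support. Neither count is stated in this note: both are inferred from the 84.2 % figure and should be checked against the paper. Whether the authors used the Wilson method is likewise an assumption.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (no continuity correction)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at p = 0 or p = 1.
    return max(0.0, centre - half), min(1.0, centre + half)


# ASSUMPTION: 38 melanomas in the sample (inferred, not stated in this note).
ASSUMED_MELANOMAS = 38

alone = wilson_interval(32, ASSUMED_MELANOMAS)     # 32/38 = 84.2 %
with_cnn = wilson_interval(38, ASSUMED_MELANOMAS)  # 38/38 = 100.0 %

for label, (lo, hi) in (("alone", alone), ("with CNN", with_cnn)):
    print(f"{label}: {lo * 100:.1f}-{hi * 100:.1f} %")
```

Under these assumed counts the bounds round to 69.6–92.6 % and 90.8–100.0 %, in line with the reported intervals. The lower bound of 90.8 % for the CNN-supported arm shows that a sample with no missed melanomas is still compatible with a true sensitivity somewhat below 100 %.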
Surrogate-to-outcome linkage
Prospective, real-world evidence that AI-assisted diagnostic accuracy translates into a measurable reduction in unnecessary procedures (19.2 % fewer benign excisions), with no melanomas missed in this sample (sensitivity 100.0 %, 95 % CI 90.8–100.0). Closes the loop from accuracy surrogate to the patient-relevant iatrogenic-harm outcome.
CRIT1–7 appraisal
| Criterion | Score | Justification |
|---|---|---|
| CRIT1 Relevance | 3 | Prospective clinical study of CE-marked CNN in the intended clinician-supervised workflow. |
| CRIT2 Methodology | 3 | Prospective, two-centre; within-subject before-after design; histopathological reference for 54.8 % of lesions. |
| CRIT3 Reporting | 3 | Sensitivity, specificity, AUC with 95 % CIs and p-values reported. |
| CRIT4 Applicability | 3 | Direct match — dermatologist + CNN in real clinical workflow. |
| CRIT5 Evidence weight | 2 | Prospective clinical study (not RCT, not meta-analysis). |
| CRIT6 Risk of bias | 2 | Within-subject design; two-centre; histopathology available for 54.8 % only. |
| CRIT7 Contribution | 3 | Core anchor — links accuracy uplift to reduced benign excisions, a patient-relevant outcome. |
Aggregate: very strong.
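As a quick arithmetic check on the table above, the criterion scores can be totalled. This is only a sketch: the maximum of 3 per criterion is an assumption based on the highest score shown, and the thresholds that map a total to a label such as "very strong" are not defined in this note, so none are applied here.

```python
# Scores copied from the CRIT1-7 table in this note.
crit_scores = {
    "CRIT1 Relevance": 3,
    "CRIT2 Methodology": 3,
    "CRIT3 Reporting": 3,
    "CRIT4 Applicability": 3,
    "CRIT5 Evidence weight": 2,
    "CRIT6 Risk of bias": 2,
    "CRIT7 Contribution": 3,
}

# ASSUMPTION: each criterion is scored on a scale whose maximum is 3.
ASSUMED_MAX_PER_CRITERION = 3

total = sum(crit_scores.values())
maximum = ASSUMED_MAX_PER_CRITERION * len(crit_scores)

print(f"Total: {total}/{maximum} ({total / maximum:.0%})")
```

With the scores as listed, the total is 19 out of an assumed maximum of 21.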
Limitations and notes
Two-centre design; histopathological reference only partial (54.8 % of lesions); author list includes an industry-affiliated device developer.
Strength as anchor
Very strong — one of the few prospective real-world studies quantifying the patient-relevant outcome (avoided benign excisions) downstream of AI-supported accuracy. Complements Tschandl 2020 (simulated reader) with real-deployment evidence.