Dick 2019 — Accuracy of computer-aided diagnosis of melanoma: a meta-analysis
Citation
Dick V, Sinz C, Mittlböck M, Kittler H, Tschandl P. Accuracy of computer-aided diagnosis of melanoma: a meta-analysis. JAMA Dermatol. 2019 Nov 1;155(11):1291–1299. DOI: 10.1001/jamadermatol.2019.1375. PMID 31215969.
Study design and population
Systematic review of 132 studies of computer-aided melanoma diagnosis published between January 2002 and December 2018, with a bivariate random-effects meta-analysis of the 70 studies that reported enough data to be quantitatively pooled.
Reported metrics
- Pooled CAD melanoma sensitivity 0.74 (95 % CI 0.66–0.80)
- Pooled CAD melanoma specificity 0.84 (95 % CI 0.79–0.88)
- Independent test sets: sensitivity 0.51 (95 % CI 0.34–0.69) vs. non-independent 0.82 (95 % CI 0.77–0.86), p < 0.001
- CAD vs. dermatologists: similar sensitivity; CAD specificity about 10 percentage points lower (difference not statistically significant)
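To make the independent-test-set gap concrete, the pooled point estimates above can be turned into an absolute sensitivity difference and an expected number of missed melanomas per 1,000 true melanomas. This is a minimal arithmetic sketch: the helper name `missed_per_1000` and the per-1,000 framing are illustrative and do not come from the paper, and the confidence intervals are ignored.

```python
# Pooled sensitivity point estimates as reported in the metrics list above.
SENS_INDEPENDENT = 0.51      # studies using independent test sets
SENS_NON_INDEPENDENT = 0.82  # studies without independent test sets


def missed_per_1000(sensitivity: float) -> int:
    """Expected false negatives per 1,000 true melanomas (illustrative helper)."""
    return round((1.0 - sensitivity) * 1000)


# Absolute gap in percentage points, rounded to absorb floating-point noise.
gap_pp = round((SENS_NON_INDEPENDENT - SENS_INDEPENDENT) * 100)

print(f"Absolute sensitivity gap: {gap_pp} percentage points")
print(f"Missed per 1,000 melanomas (independent): {missed_per_1000(SENS_INDEPENDENT)}")
print(f"Missed per 1,000 melanomas (non-independent): {missed_per_1000(SENS_NON_INDEPENDENT)}")
```

On point estimates alone, the gap is 31 percentage points, or 490 versus 180 missed melanomas per 1,000.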
Surrogate-to-outcome linkage
Highest-weight aggregate evidence that AI-derived diagnostic accuracy (sensitivity/specificity) is the accepted endpoint class for melanoma recognition in dermatology AI. The drop in pooled sensitivity from 0.82 to 0.51 on independent test sets quantifies the external-validity gap that directly motivates post-market clinical follow-up (PMCF) performance monitoring under the EU MDR.
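One way such monitoring could be operationalised is to compare an observed post-market sensitivity, with its confidence interval, against the pooled benchmark of 0.74. The sketch below uses a Wilson score interval. The counts are hypothetical, and the decision rule (flag when the upper bound falls below the benchmark) is an assumption made for illustration, not something prescribed by the paper or by the EU MDR.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95 %)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Pooled CAD sensitivity from the meta-analysis, used here as a benchmark.
POOLED_SENSITIVITY = 0.74

# HYPOTHETICAL PMCF counts, invented purely for illustration.
true_positives = 41        # melanomas correctly flagged by the device
confirmed_melanomas = 60   # histologically confirmed melanomas

low, high = wilson_interval(true_positives, confirmed_melanomas)

# ASSUMED rule: flag only if the whole interval lies below the benchmark.
below_benchmark = high < POOLED_SENSITIVITY

print(f"Observed sensitivity {true_positives / confirmed_melanomas:.2f}, "
      f"95% CI {low:.2f} to {high:.2f}; below benchmark: {below_benchmark}")
```

A Wilson interval is used rather than the simple normal approximation because PMCF melanoma counts are typically small, where the normal approximation behaves poorly near 0 or 1.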
CRIT1–7 appraisal
| Criterion | Score | Justification |
|---|---|---|
| CRIT1 Relevance | 3 | Direct — CAD melanoma diagnostic accuracy. |
| CRIT2 Methodology | 3 | Systematic review with bivariate random-effects meta-analysis; independent-vs-non-independent test-set stratification. |
| CRIT3 Reporting | 3 | Pooled estimates with 95 % CIs; effect of independent test sets on sensitivity quantified. |
| CRIT4 Applicability | 2 | Aggregates predominantly curated-dataset studies; generalisability to primary-care populations limited. |
| CRIT5 Evidence weight | 3 | Meta-analysis — highest tier. |
| CRIT6 Risk of bias | 2 | High heterogeneity across primary studies; publication bias likely; limited phototype diversity in included studies. |
| CRIT7 Contribution | 3 | Core aggregate anchor for the accepted-surrogate claim; quantifies the independent-test-set gap as a declared PMCF target. |
Aggregate: very strong.
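For traceability, the criterion scores in the table can be totalled as follows. This sketch assumes a maximum of 3 per criterion (the highest score that appears in the table); the rule that maps a total to the "very strong" label is not stated in this appraisal and is therefore not reproduced.

```python
# CRIT1-7 scores transcribed from the appraisal table above.
scores = {
    "CRIT1 Relevance": 3,
    "CRIT2 Methodology": 3,
    "CRIT3 Reporting": 3,
    "CRIT4 Applicability": 2,
    "CRIT5 Evidence weight": 3,
    "CRIT6 Risk of bias": 2,
    "CRIT7 Contribution": 3,
}

MAX_PER_CRITERION = 3  # ASSUMPTION: scale maximum inferred from the table

total = sum(scores.values())
maximum = MAX_PER_CRITERION * len(scores)

print(f"Total score: {total}/{maximum}")
```

Under that assumption the total is 19 out of 21, with the two lost points coming from applicability and risk of bias.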
Limitations and notes
Primary studies are heterogeneous, and many come from the computer-science literature rather than clinical dermatology; publication bias is likely; phototype diversity in the included studies is limited.
Strength as anchor
Very strong: the highest-tier aggregate evidence in the domain. Used as the regulator-facing weight for the accepted-surrogate claim and as the motivating citation for the phototype-bias and PMCF performance-monitoring narrative.