Haenssle 2018 — Man against machine: CNN vs 58 dermatologists for melanoma recognition

Citation

Haenssle HA, Fink C, Schneiderbauer R, Toberer F, Buhl T, Blum A, et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol. 2018 Aug;29(8):1836–1842. DOI: 10.1093/annonc/mdy166. PMID 29846502.

Study design and population

Pre-registered (DRKS00013570) cross-sectional comparative reader study. Google Inception-v4 CNN vs. 58 international dermatologists (17 countries; 30 experts) on a 100-image dermoscopic test set at two information levels (Level I — dermoscopy only; Level II — dermoscopy + clinical context). Enriched melanoma prevalence (~20 %).

Reported metrics

  • Level I dermatologists — sensitivity 86.6 % ± 9.3 SD; specificity 71.3 % ± 11.2 SD
  • Level II dermatologists — sensitivity 88.9 %; specificity 75.7 %
  • CNN ROC-AUC 0.86 vs. dermatologist mean ROC-AUC 0.79 (p < 0.01)
  • At matched dermatologist sensitivity, CNN specificity 82.5 % vs. dermatologist 71.3 % (p < 0.01)
  • 95 % CIs not reported (SDs only)
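
The sensitivity and specificity figures above follow the standard confusion-matrix definitions. A minimal sketch (the counts are illustrative, chosen to approximate the Level I mean on a 100-image enriched set, not the study's raw data):

```python
# Confusion-matrix definitions behind the reported operating points.
# Counts are illustrative (100-image enriched set, ~20 melanomas),
# not the study's raw data.

def sensitivity(tp: int, fn: int) -> float:
    """True-positive rate: fraction of melanomas correctly flagged."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """True-negative rate: fraction of benign lesions correctly cleared."""
    return tn / (tn + fp)

# A hypothetical reader near the Level I mean (86.6 % / 71.3 %):
sens = sensitivity(tp=17, fn=3)   # 17 of 20 melanomas detected -> 0.85
spec = specificity(tn=57, fp=23)  # 57 of 80 benign cleared -> 0.7125
```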

Surrogate-to-outcome linkage

Higher specificity at equal sensitivity translates directly to fewer unnecessary benign excisions while preserving melanoma detection — i.e., improved "appropriate biopsy rate". Higher sensitivity at equal specificity reduces missed melanomas, the proximal mechanism feeding the stage-at-detection → melanoma-specific survival gradient.
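
To make the matched-sensitivity comparison concrete: at the study's enriched ~20 % prevalence, raising specificity from 71.3 % to 82.5 % avoids roughly nine benign excisions per 100 lesions. A back-of-envelope sketch (the 100-lesion cohort and 20 % prevalence are assumptions mirroring the enriched test set; operating points are from the paper):

```python
# Benign excisions avoided when specificity improves at matched
# sensitivity. Operating points are from the paper; the 100-lesion
# cohort at 20 % prevalence is an assumption mirroring the test set.

def benign_excisions(n_lesions: int, prevalence: float, spec: float) -> float:
    """False positives, i.e. benign lesions referred for excision."""
    n_benign = n_lesions * (1 - prevalence)
    return n_benign * (1 - spec)

derm = benign_excisions(100, 0.20, 0.713)  # dermatologists, Level I
cnn = benign_excisions(100, 0.20, 0.825)   # CNN at matched sensitivity
avoided = derm - cnn                       # ~9 per 100 lesions
```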

CRIT1–7 appraisal

| Criterion | Score | Justification |
|---|---|---|
| CRIT1 Relevance | 3 | Direct dermoscopic melanoma classification; surrogate domain: diagnostic accuracy. |
| CRIT2 Methodology | 2 | Pre-registered; 58-dermatologist comparator cohort; two information-level design; histopathology reference standard. |
| CRIT3 Reporting | 2 | Operating points and AUCs reported; no parametric CIs. |
| CRIT4 Applicability | 2 | Matches intended use (clinician + device). Phototype distribution not reported. |
| CRIT5 Evidence weight | 2 | Large prospective pre-registered multi-reader study (not an RCT, not a meta-analysis). |
| CRIT6 Risk of bias | 2 | Enriched melanoma prevalence; 100-image artificial test set; possible post-hoc operating-point selection. |
| CRIT7 Contribution | 3 | Core anchor: CNN outperforms expert dermatologists on specificity; links directly to unnecessary-biopsy reduction. |

Aggregate: strong.
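
The aggregate can be reproduced mechanically from the table. A minimal sketch: the scores are from the appraisal above, but the weak/moderate/strong cut-offs are a hypothetical illustration, not the QMS's actual aggregation rule:

```python
# Sum the CRIT1-7 scores from the appraisal table. The cut-offs
# mapping the sum to a label are hypothetical, not the QMS rule.

scores = {
    "CRIT1 Relevance": 3,
    "CRIT2 Methodology": 2,
    "CRIT3 Reporting": 2,
    "CRIT4 Applicability": 2,
    "CRIT5 Evidence weight": 2,
    "CRIT6 Risk of bias": 2,
    "CRIT7 Contribution": 3,
}

total = sum(scores.values())  # 16 of a 21-point maximum
label = "strong" if total >= 15 else "moderate" if total >= 10 else "weak"
```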

Limitations and notes

Artificial reading environment; enriched melanoma prevalence; a published methodology critique questions the operating-point selection; Fitzpatrick phototype distribution unreported.

Strength as anchor

Strong for the directional claim (improved accuracy → appropriate-biopsy outcome). Regulator-familiar landmark reference; complements the quantitative AJCC anchor.

All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, neither by himself nor by third parties, through whichever means, without obtaining the prior written permission of Legit.Health (AI Labs Group S.L.)