Tschandl 2020 — Human–computer collaboration for skin cancer recognition

Citation

Tschandl P, Rinner C, Apalla Z, Argenziano G, Codella N, Halpern A, et al. Human–computer collaboration for skin cancer recognition. Nat Med. 2020 Aug;26(8):1229–1234. DOI: 10.1038/s41591-020-0942-0. PMID 32572267.

Study design and population

Pre-registered, international, web-based reader study of three AI decision-support formats. 302 physicians (169 board-certified dermatologists, 77 residents, 38 GPs, 18 other) classified 1,511 dermoscopic images across 7 diagnostic categories, with and without AI support. The test set was derived from HAM10000.

Reported metrics

  • Unaided multiclass accuracy 63.6 % (95 % CI 62.6–64.5)
  • AI-assisted (multiclass probability) accuracy 77.0 % (95 % CI 76.2–77.9)
  • Absolute uplift +13.3 pp (p < 0.001); largest gain among least-experienced clinicians
  • Identified a safety hazard: faulty AI output can mislead even expert clinicians
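As a quick arithmetic check on the figures above, the uplift and a conservative 95 % CI for the difference can be recomputed from the reported accuracies. This is a minimal sketch only: the paper's +13.3 pp comes from unrounded underlying values, and the two arms are paired ratings by the same readers, so the independent-arms CI below is an approximation, not the paper's method.

```python
# Sketch: recompute the uplift from the reported (rounded) accuracies.
# The independence assumption between arms is conservative for a paired
# reader study; the paper's own +13.3 pp uses unrounded values.
unaided = 63.6    # % accuracy, 95% CI 62.6-64.5
assisted = 77.0   # % accuracy, 95% CI 76.2-77.9

uplift = assisted - unaided
print(f"uplift ~ {uplift:.1f} pp")  # ~13.4 pp from rounded inputs

# Conservative CI for the difference: combine half-widths in quadrature.
hw_unaided = (64.5 - 62.6) / 2
hw_assisted = (77.9 - 76.2) / 2
hw_diff = (hw_unaided**2 + hw_assisted**2) ** 0.5
print(f"95% CI ~ ({uplift - hw_diff:.1f}, {uplift + hw_diff:.1f}) pp")
```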

Surrogate-to-outcome linkage

Provides direct evidence that AI decision support improves clinician classification accuracy — the exact operational claim of a Class IIb dermatology CDS. The magnitude of the uplift (~+13 pp, largest for non-specialists) maps onto the device's intended benefit of reducing diagnostic error at the primary-care / teledermatology triage step, which lies on the causal path to earlier appropriate treatment.

CRIT1–7 appraisal

| Criterion | Score | Justification |
| --- | --- | --- |
| CRIT1 Relevance | 3 | Direct match — clinician + AI decision-support workflow on dermoscopic classification. |
| CRIT2 Methodology | 3 | Large, international, pre-registered design with multiple AI-support formats; 302 physicians; histopathology/consensus reference standard. |
| CRIT3 Reporting | 3 | Accuracy reported with 95 % CIs; safety-hazard identification documented. |
| CRIT4 Applicability | 3 | Workflow analogous to CDS use; tested across dermatologist / resident / GP tiers. |
| CRIT5 Evidence weight | 2 | Large prospective reader study (not an RCT or meta-analysis). |
| CRIT6 Risk of bias | 2 | Simulation, not deployment; HAM10000 phototype skew; documented automation-bias risk with faulty AI output. |
| CRIT7 Contribution | 3 | Core anchor for the directional claim — AI support translates into classification improvement, with effect size quantified. |

Aggregate: very strong.
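The aggregate verdict can be made reproducible by tallying the CRIT1–7 scores from the table. A minimal sketch follows; the score-to-verdict banding shown is illustrative only, not an official QMS rule.

```python
# Tally the CRIT1-7 scores from the appraisal table (max 3 each, 21 total).
scores = {
    "CRIT1 Relevance": 3,
    "CRIT2 Methodology": 3,
    "CRIT3 Reporting": 3,
    "CRIT4 Applicability": 3,
    "CRIT5 Evidence weight": 2,
    "CRIT6 Risk of bias": 2,
    "CRIT7 Contribution": 3,
}
total = sum(scores.values())

# Illustrative banding only; the QMS may define different cut-offs.
if total >= 18:
    verdict = "very strong"
elif total >= 14:
    verdict = "strong"
else:
    verdict = "moderate or below"

print(f"{total}/21 -> {verdict}")  # 19/21 -> very strong
```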

Limitations and notes

Simulated reader workflow rather than real-world deployment; phototype imbalance in the underlying HAM10000 dataset; the hazard of faulty AI output misleading clinicians is explicitly characterised (a documented finding to carry into the risk-management narrative, not a flaw of the study).

Strength as anchor

Very strong for the directional and operational-mechanism claim: pre-registered, large, and reported with 95 % CIs, it is methodologically robust. The documented finding that faulty AI output can mislead even experts also supports the integrator-responsibility language in the CER.

All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, neither by himself nor by third parties, through whichever means, without obtaining the prior written permission of Legit.Health (AI Labs Group S.L.)