Schaap 2022 — CNN-based automated PASI scoring
Citation
Schaap MJ, Cardozo NJ, Patel A, de Jong EMGJ, van Ginneken B, Seyger MMB. Image-based automated Psoriasis Area Severity Index scoring by Convolutional Neural Networks. J Eur Acad Dermatol Venereol. 2022 Jan;36(1):68–75. DOI: 10.1111/jdv.17711. PMID 34653265.
Study design and population
Retrospective validation of convolutional neural networks (CNNs) for automated PASI sub-scoring, using 5,844 anonymised images from the Child-CAPTURE registry (Netherlands). Region-specific networks were trained on 576 trunk, 614 arm and 541 leg image series. CNN outputs were compared with real-life PASI sub-scores and with image-based scoring by 5 PASI-trained physicians.
Reported metrics
- Trunk ICCs (CNN vs. real-life physician): erythema 0.616; desquamation 0.580; induration 0.580; area 0.793
- Physician inter-rater ICCs (image-based): 0.706–0.793
- CNN matched or outperformed image-based physician scoring on area (0.793 vs. 0.694)
- Similar performance for arms and legs
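For readers who want to reproduce agreement figures of this kind, the following is a minimal sketch of an intraclass correlation coefficient. It assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1); this note does not record which ICC form the study used, so that choice is an assumption. The function name `icc_2_1` and the example scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one per image series; each row holds one score
    per rater (for example, column 0 = CNN, column 1 = physician).
    """
    n = len(ratings)          # number of rated subjects (image series)
    k = len(ratings[0])       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a full n x k table with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical erythema scores (0-4): column 0 = CNN, column 1 = physician.
example = [[2, 2], [3, 2], [1, 1], [4, 3], [0, 1], [2, 3]]
print(round(icc_2_1(example), 3))
```

Because this form measures absolute agreement, a rater who is systematically one grade higher is penalised even if the ranking of patients is identical, which is the relevant property when a device output is meant to replace a physician score.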
Surrogate-to-outcome linkage
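Shows that CNN-based automated PASI scoring agrees with real-life physician scores at a level within the trained-physician range for area (trunk ICC 0.793) and at a moderate level for erythema, desquamation and induration (trunk ICC 0.58–0.62). This is analytic-validity evidence that an AI severity-scoring device can complement, and potentially substitute for, manual PASI. Because PASI-75 and PASI-90 are defined as percentage reductions in the total PASI computed from these sub-scores, the study anchors the claim that CDS-generated PASI outputs are valid for treatment-response classification.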
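The link from sub-scores to response classification can be sketched as follows. The example uses the standard PASI formula (region weights 0.1 head, 0.2 arms, 0.3 trunk, 0.4 legs; severity sub-scores 0–4; area score 0–6) and the usual definition of PASI-75 and PASI-90 as at least 75 % or 90 % reduction from baseline. The function names, the input layout and the example values are hypothetical, and since the head region was excluded from this study, the head sub-scores are assumed to come from another source.

```python
# Standard PASI region weights.
REGION_WEIGHTS = {"head": 0.1, "arms": 0.2, "trunk": 0.3, "legs": 0.4}


def pasi(sub_scores):
    """Total PASI from per-region (erythema, induration, desquamation, area).

    Severity sub-scores range 0-4 and the area score 0-6, so the total
    ranges from 0 to 72.
    """
    total = 0.0
    for region, weight in REGION_WEIGHTS.items():
        erythema, induration, desquamation, area = sub_scores[region]
        for severity in (erythema, induration, desquamation):
            if not 0 <= severity <= 4:
                raise ValueError(f"{region}: severity sub-score out of range")
        if not 0 <= area <= 6:
            raise ValueError(f"{region}: area score out of range")
        total += weight * (erythema + induration + desquamation) * area
    return total


def response_class(baseline_pasi, follow_up_pasi):
    """Classify treatment response by relative PASI reduction from baseline."""
    if baseline_pasi <= 0:
        raise ValueError("baseline PASI must be positive")
    reduction = (baseline_pasi - follow_up_pasi) / baseline_pasi
    if reduction >= 0.90:
        return "PASI-90"
    if reduction >= 0.75:
        return "PASI-75"
    return "below PASI-75"


# Hypothetical patient: identical sub-scores in every region for simplicity.
baseline = {region: (2, 2, 2, 3) for region in REGION_WEIGHTS}
follow_up = {region: (1, 1, 1, 1) for region in REGION_WEIGHTS}
print(response_class(pasi(baseline), pasi(follow_up)))
```

The sketch makes one point visible: an error in any single sub-score propagates into the total through the region weight and the area multiplier, so moderate sub-score agreement does not translate one-to-one into agreement on response class.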
CRIT1–7 appraisal
| Criterion | Score | Justification |
|---|---|---|
| CRIT1 Relevance | 3 | Direct — AI PASI, same technical modality as the intended device output. |
| CRIT2 Methodology | 2 | Retrospective, multi-region CNN design with physician comparators. |
| CRIT3 Reporting | 2 | Point-estimate ICCs per region per sub-score; no 95 % CIs. |
| CRIT4 Applicability | 3 | Image-based, matches CDS-device modality. |
| CRIT5 Evidence weight | 1 | Retrospective validation. |
| CRIT6 Risk of bias | 2 | Single-centre registry; head region excluded; single treating-physician reference for sub-scores; skin-type homogeneity. |
| CRIT7 Contribution | 3 | Central anchor: most rigorous published demonstration that AI PASI concordance reaches the trained-physician range, most clearly for area scoring. |
Aggregate: strong.
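As a cross-check on the table, the scores can be tallied as below. This is only an unweighted sum and mean; the rule that maps the seven scores to the "strong" aggregate is not stated in this note, so no threshold is applied, and the variable names are hypothetical.

```python
# CRIT1-7 scores copied from the appraisal table above.
crit_scores = {
    "CRIT1 Relevance": 3,
    "CRIT2 Methodology": 2,
    "CRIT3 Reporting": 2,
    "CRIT4 Applicability": 3,
    "CRIT5 Evidence weight": 1,
    "CRIT6 Risk of bias": 2,
    "CRIT7 Contribution": 3,
}

total = sum(crit_scores.values())       # unweighted sum across criteria
mean_score = total / len(crit_scores)   # average score per criterion

print(f"total = {total}, mean = {mean_score:.2f}")
```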
Limitations and notes
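Head region excluded; single-centre registry; single treating-physician reference for sub-scores; possible skin-type homogeneity; no 95 % CIs reported. Trunk sub-score ICCs are moderate (0.58–0.62 for erythema, desquamation and induration), whereas trunk area scoring shows good agreement (ICC 0.793).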
Strength as anchor
Strong — the primary modern reference demonstrating AI-PASI analytic validity. Paired with Meienberger 2020 (U-Net area segmentation) and Huang 2023 (AI PASI outperforms 43 dermatologists at sub-score level) to span the automated-scoring evidence base.