Schaap 2022 — CNN-based automated PASI scoring

Citation

Schaap MJ, Cardozo NJ, Patel A, de Jong EMGJ, van Ginneken B, Seyger MMB. Image-based automated Psoriasis Area Severity Index scoring by Convolutional Neural Networks. J Eur Acad Dermatol Venereol. 2022 Jan;36(1):68–75. DOI: 10.1111/jdv.17711. PMID 34653265.

Study design and population

Retrospective validation of convolutional neural networks (CNNs) for automated PASI sub-scoring, using 5,844 anonymised images from the Child-CAPTURE registry (Netherlands). Region-specific networks were trained on 576 trunk, 614 arm and 541 leg image series. CNN outputs were compared with real-life PASI sub-scores and with image-based scoring by 5 PASI-trained physicians.

Reported metrics

Trunk ICCs (CNN vs. real-life physician): erythema 0.616; desquamation 0.580; induration 0.580; area 0.793
Physician inter-rater ICCs (image-based): 0.706–0.793
CNN matched or outperformed image-based physician scoring on area (0.793 vs. 0.694)
Similar performance for arms and legs
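
As an illustration of how an agreement figure of this kind is obtained, the sketch below computes a single-rater, absolute-agreement ICC (the Shrout and Fleiss ICC(2,1) form) from a subjects-by-raters table. This summary does not state which ICC form Schaap et al. used, so the choice of ICC(2,1), the function name `icc_2_1` and the toy scores are assumptions made for illustration only.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per subject (e.g. an image series) and one
    column per rater (e.g. CNN and treating physician).
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical area scores (0-6) for five image series: [CNN, physician].
toy_scores = [[1, 1], [2, 3], [4, 4], [5, 5], [3, 2]]
print(round(icc_2_1(toy_scores), 3))
```

Because this form penalises systematic offsets between raters as well as random disagreement, it is a stricter measure than a plain correlation coefficient.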

Surrogate-to-outcome linkage

Demonstrates that CNN-based automated PASI scoring achieves agreement with reference scores in the trained-physician range, most clearly for area scoring. This is the analytic-validity evidence that an AI severity-scoring device can substitute for or complement manual PASI, and it anchors the claim that CDS-generated PASI outputs are valid for PASI-75 / PASI-90 treatment-response classification.
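
To make the linkage concrete, the sketch below shows how regional sub-scores combine into a total PASI and how a PASI-75 / PASI-90 response is derived from the percentage reduction against baseline. It uses the standard PASI definition (severity items scored 0-4, area scored 0-6, region weights 0.1 / 0.2 / 0.3 / 0.4 for head, arms, trunk and legs). The function names and the example sub-scores are hypothetical. Because the head region was excluded from the study, the head sub-scores here would have to come from another source.

```python
# Standard PASI region weights.
REGION_WEIGHTS = {"head": 0.1, "arms": 0.2, "trunk": 0.3, "legs": 0.4}


def pasi_total(regions):
    """Total PASI from per-region (erythema, induration, desquamation, area).

    Severity items are scored 0-4 and area 0-6, giving a total of 0-72.
    """
    total = 0.0
    for name, (erythema, induration, desquamation, area) in regions.items():
        severity = erythema + induration + desquamation
        total += REGION_WEIGHTS[name] * severity * area
    return total


def pasi_response(baseline, follow_up):
    """Percentage PASI reduction plus PASI-75 / PASI-90 flags."""
    if baseline <= 0:
        raise ValueError("baseline PASI must be positive")
    reduction = 100.0 * (baseline - follow_up) / baseline
    return {
        "reduction_pct": reduction,
        "pasi75": reduction >= 75.0,
        "pasi90": reduction >= 90.0,
    }


# Hypothetical sub-scores for one patient at two visits.
baseline_visit = {
    "head": (2, 2, 2, 2),
    "arms": (3, 2, 2, 3),
    "trunk": (3, 3, 2, 4),
    "legs": (2, 2, 2, 3),
}
follow_up_visit = {
    "head": (1, 0, 0, 1),
    "arms": (1, 1, 0, 1),
    "trunk": (1, 1, 1, 2),
    "legs": (1, 0, 1, 1),
}

result = pasi_response(pasi_total(baseline_visit), pasi_total(follow_up_visit))
print(result)
```

In this toy case the total falls from roughly 22.2 to roughly 3.1, a reduction of about 86%, so the patient meets PASI-75 but not PASI-90. Each sub-score is multiplied by the area score and the region weight, so an error in any automated sub-score carries through to the response class.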

CRIT1–7 appraisal

Criterion	Score	Justification
CRIT1 Relevance	3	Direct — AI PASI, same technical modality as the intended device output.
CRIT2 Methodology	2	Retrospective, multi-region CNN design with physician comparators.
CRIT3 Reporting	2	Point-estimate ICCs per region per sub-score; no 95 % CIs.
CRIT4 Applicability	3	Image-based, matches CDS-device modality.
CRIT5 Evidence weight	1	Retrospective validation.
CRIT6 Risk of bias	2	Single-centre registry; head region excluded; single treating-physician reference for sub-scores; possible skin-type homogeneity.
CRIT7 Contribution	3	Central anchor — most rigorous published demonstration that AI PASI concordance matches expert panels.

Aggregate: strong.
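
For transparency, the arithmetic behind the aggregate can be sketched as a simple sum of the seven criterion scores. The maximum of 3 per criterion is an assumption inferred from the highest score in the table. The rule that maps a total to the label "strong" is not stated in this note, so it is deliberately not encoded.

```python
# Scores copied from the CRIT1-7 appraisal table.
crit_scores = {
    "CRIT1 Relevance": 3,
    "CRIT2 Methodology": 2,
    "CRIT3 Reporting": 2,
    "CRIT4 Applicability": 3,
    "CRIT5 Evidence weight": 1,
    "CRIT6 Risk of bias": 2,
    "CRIT7 Contribution": 3,
}

MAX_PER_CRITERION = 3  # assumption: highest score observed in the table

total = sum(crit_scores.values())
max_total = MAX_PER_CRITERION * len(crit_scores)
print(f"{total}/{max_total} ({total / max_total:.0%})")  # prints "16/21 (76%)"
```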

Limitations and notes

Head region excluded; single-centre; possible skin-type homogeneity. Sub-score ICCs are moderate (0.58–0.62 for erythema, desquamation and induration), though area scoring is excellent (0.793).

Strength as anchor

Strong — the primary modern reference demonstrating AI-PASI analytic validity. Paired with Meienberger 2020 (U-Net area segmentation) and Huang 2023 (AI PASI outperforms 43 dermatologists at sub-score level) to span the automated-scoring evidence base.
