Fink 2018 — Inter- and intra-observer variability of image-based PASI
Citation
Fink C, Alt C, Uhlmann L, Klose C, Enk A, Haenssle HA. Intra- and interobserver variability of image-based PASI assessments in 120 patients suffering from plaque-type psoriasis. J Eur Acad Dermatol Venereol. 2018 Aug;32(8):1314–1319. DOI: 10.1111/jdv.14960. PMID 29569769.
Study design and population
Prospective observational reliability study. 120 adults with plaque psoriasis; total-body images scored by 3 trained physicians in two rounds, giving 720 image-based PASI scores (120 patients × 3 raters × 2 rounds). University Hospital Heidelberg.
Reported metrics
- Inter-rater ICC 0.895
- Intra-rater mean ICC 0.877
- Mean absolute difference 3.3 PASI points between raters
- Sub-score variability greatest for induration and scaling; lowest for BSA
- 95 % CIs not reported
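For orientation, an ICC of this kind can be computed from a patients × raters score matrix. The sketch below assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1). The ICC model is not stated in this summary, so that choice, the function name `icc_2_1`, and the toy ratings (the classic Shrout and Fleiss 1979 example, not study data) are all assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject, one column per rater.
    ASSUMPTION: this ICC form is illustrative; the study's model is not given here.
    """
    n = len(ratings)        # number of subjects (patients)
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Toy ratings (6 subjects x 4 raters), NOT data from Fink 2018.
toy_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

print(round(icc_2_1(toy_ratings), 2))  # 0.29
```

The low value reflects large systematic offsets between raters in the toy data and says nothing about PASI. Applied to a 120 × 3 matrix of scores from one round, the same function would give an inter-rater estimate under the stated assumption.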
Surrogate-to-outcome linkage
Even trained physicians differ by about 3 PASI points between raters (mean absolute difference 3.3), enough to flip treatment-escalation decisions at borderline thresholds (PASI ≥ 10 for biologic eligibility; PASI-75 vs. PASI-90 response). This manual measurement error is the clinical headroom that automated / AI scoring addresses. The finding supports the operational claim that AI severity scoring reduces decision-threshold uncertainty.
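The threshold argument can be made concrete with a short arithmetic sketch. It takes the reported 3.3-point mean absolute difference and the thresholds named above. The function names and the example scores are hypothetical, chosen only so that two raters differ by exactly 3.3 points.

```python
BIOLOGIC_THRESHOLD = 10.0  # PASI >= 10 eligibility threshold cited in the text


def raters_disagree(score_a, score_b, threshold=BIOLOGIC_THRESHOLD):
    """True if two raters' PASI scores fall on opposite sides of the threshold."""
    return (score_a >= threshold) != (score_b >= threshold)


def pasi_response(baseline, follow_up):
    """Percent improvement in PASI relative to baseline."""
    return 100.0 * (baseline - follow_up) / baseline


def response_band(percent):
    """Map percent improvement to the response categories named in the text."""
    if percent >= 90:
        return "PASI-90"
    if percent >= 75:
        return "PASI-75"
    return "below PASI-75"


# Hypothetical scores: the same patient scored 3.3 points apart by two raters.
print(raters_disagree(8.5, 11.8))                 # True: eligible for one rater only

# Hypothetical follow-up: baseline 20, follow-up scored 1.5 vs. 4.8 (3.3 apart).
print(response_band(pasi_response(20.0, 1.5)))    # PASI-90
print(response_band(pasi_response(20.0, 4.8)))    # PASI-75
```

A fixed 3.3-point gap is a simplification: it is a cohort-level mean, and the actual disagreement for any one patient may be smaller or larger.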
CRIT1–7 appraisal
| Criterion | Score | Justification |
|---|---|---|
| CRIT1 Relevance | 3 | Direct: establishes the manual-PASI reliability ceiling, the baseline against which AI-scoring claims are judged. |
| CRIT2 Methodology | 2 | Prospective, single-centre, trained raters only. |
| CRIT3 Reporting | 2 | Point-estimate ICCs and MAD reported; no 95 % CIs. |
| CRIT4 Applicability | 3 | Image-based, matches the CDS modality. |
| CRIT5 Evidence weight | 1 | Methodology / validation study, prospective. |
| CRIT6 Risk of bias | 2 | Only 3 raters; single centre; all-Caucasian cohort; image-based rather than live PASI. |
| CRIT7 Contribution | 3 | Core anchor for the reliability-gap claim that motivates AI-assisted scoring. |
Aggregate: strong.
Limitations and notes
Small rater sample (n = 3); single centre; head region excluded; all-Caucasian cohort.
Strength as anchor
Strong: one of the most-cited modern PASI reliability studies and the direct motivation for automated scoring. Complements Gourraud 2012, a simulation-based demonstration that inter-rater variation in manual PASI can cross therapeutic thresholds.