Fink 2018 — Inter- and intra-observer variability of image-based PASI

Citation

Fink C, Alt C, Uhlmann L, Klose C, Enk A, Haenssle HA. Intra- and interobserver variability of image-based PASI assessments in 120 patients suffering from plaque-type psoriasis. J Eur Acad Dermatol Venereol. 2018 Aug;32(8):1314–1319. DOI: 10.1111/jdv.14960. PMID 29569769.

Study design and population

Prospective observational reliability study at University Hospital Heidelberg. 120 adults with plaque psoriasis; total-body images scored by 3 trained physicians in two rounds, giving 720 image-based PASI scores (120 patients × 3 raters × 2 rounds).

Reported metrics

Inter-rater ICC 0.895
Intra-rater mean ICC 0.877
Mean absolute difference 3.3 PASI points between raters
Sub-score variability greatest for induration and scaling; lowest for BSA
95 % CIs not reported
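
The two headline statistics above can be reproduced from a patients × raters score matrix. The sketch below is illustrative only: this note does not record which ICC form the authors used, so the two-way random-effects, absolute-agreement, single-rater form ICC(2,1) is an assumption, and the function names and toy scores are hypothetical rather than study data.

```python
from itertools import combinations

import numpy as np


def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_subjects, k_raters) array with no missing values.
    Assumed ICC form; the study's exact model is not stated in this note.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def mean_abs_rater_difference(ratings):
    """Mean absolute score difference, averaged over all rater pairs."""
    x = np.asarray(ratings, dtype=float)
    pairs = combinations(range(x.shape[1]), 2)
    return float(np.mean([np.mean(np.abs(x[:, a] - x[:, b])) for a, b in pairs]))


# Hypothetical PASI scores: 5 patients (rows) scored by 3 raters (columns).
toy_scores = [
    [12.4, 9.8, 14.1],
    [4.2, 5.0, 3.6],
    [21.0, 18.3, 24.5],
    [8.7, 11.2, 9.9],
    [15.5, 13.0, 17.8],
]
print(f"ICC(2,1): {icc2_1(toy_scores):.3f}")
print(f"Mean absolute difference: {mean_abs_rater_difference(toy_scores):.2f}")
```

A high ICC and a clinically relevant mean absolute difference can coexist, because ICC is scaled by between-patient variance while the absolute difference is expressed in PASI points.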

Surrogate-to-outcome linkage

Even trained physicians differ by about 3 PASI points between raters (mean absolute difference 3.3), which is enough to flip treatment-escalation decisions at borderline thresholds (PASI ≥ 10 for biologic eligibility; PASI-75 vs. PASI-90 response). This manual measurement error is the clinical headroom that automated / AI scoring aims to address. The study quantifies that headroom; it does not itself test AI scoring, so it supports the operational claim that AI severity scoring can reduce decision-threshold uncertainty only as a baseline for comparison.
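
The threshold argument can be made concrete with a small worked sketch. It treats the reported 3.3-point mean absolute difference as a uniform margin across the whole PASI range, which is a simplifying assumption; the function names and the example scores are hypothetical and are not taken from the study.

```python
def threshold_uncertain(score, threshold=10.0, rater_diff=3.3):
    """True if a second rater differing by `rater_diff` could land on the
    other side of `threshold` (assumed uniform margin, for illustration)."""
    return abs(score - threshold) <= rater_diff


def percent_improvement(baseline, follow_up):
    """PASI improvement from baseline, in percent."""
    return 100.0 * (baseline - follow_up) / baseline


def response_category(baseline, follow_up):
    """Classify response as PASI-90, PASI-75, or below PASI-75."""
    improvement = percent_improvement(baseline, follow_up)
    if improvement >= 90.0:
        return "PASI-90"
    if improvement >= 75.0:
        return "PASI-75"
    return "below PASI-75"


# Hypothetical patient scored 11.5: within 3.3 points of the PASI >= 10 cut-off.
print(threshold_uncertain(11.5))

# Hypothetical baseline PASI of 20: follow-up scores of 2 and 5 differ by
# 3 points, less than the reported between-rater difference.
print(response_category(20.0, 2.0))
print(response_category(20.0, 5.0))
```

In this hypothetical case a 3-point disagreement at follow-up is the entire gap between a PASI-90 and a PASI-75 classification.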

CRIT1–7 appraisal

Criterion	Score	Justification
CRIT1 Relevance	3	Direct — manual-PASI reliability ceiling, the denominator for AI-scoring claims.
CRIT2 Methodology	2	Prospective, single-centre, trained raters only.
CRIT3 Reporting	2	Point-estimate ICCs and mean absolute difference reported; no 95 % CIs.
CRIT4 Applicability	3	Image-based, matches the CDS modality.
CRIT5 Evidence weight	1	Methodology / validation study, prospective.
CRIT6 Risk of bias	2	Only 3 raters; single centre; all-Caucasian cohort; image-based rather than live PASI.
CRIT7 Contribution	3	Core anchor for the reliability-gap claim that motivates AI-assisted scoring.

Aggregate: strong.

Limitations and notes

Small rater sample (n = 3); single centre; head region excluded; all-Caucasian cohort.

Strength as anchor

Strong — one of the most-cited modern PASI reliability studies and the direct motivation for automated scoring. Complements Gourraud 2012 (simulation-based demonstration that manual PASI crosses therapeutic thresholds between raters).
