Intended use covered: Provide diagnostic support for referral by returning the five most probable dermatological pathologies for an uploaded lesion image set.
Method: Functional testing with predefined datasets.
Input information (see the test-set sketch after this list):
Non-dermatological image (book photo).
Blurry lesion photo (to trigger low-quality rejection).
Oversized lesion photo (>10 MB).
Three photos of melanoma; three of benign nevus; three of psoriasis; three of healthy skin.
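The predefined datasets above can be encoded for automated execution. The following is a minimal sketch under that assumption; file names and expected-outcome labels are illustrative placeholders, not release artifacts.

```python
# Minimal encoding of the predefined input set for automated runs.
# File names and "expected" labels are illustrative placeholders.
TEST_CASES = [
    {"images": ["book.jpg"], "expected": "reject: non-dermatological"},
    {"images": ["blurry_lesion.jpg"], "expected": "reject: below quality threshold"},
    {"images": ["oversized_lesion.jpg"], "expected": "reject: file > 10 MB"},
    {"images": ["melanoma_1.jpg", "melanoma_2.jpg", "melanoma_3.jpg"], "expected": "melanoma"},
    {"images": ["nevus_1.jpg", "nevus_2.jpg", "nevus_3.jpg"], "expected": "benign nevus"},
    {"images": ["psoriasis_1.jpg", "psoriasis_2.jpg", "psoriasis_3.jpg"], "expected": "psoriasis"},
    {"images": ["healthy_1.jpg", "healthy_2.jpg", "healthy_3.jpg"], "expected": "healthy skin"},
]
```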
Objective acceptance criteria (a check sketch follows this list):
Images below the DIQA threshold are rejected and not analyzed; frontend shows a quality error next to each rejected image.
For analyzed requests, the response contains an aggregated probability distribution over conditions that sums to 100% across all classes; the frontend displays the top 5 only.
Displayed sensitivity and specificity match the validated metrics of the model version used; the displayed entropy value lies within [0, 1].
Per-image quality score is displayed for each uploaded image.
Backend obtains a valid JWT via /login before calling /diagnosis-support; protected calls without a valid token are rejected with 401.
95th-percentile latency for the end-to-end request under nominal load is < 10 000 ms.
Top-5 diagnostic coverage ≥ 95% against the ground-truth dataset.
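The response-level criteria above can be checked programmatically. The sketch below assumes a parsed JSON response with hypothetical field names (probability_distribution, top5, entropy, model_version, sensitivity, specificity, images); the actual schema comes from the /diagnosis-support interface specification, and the reference metric values are placeholders to be taken from the model validation record. Latency and coverage checks are sketched after the test steps.

```python
import math

# Placeholder reference values; the real figures come from the model
# validation record for the frozen release.
EXPECTED_MODEL_VERSION = "1.1.0.0"
VALIDATED_METRICS = {"1.1.0.0": {"sensitivity": 0.92, "specificity": 0.88}}

def check_report(report: dict, http_status: int) -> None:
    """Assert the response-level acceptance criteria for an analyzed request."""
    assert http_status == 200, "analyzed requests must succeed"

    # Aggregated probability distribution over all classes sums to 100%.
    distribution = report["probability_distribution"]  # {condition: percent}
    assert math.isclose(sum(distribution.values()), 100.0, abs_tol=0.1)

    # Only the five most probable pathologies are displayed.
    assert len(report["top5"]) == 5

    # Displayed entropy lies within [0, 1].
    assert 0.0 <= report["entropy"] <= 1.0

    # Sensitivity/specificity match the validated metrics of the model version used.
    assert report["model_version"] == EXPECTED_MODEL_VERSION
    metrics = VALIDATED_METRICS[report["model_version"]]
    assert report["sensitivity"] == metrics["sensitivity"]
    assert report["specificity"] == metrics["specificity"]

    # Every uploaded image carries a per-image quality score.
    assert all("quality_score" in image for image in report["images"])

def check_unauthenticated_call(http_status: int) -> None:
    # Protected calls without a valid JWT must be rejected with 401.
    assert http_status == 401
```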
Preconditions: Software release v1.1.0.0 frozen; accompanying Instructions for Use and Technical Description available; baseline risk file approved for this release.
Configuration: API endpoints /login and /diagnosis-support; HTTPS enforced; JSON content type; frontend does not require user login.
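A minimal sketch of the call flow under this configuration follows, using the Python requests library: the backend obtains a JWT via /login and then invokes /diagnosis-support over HTTPS with JSON payloads. The host name, credential fields, and request/response field names ("token", "images") are assumptions, not the documented interface contract.

```python
import requests

BASE_URL = "https://api.example.test"  # placeholder host; use the deployment's real endpoint

def request_diagnosis_support(image_payload: list[dict]) -> requests.Response:
    # Obtain a JWT with backend service credentials (placeholders below).
    login = requests.post(
        f"{BASE_URL}/login",
        json={"username": "validation-user", "password": "***"},
        timeout=10,
    )
    login.raise_for_status()
    token = login.json()["token"]

    # Call the protected endpoint with the bearer token and a JSON body.
    return requests.post(
        f"{BASE_URL}/diagnosis-support",
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        json={"images": image_payload},
        timeout=10,
    )
```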
Test procedure:
Upload up to three images per case (include the invalid samples: non-dermatological, blurry, oversized).
Observe that the backend obtains a JWT via /login and invokes /diagnosis-support.
Confirm low-quality or invalid images are rejected with on-screen quality errors; valid images proceed to analysis.
Verify that the returned report shows:
Top-5 pathologies with probabilities; full distribution sums to 100%.
Model sensitivity and specificity, and an entropy value within [0, 1].
Per-image quality scores.
Repeat for the melanoma, nevus, psoriasis, and healthy-skin cases, then compare results to the ground truth to compute top-5 coverage.
Record performance (p95 latency) under nominal load; a computation sketch for p95 latency and top-5 coverage follows these steps.
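The latency and coverage criteria can be computed from the recorded runs as sketched below; the nearest-rank method for the 95th percentile and the result field names ("ground_truth", "top5") are assumptions.

```python
import math

def p95_latency_ms(latencies_ms: list[float]) -> float:
    # Nearest-rank 95th percentile over the recorded end-to-end latencies.
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return ordered[rank - 1]

def top5_coverage(results: list[dict]) -> float:
    # Fraction of ground-truth cases whose true condition appears in the
    # returned top-5 list for that case.
    hits = sum(1 for case in results if case["ground_truth"] in case["top5"])
    return hits / len(results)

# Acceptance: p95_latency_ms(latencies) < 10_000 and top5_coverage(results) >= 0.95
```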
Data capture: Request/response IDs and timestamps; model and API version; JWT issuance/expiry events; per-image quality scores; probability vectors; HTTP status codes; screenshots (rejections and final reports); backend/API logs; audit log entries.
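One way to structure a captured evidence record is sketched below; field names are illustrative, and the backend/API logs, audit log entries, and screenshots remain the primary evidence artifacts.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class EvidenceRecord:
    # Illustrative per-request evidence record covering the data-capture list above.
    request_id: str
    response_id: str
    timestamp: datetime
    model_version: str
    api_version: str
    jwt_issued_at: datetime
    jwt_expires_at: datetime
    http_status: int
    per_image_quality: dict[str, float]   # image identifier -> quality score
    probability_vector: dict[str, float]  # condition -> probability (%)
    screenshot_paths: list[str] = field(default_factory=list)
```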
Deviation handling: Any deviation is justified and logged in the validation report.
Anomaly handling & re-validation trigger: Repeat affected parts if criteria are not met; re-validate upon changes to the API, AI model, or configuration affecting functionality or performance.