
R-TF-028-010 AI V&V Checks

Table of contents
  • Purpose
  • Scope
  • Instructions
  • Checklist
    • R-TF-028-005 AI Development Report
    • R-TF-028-006 AI Release Report
    • Algorithm Package Verification
    • Risk Assessment Integration
    • Traceability
    • Regulatory Compliance
  • Conclusion
  • Release Authorization
  • Verification

Purpose

This checklist is used to verify that the Verification & Validation Phase of the AI development lifecycle has been completed in accordance with procedure GP-028 AI Development. It ensures that the AI Development Report and AI Release Report are complete, all performance criteria have been met, and the algorithm package is ready for integration into the target software environment.

Scope

This V&V check covers all AI/ML algorithms integrated into Legit.Health Plus version 1.1.0.0, including:

  • Clinical Models (54 models): ICD Category Distribution (1 model), Visual Sign Intensity Quantification (10 models), Wound Characteristic Assessment (24 models), Lesion Quantification (5 models), Surface Quantification (12 models), Pattern Identification (2 models).
  • Non-Clinical Models (5 models): DIQA, Domain Validation, Skin Surface Segmentation, Body Surface Segmentation, Head Detection.
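The per-category counts above can be cross-checked against the stated totals (54 clinical, 5 non-clinical, 59 overall). A minimal tally, using only the figures from this section:

```python
# Cross-check of the model inventory declared in the Scope section.
clinical = {
    "ICD Category Distribution": 1,
    "Visual Sign Intensity Quantification": 10,
    "Wound Characteristic Assessment": 24,
    "Lesion Quantification": 5,
    "Surface Quantification": 12,
    "Pattern Identification": 2,
}
non_clinical = {
    "DIQA": 1,
    "Domain Validation": 1,
    "Skin Surface Segmentation": 1,
    "Body Surface Segmentation": 1,
    "Head Detection": 1,
}

assert sum(clinical.values()) == 54      # stated clinical total
assert sum(non_clinical.values()) == 5   # stated non-clinical total
assert sum(clinical.values()) + sum(non_clinical.values()) == 59  # release total
```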

Instructions

The verifier must assess each item in the checklist. For each item, select "Yes," "No," or "N/A" and provide comments where necessary, especially for any "No" answers. All "No" items must be resolved before the V&V Phase can be considered complete and the algorithm package can be released.

Checklist

R-TF-028-005 AI Development Report

| Check Item | Yes/No/N/A | Comments |
| --- | --- | --- |
| 1. Is the data management process fully documented and traceable? | Yes | Data Management section provides comprehensive documentation of dataset provenance (280,342 images from 850 ICD-11 categories), composition by Fitzpatrick skin type, age, sex, and data quality verification processes. |
| 2. Is the training methodology comprehensively described for each model? | Yes | Each model section includes detailed architecture selection rationale, hyperparameter choices, data augmentation strategies, loss functions, optimizers, and training duration with justification. |
| 3. Are performance results presented with appropriate statistical measures (e.g., confidence intervals)? | Yes | All performance metrics include 95% confidence intervals calculated via bootstrap resampling (1000-2000 iterations). Results tables include sample sizes for transparency. |
| 4. Do all models meet their predefined performance criteria as specified in R-TF-028-001? | Yes | All models demonstrate PASS outcomes against success criteria: ICD Distribution (Top-1 ≥50%, Top-3 ≥60%, Top-5 ≥70%), Binary Indicators (AUC ≥0.80), Visual Signs (RMAE thresholds per sign). |
| 5. Has bias analysis been conducted across relevant subpopulations? | Yes | Comprehensive subgroup analysis performed for Fitzpatrick skin types (I-II, III-IV, V-VI), age groups (Pediatric, Adult, Geriatric), sex (Male, Female), and image type (Clinical, Dermoscopic). |
| 6. Are bias analysis results acceptable for all subgroups? | Yes | All subgroups meet performance criteria. Minor performance variation noted for FST V-VI with ICD classification, documented with mitigation strategies for ongoing monitoring. |
| 7. Has the test set been properly sequestered and used only for final evaluation? | Yes | Test set (12.74% of data, 35,726 images) was sequestered with patient-level separation from training/validation sets. Documentation confirms test set used only once for final evaluation. |
| 8. Is there evidence of robustness testing under various conditions? | Yes | Robustness checks performed including rotations, brightness/contrast adjustments, zoom, and image quality variations. Domain-specific artifact simulation during training (rulers, markers, dermoscopy shadows). |
| 9. Are explainability/interpretability methods documented where applicable? | Yes | Development Plan specifies Grad-CAM and SHAP techniques for model interpretability. Training includes bounding-box guided augmentation to ensure clinically relevant feature learning. |
| 10. Is the model calibration process documented? | Yes | Temperature scaling for probability calibration documented per "On Calibration of Modern Neural Networks" (Guo et al., 2017). Calibration parameter included in configuration files. |
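Item 3 above states that all reported metrics carry 95% confidence intervals from bootstrap resampling with 1000-2000 iterations. A minimal sketch of a percentile bootstrap, for illustration only (the function name and the accuracy metric are examples, not the report's actual implementation):

```python
import numpy as np

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for an arbitrary metric, resampling cases with replacement."""
    rng = np.random.default_rng(seed)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample indices with replacement
        stats.append(metric(y_true[idx], y_pred[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return metric(y_true, y_pred), (lo, hi)

# Example: top-1 accuracy with a 95% CI on a toy sample
accuracy = lambda t, p: float(np.mean(t == p))
point, (lo, hi) = bootstrap_ci([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 0, 1], accuracy)
```

In practice the resampling unit should match the sequestration unit (patient-level, per item 7), so correlated images from one patient stay together in each resample.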

R-TF-028-006 AI Release Report

| Check Item | Yes/No/N/A | Comments |
| --- | --- | --- |
| 1. Are all models included in the release package documented? | Yes | Complete inventory of 59 models documented with file names, version numbers, and clinical/non-clinical classification. |
| 2. Are input/output specifications clearly defined for each model? | Yes | Detailed input specifications (image format, resolution, color space) and output specifications (data types, value ranges, JSON structures) provided for each model category. |
| 3. Are preprocessing requirements fully specified? | Yes | Preprocessing steps documented: resize dimensions, normalization parameters (ImageNet mean/std), data type conversion, tensor format (CHW ordering). |
| 4. Are post-processing requirements fully specified? | Yes | Post-processing documented including TTA procedures (augmentation list), temperature scaling parameters, probability calibration, and output formatting. |
| 5. Are configuration files complete and correct? | Yes | Configuration file structures provided with model paths, versions, target shapes, normalization parameters, TTA settings, and class label references. |
| 6. Are integration guidelines sufficient for the software development team? | Yes | Release report provides complete integration specifications with code-level details, including binary indicator calculation formula and example JSON outputs. |
| 7. Is the model file format appropriate and documented (e.g., PyTorch version)? | Yes | PyTorch native format (.pt/.pth/.ckpt) specified with PyTorch version >=1.12. Format choice justified for optimized inference and compatibility with research infrastructure. |
| 8. Is the Integration Verification Package documented? | Yes | Integration Verification Package specification includes reference test images, expected outputs file, verification manifest, and acceptance criteria per GP-028 requirements. |
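Item 3 above lists the documented preprocessing chain: resize, data type conversion, ImageNet mean/std normalization, and CHW tensor ordering. A self-contained sketch of that chain, assuming an example target size of 224×224 (each model's real parameters live in its configuration file per item 5):

```python
import numpy as np

# Illustrative preprocessing matching the steps listed in the Release Report:
# resize -> float conversion -> ImageNet normalization -> CHW ordering.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image_hwc_uint8, target_size=(224, 224)):
    """Convert an HxWx3 uint8 RGB image into a normalized CHW float32 tensor."""
    h, w = target_size
    img = image_hwc_uint8.astype(np.float32) / 255.0    # scale to [0, 1]
    # Nearest-neighbour resize via index sampling; a production pipeline
    # would use proper interpolation (e.g. PIL or torchvision resize).
    ys = (np.arange(h) * img.shape[0] / h).astype(int)
    xs = (np.arange(w) * img.shape[1] / w).astype(int)
    img = img[ys][:, xs]
    img = (img - IMAGENET_MEAN) / IMAGENET_STD          # channel-wise normalize
    return img.transpose(2, 0, 1)                       # HWC -> CHW

tensor = preprocess(np.zeros((480, 640, 3), dtype=np.uint8))
# tensor.shape == (3, 224, 224)
```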

Algorithm Package Verification

| Check Item | Yes/No/N/A | Comments |
| --- | --- | --- |
| 1. Are all model files present and accessible? | Yes | All 59 PyTorch model files verified present in designated repository location with correct naming convention (model_name_vX.Y.Z.pt). |
| 2. Are all configuration files present and valid? | Yes | JSON configuration files for all models verified present and syntactically valid. Binary indicator mapping matrix validated. |
| 3. Do model versions match the documentation? | Yes | Model version v1.1.0.0 consistent across AI Development Report, AI Release Report, and model file naming. |
| 4. Have the models been tested for basic functionality (inference runs without errors)? | Yes | Smoke testing performed: all models execute inference successfully on sample inputs with expected output formats. |
| 5. Is the package versioned and stored in the designated repository? | Yes | Package version v1.1.0.0 stored in version-controlled repository with semantic versioning and full traceability to development artifacts. |

Risk Assessment Integration

| Check Item | Yes/No/N/A | Comments |
| --- | --- | --- |
| 1. Has the AI Risk Matrix (R-TF-028-011) been updated based on development findings? | Yes | AI Risk Assessment updated with 29 identified risks. Residual risk distribution: 52% Acceptable (RPN ≤4), 48% Tolerable (RPN 5-9), 0% Unacceptable. |
| 2. Are all residual risks at acceptable or tolerable levels? | Yes | All risks reduced to Acceptable or Tolerable levels. Three Tolerable risks (AI-RISK-016, AI-RISK-022, AI-RISK-026) flagged for enhanced post-market monitoring. |
| 3. Have safety risks related to AI been communicated to the product team? | Yes | 72% of AI risks (21 of 29) linked to device-level safety risks in R-TF-013-002. Traceability established to Clinical Evaluation Plan (R-TF-015-001) and Risk Management Plan (R-TF-013-001). |
| 4. Has the benefit-risk analysis been documented? | Yes | Residual Risk Acceptability section documents benefit-risk justification: improved diagnostic accuracy, reduced inter-observer variability, enhanced clinical workflow, expanded patient access to specialist expertise. |
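The acceptability bands stated in item 1 (RPN ≤4 Acceptable, RPN 5-9 Tolerable, anything above Unacceptable) can be expressed as a small helper. This sketch assumes RPN is the product of the two axes of the 5×5 matrix referenced under Regulatory Compliance; the actual RPN formula is defined in R-TF-028-011:

```python
def rpn_band(severity, probability):
    """Map a 5x5 risk matrix cell to the acceptability bands stated in the
    checklist: RPN <= 4 Acceptable, RPN 5-9 Tolerable, above 9 Unacceptable.
    Assumes RPN = severity * probability (illustrative, per the 5x5 matrix)."""
    if not (1 <= severity <= 5 and 1 <= probability <= 5):
        raise ValueError("severity and probability must be integers 1-5")
    rpn = severity * probability
    if rpn <= 4:
        return "Acceptable"
    if rpn <= 9:
        return "Tolerable"
    return "Unacceptable"

# Example cells: low severity/probability stays Acceptable,
# mid-range lands in Tolerable, the worst corner is Unacceptable.
bands = [rpn_band(2, 2), rpn_band(3, 3), rpn_band(5, 5)]
```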

Traceability

| Check Item | Yes/No/N/A | Comments |
| --- | --- | --- |
| 1. Is there clear traceability from requirements (R-TF-028-001) to test results? | Yes | Each model section in Development Report references corresponding specification in AI Description. Performance tables explicitly cite success criteria from R-TF-028-001. |
| 2. Are all design decisions documented and justified? | Yes | Architecture selection, hyperparameter choices, and methodology decisions documented with rationale based on experimental comparisons and literature references. |
| 3. Are all data sources traceable and documented? | Yes | DatasetMappingTable component provides complete traceability of all data sources. Version-controlled dataset (LegitHealth-DX) with documented ICD-11 mapping and annotation provenance. |
| 4. Is experiment tracking documented for reproducibility? | Yes | Development Plan specifies MLflow/Weights & Biases for experiment tracking. Each trained model linked to code version, data version, and hyperparameters. |
| 5. Are annotator qualifications and training records referenced? | Yes | Data Annotation Instructions specify annotator qualifications (board-certified dermatologists with ≥5 years experience) and multi-annotator consensus methodology. |

Regulatory Compliance

| Check Item | Yes/No/N/A | Comments |
| --- | --- | --- |
| 1. Are Good Machine Learning Practice (GMLP) principles demonstrated? | Yes | GMLP principles addressed throughout: data representativeness, sequestered test sets, reference standard methodology, reproducibility, traceability, and transparent reporting. |
| 2. Is alignment with IEC 62304 (software lifecycle) demonstrated? | Yes | AI development integrated into software lifecycle per IEC 62304. Clinical models classified as Class B. V&V activities aligned with software verification requirements. |
| 3. Is alignment with ISO 14971 (risk management) demonstrated? | Yes | AI Risk Assessment follows ISO 14971 framework with 5×5 risk matrix, control measure priority hierarchy, and residual risk acceptability determination. |
| 4. Is alignment with EU AI Act requirements addressed? | Yes | AI Risk Assessment scope explicitly references EU AI Act Regulation 2024/1689. High-risk AI system requirements addressed through risk management and transparency documentation. |
| 5. Are clinical performance claims supported by validation evidence? | Yes | All performance claims in AI Description supported by validation results in Development Report with statistical confidence intervals and subgroup analysis. |

Conclusion

☑ V&V Phase Approved: All checks have been successfully passed. The algorithm package is cleared for release to the software development team.

☐ V&V Phase Not Approved: One or more checks have failed. The responsible team members must address the comments and resubmit the documentation for verification.

Overall Comments:

The Verification & Validation Phase for the Legit.Health Plus v1.1.0.0 AI/ML algorithm package has been completed successfully. Key findings:

  1. Performance Verification: All 59 AI models meet their predefined acceptance criteria as specified in R-TF-028-001 AI Description, with performance results documented with appropriate statistical rigor (95% confidence intervals).

  2. Bias and Fairness: Comprehensive subgroup analysis demonstrates acceptable performance across demographic categories (Fitzpatrick skin types, age groups, sex). Minor performance variation for darker skin tones (FST V-VI) has been documented with appropriate mitigation through enhanced post-market surveillance.

  3. Risk Management Integration: The AI Risk Assessment has been updated with all identified risks reduced to Acceptable or Tolerable levels. Critical risks requiring ongoing monitoring have been identified and linked to post-market surveillance activities.

  4. Documentation Completeness: All required documentation per GP-028 has been produced, including AI Description, Development Plan, Data Collection Instructions, Data Annotation Instructions, Development Report, Release Report, and Risk Assessment.

  5. Regulatory Alignment: The development process demonstrates compliance with MDR 2017/745, IEC 62304, ISO 14971, GMLP principles, and EU AI Act requirements for high-risk AI systems.

The algorithm package is ready for integration into the Legit.Health Plus software during GP-012 Phase 3 (Software Development).

Release Authorization

| Item | Value |
| --- | --- |
| Package Version | v1.1.0.0 |
| Repository Location | s3://legit-health-plus/algorithm-packages/v1.1.0.0/ |

Verification

Signature meaning

The signatures for the approval process of this document can be found in the verified commits at the repository for the QMS. As a reference, the team members expected to participate in the approval of this document, and their roles as defined in the Annex I Responsibility Matrix of GP-001, are:

  • Author: JD-009
  • Reviewer: JD-009
  • Approver: JD-005
All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, neither by himself nor by third parties, through whichever means, without obtaining the prior written permission of Legit.Health (AI Labs Group S.L.)