
R-TF-028-001 AI/ML Development Plan

Table of contents
  • Abbreviations
  • Introduction
    • Context
    • Objectives
  • Team
  • Project Management
    • Meetings
    • Management Tools
    • Project Planning
  • Environment
    • Development Tools
    • Development Software
    • Development Environment
  • AI/ML Development Plan
    • Development Cycle
    • Development Specifications
    • Development Steps
  • Data Management Plan
    • Good Practices
      • Data Collection & Curation
      • Data Quality & Integrity
      • Ground Truth Determination
      • Sequestration of Test Data
    • Working Plan
  • Training & Evaluation Plan
    • Good Practices
      • Reproducibility and Traceability
      • Model Design & Selection
      • Model Training & Tuning
      • Model Evaluation & Validation
    • Working Plan
  • Release Plan
    • Good Practices
    • Working Plan
    • Deliverables
      • Documentation
      • Algorithm Package
  • AI/ML Risk Management Plan
    • AI/ML Risk Management Process
    • AI/ML Risk Ranking System
      • Severity
      • Likelihood
      • AI/ML Risk Priority Number and Acceptability
    • Safety Risks Related to AI/ML

Abbreviations

| Term | Definition |
| --- | --- |
| AI/ML | Artificial Intelligence / Machine Learning |
| AUC | Area Under the Receiver Operating Characteristic Curve |
| GDPR | General Data Protection Regulation |
| GMLP | Good Machine Learning Practice |
| ICD | International Classification of Diseases |
| ONNX | Open Neural Network Exchange |
| QMS | Quality Management System |
| RPN | Risk Priority Number |
| ViT | Vision Transformer |
| XAI | Explainable Artificial Intelligence |

Introduction

Context

Legit.Health Plus provides advanced Clinical Decision Support (CDS) through AI/ML algorithms designed to assist qualified healthcare professionals in the assessment of dermatological conditions. The algorithms analyze clinical and dermoscopic images of skin lesions to generate objective, data-driven insights. It is critical to note that the device is intended to augment, not replace, the clinical judgment of a healthcare professional.

The core AI/ML functionality is delivered through two algorithm types:

  • An ICD Category Distribution Algorithm: A multiclass classification model that processes a lesion image and outputs a ranked probability distribution across relevant ICD-11 categories, presenting the top five differential diagnoses.
  • Binary Indicator Algorithms: Derived from the primary model's output, these algorithms provide three discrete indicators for case prioritization: Malignancy, Dermatological Condition, and Critical Complexity.
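To illustrate how binary indicators can be derived from the classifier's output, the sketch below aggregates a hypothetical ICD-11 probability distribution into the three indicator scores. The category names and the malignant/critical sets are invented for the example and do not reflect the device's actual mapping, which is defined in the Binary Indicator Mapping annotation instructions (R-TF-028-004).

```python
def derive_indicators(icd_probs: dict[str, float],
                      malignant: set[str],
                      critical: set[str]) -> dict[str, float]:
    """Aggregate per-category probabilities into binary indicator scores.
    All category names and sets here are hypothetical examples."""
    return {
        # Probability mass assigned to categories flagged as malignant
        "malignancy": sum(p for c, p in icd_probs.items() if c in malignant),
        # Probability that the image shows any dermatological condition
        "dermatological_condition": 1.0 - icd_probs.get("no_condition", 0.0),
        # Probability mass on categories flagged as critically complex
        "critical_complexity": sum(p for c, p in icd_probs.items() if c in critical),
    }

# Hypothetical output of the ICD Category Distribution algorithm
probs = {"melanoma": 0.55, "nevus": 0.30, "psoriasis": 0.10, "no_condition": 0.05}
scores = derive_indicators(probs, malignant={"melanoma"}, critical={"melanoma"})
```

Because the indicators are a deterministic function of the primary model's distribution, they inherit its calibration properties, which is one reason the plan treats model calibration as a first-class concern.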

Objectives

The primary objectives of this development plan are to:

  • Develop a robust ICD Category Distribution algorithm to assist clinicians in formulating a differential diagnosis, thereby enhancing diagnostic accuracy and efficiency, while meeting the performance endpoints specified in R-TF-028-001.
  • Develop three highly performant Binary Indicator algorithms to provide clear, actionable signals for clinical workflow prioritization, meeting the AUC thresholds defined in R-TF-028-001.
  • Ensure the entire development lifecycle adheres to the company's QMS, GMLP principles, and applicable regulations (MDR 2017/745, ISO 13485) to deliver safe and effective algorithms.

Team

| Role | Description and responsibilities | Person(s) |
| --- | --- | --- |
| Technical Manager | Overall management of team planning and resources. Ensuring alignment with QMS procedures. Application of this procedure. | Alfonso Medela |
| Design & Development Manager | Manages the design and development lifecycle, including verification and validation activities in accordance with GP-012. | Taig Mac Carthy |
| AI Team | Develops, validates, and maintains the AI/ML algorithms. Responsible for data management, training, evaluation, and release processes. | |

Project Management

Meetings

  • Sprint Meetings: The project follows an Agile framework with 2-week sprints. Bi-weekly meetings are held for sprint review, retrospective analysis, and planning.
  • Daily Stand-ups: The AI team conducts daily stand-up meetings to synchronize progress, address impediments, and align on daily priorities.
  • Technical Reviews: Bi-weekly or monthly meetings are held to present key R&D findings, review model architectures, and discuss experimental results with cross-functional stakeholders.

Management Tools

| Tool | Description |
| --- | --- |
| Jira | To manage the product backlog, plan sprints, and track all tasks, bugs, and user stories with full traceability. |
| GitHub | Central repository for technical documentation, design specifications, meeting minutes, and sprint reports. |

Project Planning

The Technical Manager is responsible for the overall project planning and monitoring, ensuring that development milestones align with the product roadmap and regulatory timelines.

Environment

Development Tools

| Tool | Description |
| --- | --- |
| Bitbucket / Git | For rigorous version control of all source code, models, and critical configuration files. Enforces peer review via pull requests. |
| Docker | To create containerized, reproducible environments, ensuring consistency between development, testing, and deployment. |
| MLflow / Weights & Biases | For systematic tracking of experiments, including parameters, metrics, code versions, and model artifacts, ensuring full reproducibility. |

Development Software

| Software | Description |
| --- | --- |
| Python >= 3.9 | Primary programming language. |
| TensorFlow >= 2.10 / PyTorch >= 1.12 | State-of-the-art deep learning frameworks. |
| CUDA / cuDNN | NVIDIA libraries for GPU acceleration. |
| NumPy, Pandas, Scikit-learn, OpenCV | Core libraries for data manipulation, image processing, and performance evaluation. |
| Flake8 / Black / MyPy / Pytest | A suite of tools to enforce code quality, style, type safety, and correctness through automated testing. |

Development Environment

AI/ML development is conducted on a secure, high-performance computing infrastructure.

| Environment | Description |
| --- | --- |
| Research Server (Ubuntu 22.04 LTS) | Primary environment for model training, evaluation, and experiment management. |
| Database | PostgreSQL instance for structured storage of annotations and metadata. |
| Data Storage | Secure, access-controlled cloud storage (e.g., AWS S3, Google Cloud Storage) for medical images. |

Research Server Minimum Requirements:

  • OS: Ubuntu 22.04 LTS or later
  • GPU: NVIDIA A100 or H100 (or equivalent) with >= 40 GB VRAM
  • CPU: >= 32 cores @ >= 2.5 GHz
  • RAM: >= 128 GB
  • Storage: >= 5 TB of high-speed NVMe SSD storage

AI/ML Development Plan

Development Cycle

The AI/ML development adheres to the three-phase cycle mandated by procedure GP-028 AI Development, ensuring a structured progression from design to release.

Development Specifications

All development is strictly governed by the specifications in R-TF-028-001 AI/ML Description. This document serves as the primary input for design and defines the acceptance criteria for V&V.

Development Steps

  1. Data Management: Sourcing, curating, annotating, and partitioning data according to GMLP.
  2. Training & Evaluation: Building, training, tuning, and rigorously evaluating models.
  3. Release (V&V): Finalizing, documenting, and packaging the model for software integration.

Data Management Plan

Good Practices

Data Collection & Curation

  • Representativeness: In line with GMLP principles, data is collected to be highly representative of the intended patient population. Active measures are taken to ensure diversity across age, sex, and all six Fitzpatrick skin phototypes to promote equitable performance.
  • Protocols: Data acquisition follows the detailed clinical and technical requirements in R-TF-028-003, ensuring consistency in image quality.
  • Compliance: All data processing is fully compliant with GDPR. Data is de-identified at the source, and robust data protection impact assessments are conducted.

Data Quality & Integrity

  • Annotation: Data is labeled by qualified dermatologists following R-TF-028-004. Critical labels are subject to a multi-annotator review process to ensure high quality and consistency.
  • Traceability: Data is managed using version-controlled snapshots. Each snapshot is an immutable, timestamped collection of data and labels, ensuring a complete audit trail from data to the final model.
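The immutable-snapshot idea can be sketched as follows: hashing a canonical serialization of the snapshot manifest yields a digest that changes if any file, label, or split definition changes, giving a compact fingerprint for the audit trail. The manifest fields and checksum placeholders below are hypothetical; the plan's actual data versioning is handled by DVC.

```python
import hashlib
import json

def snapshot_digest(manifest: dict) -> str:
    """Content hash of a frozen data snapshot: any change to the file list,
    labels, or split definitions produces a different digest."""
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

# Hypothetical manifest structure for illustration only
manifest = {
    "snapshot_date": "2024-06-01",
    "images": {"img_001.png": "sha256:a3f1", "img_002.png": "sha256:9c1b"},
    "labels_version": "v4",
    "splits": {"train": ["img_001.png"], "test": ["img_002.png"]},
}
digest = snapshot_digest(manifest)
```

Recording this digest alongside each trained model ties the model to the exact data state it was built from.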

Ground Truth Determination

  • Methodology: The ground truth for diagnoses is established by a panel of at least three board-certified dermatologists. Discrepancies are resolved by a senior reviewer or through histopathological correlation where available and clinically appropriate. This robust process minimizes label noise and ensures a high-fidelity reference standard.

Sequestration of Test Data

  • Partitioning: The dataset is partitioned at the patient level into training, validation, and test sets. This strict separation is critical to prevent data leakage and ensure that the final performance evaluation is unbiased.
  • Shielding: The test set is a sequestered, held-out dataset used only once for the final, unbiased evaluation of the selected model. It is never used for training, tuning, or model selection.

Working Plan

  1. Data is collected, de-identified, and securely stored.
  2. Data is annotated according to the defined multi-stage review process.
  3. A versioned data snapshot is created and frozen.
  4. The snapshot is split by patient ID into training, validation, and test sets. The test set is immediately sequestered.
  5. The snapshot version and split definitions are logged for full reproducibility.
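A minimal sketch of the patient-level split in step 4, assuming a patient identifier is available for every image. The hash-based assignment below is illustrative (any grouped splitter gives the same guarantee); its key property is that all images from one patient land in the same partition, which is what prevents leakage.

```python
import hashlib

def assign_split(patient_id: str, test_frac: float = 0.15,
                 val_frac: float = 0.15) -> str:
    """Deterministically assign a patient to train/val/test by hashing the
    patient ID, so every image from that patient shares one partition."""
    h = int(hashlib.sha256(patient_id.encode("utf-8")).hexdigest(), 16) % 10_000
    frac = h / 10_000
    if frac < test_frac:
        return "test"
    if frac < test_frac + val_frac:
        return "val"
    return "train"

# Hypothetical image-to-patient pairs
images = [("img_1.png", "patient_A"), ("img_2.png", "patient_A"),
          ("img_3.png", "patient_B")]
splits = {img: assign_split(pid) for img, pid in images}
```

Hashing makes the assignment reproducible without storing a random seed: re-running the split on the same snapshot always yields identical partitions.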

Training & Evaluation Plan

Good Practices

Reproducibility and Traceability

  • Versioning: Every component is versioned: Git for code, DVC for data, and MLflow for experiments. Each trained model is linked to the exact code, data, and hyperparameters used to create it.

Model Design & Selection

  • Architecture: Model selection is informed by a systematic review of state-of-the-art architectures (e.g., ViT, ConvNeXt, EfficientNetV2).
  • Hyperparameter Optimization: A structured approach (e.g., Bayesian optimization or grid search) is used to find the optimal set of hyperparameters.

Model Training & Tuning

  • Augmentation: A rich set of data augmentation techniques is used to improve generalization, including geometric transformations (rotation, scaling, flipping) and photometric distortions (brightness, contrast, color jitter) that reflect real-world variability.
  • Overfitting Mitigation: In addition to augmentation, techniques like dropout, weight decay, and early stopping are employed to ensure models generalize well to unseen data.
  • Model Calibration: Post-training calibration techniques (e.g., temperature scaling) are applied to ensure that the model's output probabilities are reliable and well-calibrated, meaning a predicted 80% confidence accurately reflects an 80% likelihood of correctness.
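The mechanics of temperature scaling can be shown in a few lines: the logits are divided by a scalar temperature T before the softmax, which softens over-confident predictions when T > 1. The logit values and the temperature below are made up for illustration; in practice T is fitted on the validation set (e.g., by minimizing negative log-likelihood) and then frozen.

```python
import math

def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Temperature-scaled softmax; T > 1 softens over-confident outputs.
    T = 1 recovers the ordinary softmax."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 1.0, 0.5]                 # hypothetical model outputs
p_raw = softmax(logits)                  # uncalibrated probabilities
p_cal = softmax(logits, temperature=2.0) # softened by a fitted T
```

Note that temperature scaling leaves the argmax (and therefore the ranking of the differential diagnoses) unchanged; only the confidence values move.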

Model Evaluation & Validation

  • Robustness Analysis: Performance is evaluated not just on aggregate metrics but also across key patient subgroups (e.g., by skin phototype, age, sex) to proactively identify and mitigate potential biases.
  • Explainability (XAI): During development, XAI techniques (e.g., Grad-CAM, SHAP) are used to visualize and understand the model's decision-making process. This helps verify that the model is learning clinically relevant features and not relying on confounding artifacts.
  • Statistical Rigor: All key performance metrics are reported with 95% confidence intervals to accurately represent statistical uncertainty.
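One common way to obtain such intervals is the percentile bootstrap, sketched below for AUC with a pure-Python rank-based estimator and toy data. This is an illustrative implementation, not the plan's prescribed statistical tooling.

```python
import random

def auc(y_true: list[int], y_score: list[float]) -> float:
    """Empirical AUC: probability that a random positive outranks a
    random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap (1 - alpha) confidence interval for AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        ys = [y_score[i] for i in idx]
        if 0 < sum(yt) < n:              # resample must contain both classes
            stats.append(auc(yt, ys))
    stats.sort()
    return stats[int(len(stats) * alpha / 2)], stats[int(len(stats) * (1 - alpha / 2))]

y_true = [1, 1, 1, 0, 0, 0]              # toy labels
y_score = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1] # toy scores
lo, hi = bootstrap_ci(y_true, y_score)
```

The same resampling loop extends directly to subgroup analyses: bootstrapping within each skin-phototype stratum yields per-subgroup intervals for the robustness analysis.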

Working Plan

  1. A model configuration file specifies all parameters for a training run.
  2. The model is trained, with all metrics and artifacts logged in real-time to MLflow.
  3. A uniquely identified model package is generated, containing the model, its configuration, and training history.
  4. A final, comprehensive evaluation is performed on the held-out test set, with results and explainability analyses compiled into the final performance report.

Release Plan

Good Practices

  • Equivalence Testing: Models are converted to a high-performance format (e.g., ONNX). Rigorous tests are run to verify near-identical numerical output between the original and converted models.
  • Comprehensive Reporting: The AI/ML Development Report (R-TF-028-005) provides a complete account of the development and V&V process, serving as objective evidence that the model is safe and effective.
  • Clear Instructions: The AI/ML Release (R-TF-028-006) document provides the software team with precise integration specifications.
  • Semantic Versioning: The algorithm release package is assigned a unique semantic version (e.g., v1.0.0), with full traceability to the versions of its constituent models.
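The equivalence-testing criterion can be expressed as a simple tolerance check. In practice the two output vectors would come from running the original framework model and an ONNX runtime session on the same fixed verification images; the plain lists and the tolerance value below are stand-ins for illustration.

```python
def max_abs_diff(a: list[float], b: list[float]) -> float:
    """Largest element-wise absolute difference between two output vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

def outputs_equivalent(original: list[float], converted: list[float],
                       atol: float = 1e-5) -> bool:
    """Equivalence criterion: the converted model's outputs must match the
    original within an absolute tolerance (atol is an assumed threshold)."""
    return max_abs_diff(original, converted) <= atol

# Stand-ins for per-class probabilities from the two model formats
orig_out = [0.61, 0.25, 0.09, 0.03, 0.02]
onnx_out = [0.61000002, 0.24999998, 0.09, 0.03, 0.02]
ok = outputs_equivalent(orig_out, onnx_out)
```

Logging the observed maximum difference for every verification image, rather than only the pass/fail result, gives the release report concrete objective evidence.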

Working Plan

  1. Verification is performed to confirm the model was developed according to this plan.
  2. Validation is performed to confirm the model meets the acceptance criteria in R-TF-028-001.
  3. The V&V results are documented in the AI/ML Development Report (R-TF-028-005).
  4. The final algorithm package and AI/ML Release (R-TF-028-006) are delivered to the software team.

Deliverables

Documentation

  • All R-TF-028-xxx documents generated, including Description, Development Plan, Reports, and completed V&V checklists.

Algorithm Package

  • 1 ICD Category Distribution algorithm (as an .onnx file).
  • 1 Binary Indicators configuration (as a .json mapping file).

AI/ML Risk Management Plan

This plan focuses on risks inherent to the AI/ML development lifecycle, as recorded in R-TF-028-011 AI/ML Risk Matrix. This process is a key input into the overall device risk management activities governed by ISO 14971.

AI/ML Risk Management Process

  • Risk Assessment: Systematically identifying, analyzing, and evaluating risks related to data, model training, and performance.
  • Risk Control: Implementing and verifying mitigation measures for all unacceptable risks.
  • Monitoring & Review: Continuously reviewing risks throughout the lifecycle.

AI/ML Risk Ranking System

RPN = Severity × Likelihood

Severity

Severity is based on the potential impact on model performance and its clinical utility.

| Ranking | Definition | Severity |
| --- | --- | --- |
| 5 | Degrades model performance to the point of being fundamentally flawed or unsafe (e.g., systematically misclassifies critical conditions). | Catastrophic |
| 4 | Significantly degrades model performance, making it frequently unreliable or erroneous for its intended task. | Critical |
| 3 | Moderately degrades model performance, making it often erroneous under specific, plausible conditions. | Moderate |
| 2 | Slightly degrades model performance, making it sometimes erroneous or showing minor performance loss. | Minor |
| 1 | Negligibly degrades model performance, with no discernible impact on clinical utility. | Negligible |

Likelihood

Likelihood reflects the probability of the risk occurring during development.

| Ranking | Definition | Likelihood |
| --- | --- | --- |
| 5 | Almost certain to occur if not controlled. | Very high |
| 4 | Likely to occur. | High |
| 3 | May occur. | Moderate |
| 2 | Unlikely to occur. | Low |
| 1 | Extremely unlikely to occur. | Very low |

AI/ML Risk Priority Number and Acceptability

| Severity → / Likelihood ↓ | Negligible (1) | Minor (2) | Moderate (3) | Critical (4) | Catastrophic (5) |
| --- | --- | --- | --- | --- | --- |
| Very high (5) | Tolerable (5) | Tolerable (10) | Unacceptable (15) | Unacceptable (20) | Unacceptable (25) |
| High (4) | Acceptable (4) | Tolerable (8) | Tolerable (12) | Unacceptable (16) | Unacceptable (20) |
| Moderate (3) | Acceptable (3) | Tolerable (6) | Tolerable (9) | Tolerable (12) | Unacceptable (15) |
| Low (2) | Acceptable (2) | Acceptable (4) | Tolerable (6) | Tolerable (8) | Tolerable (10) |
| Very low (1) | Acceptable (1) | Acceptable (2) | Acceptable (3) | Acceptable (4) | Tolerable (5) |

  • Acceptable: RPN ≤ 4
  • Tolerable: 5 ≤ RPN ≤ 12 (requires risk-benefit analysis)
  • Unacceptable: RPN ≥ 15 (requires mitigation)
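The acceptability bands above can be stated directly in code, which is a convenient cross-check that the thresholds reproduce every cell of the matrix (note that RPN values 13 and 14 cannot arise from a 5 × 5 grid, so the gap between the Tolerable and Unacceptable bands is never hit).

```python
def rpn_acceptability(severity: int, likelihood: int) -> tuple[int, str]:
    """Compute RPN = Severity x Likelihood and map it to the plan's bands:
    Acceptable (RPN <= 4), Tolerable (5-12), Unacceptable (>= 15)."""
    rpn = severity * likelihood
    if rpn <= 4:
        return rpn, "Acceptable"
    if rpn <= 12:
        return rpn, "Tolerable"
    return rpn, "Unacceptable"
```

For example, a Moderate (3) severity risk with Very high (5) likelihood yields RPN 15, Unacceptable, matching the corresponding matrix cell.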

Safety Risks Related to AI/ML

The AI team is responsible for identifying how AI/ML development risks can contribute to hazardous situations. These "safety risks related to AI/ML" are escalated to the product team for inclusion in the overall Safety Risk Matrix and are mitigated through a combination of technical controls and user-facing measures, in line with ISO 14971.

Signature meaning

The signatures for the approval process of this document can be found in the verified commits at the repository for the QMS. As a reference, the team members expected to participate in the approval of this document, and their roles in that process as defined in Annex I Responsibility Matrix of GP-001, are:

  • Author: Team members involved
  • Reviewer: JD-003, JD-004
  • Approver: JD-001
All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, neither by himself nor by third parties, through whichever means, without obtaining the prior written permission of Legit.Health (AI LABS GROUP S.L.)