
REQ_003 The user receives quantifiable data on the extent of clinical signs

Category

Major

Source

  • Dr. Gastón Roustan, dermatologist at Hospital Puerta de Hierro
  • Dr. Ramon Grimalt, dermatologist at Grimalt Dermatología
  • Dr. Sergio Vaño, dermatologist at Hospital Ramón y Cajal

USER · SYSTEM · INPUTS AND OUTPUTS · DATABASE AND DATA DEFINITION · ARCHITECTURE

Activities generated

  • MDS-100
  • MDS-99
  • MDS-173
  • MDS-393

Causes failure modes

  • The AI models might misinterpret or miscalculate the extent of the clinical signs in the images, leading to an incorrect assessment of their extent.
  • Poor quality or improperly taken images might result in incorrect analysis and quantification of the extent of clinical signs.
  • Delays or timeouts in processing and delivering the extent of data could affect timely access to accurate information.
  • Incorrect units of measurement for the extent of clinical signs might be used or displayed.
  • The AI might incorrectly identify the boundaries of the clinical signs, leading to an inaccurate assessment of their extent.
  • Variations in lighting, angle, or distance in the images might affect the AI's ability to accurately determine the surface area of clinical signs.

Related risks

    1. Misrepresentation of magnitude returned by the device
    2. Misinterpretation of data returned by the device
    3. Incorrect clinical information: the care provider receives into their system data that is erroneous
    4. Incorrect results shown to the patient
    5. Sensitivity to image variability: analysis of the same lesion with images taken with deviations in lighting or orientation generates significantly different results
    6. Inaccurate training data: image datasets used in the development of the device are not properly labeled
    7. Biased or incomplete training data: image datasets used in the development of the device are not properly selected
    8. Lack of efficacy or clinical utility
    9. Stagnation of model performance: the AI/ML models of the device do not benefit from the potential improvement in performance that comes from re-training
    10. Degradation of model performance: automatic re-training of models decreases the performance of the device

User Requirement, Software Requirement Specification and Design Requirement

  • User Requirement 3.1: Users shall receive data quantifying the spatial extent of identified clinical signs.
  • Software Requirement Specification 3.2: Algorithms shall analyze and quantify the spatial distribution and size of clinical signs.
  • Design Requirement 3.3: Data on the extent of clinical signs shall be returned in an FHIR-compliant format, ensuring consistent communication and interoperability.

Description

Quantifying lesions and assessing their intensity plays a pivotal role in evaluating the severity of skin conditions. However, these metrics may not always suffice. Some visual signs manifest as plaques whose extent across the body varies and directly impacts overall severity. For instance, a dry patch covering 10% of the body surface area (BSA) may expand to affect 20% BSA, encompassing a larger portion of the skin.

Accurately quantifying the surface area of these visual signs is a complex task, even for seasoned dermatologists. Yet it is indispensable for assessing the severity of numerous skin pathologies, including atopic dermatitis, psoriasis, and alopecia. When relying solely on the human eye, determining the surface area relative to the entire body is subjective.
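As a minimal illustration of the kind of output this requirement calls for, the affected fraction of visible skin can be computed from segmentation masks. The function below is a hypothetical sketch (names and inputs are assumptions for illustration), not the device's actual implementation:

```python
import numpy as np

def affected_fraction(lesion_mask: np.ndarray, body_mask: np.ndarray) -> float:
    """Fraction of visible body surface covered by the lesion.

    Both inputs are boolean 2D masks of the same shape: `lesion_mask`
    marks lesion pixels, `body_mask` marks all skin pixels in the image.
    """
    body_pixels = np.count_nonzero(body_mask)
    if body_pixels == 0:
        return 0.0
    # Only count lesion pixels that fall on skin.
    return np.count_nonzero(lesion_mask & body_mask) / body_pixels

# Toy example: a 10x10 skin region with a 2x5 dry patch covers 10% of it.
body = np.ones((10, 10), dtype=bool)
lesion = np.zeros((10, 10), dtype=bool)
lesion[:2, :5] = True
print(affected_fraction(lesion, body))  # 0.1
```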

To tackle this challenge, there is an urgent need to develop a suite of algorithms tailored to automate the quantification of the surface area of various visual signs, such as:

  • Erythema
  • Induration
  • Desquamation
  • Edema
  • Oozing
  • Excoriation
  • Lichenification
  • Dryness
  • Hair density
  • Wounds
  • Maceration
  • Necrotic tissue
  • Bone and/or adjacent tissue
  • Slough or biofilm
  • External material
  • Granulated tissue

The automation of surface area quantification follows a systematic approach, typically divided into two primary phases, mirroring the standard workflow of data science projects: data annotation and algorithm development.

Data annotation

The initial phase, data annotation, stands as a critical pillar in comprehending inter-observer variability and lays the groundwork for algorithm training. In this stage, medical professionals meticulously evaluate individual images and delineate the surface area with the aid of a polygon tool. All professionals receive training before annotation to ensure they understand the task for which the data needs to be labeled.

A pivotal consideration in this phase involves the selection of the right medical experts and the determination of an optimal team size. Typically, we engage a minimum of three seasoned physicians for this task. To ensure the precision required for diverse pathologies, we have assembled a trio of specialists for atopic dermatitis and psoriasis, another four for wounds, and an additional trio for alopecia, as each pathology necessitates distinct expertise.

By pooling the assessments of these experts, we establish a ground truth dataset. This dataset serves a dual purpose: it becomes the foundation for training our algorithms and also allows us to gauge inter-observer variability, a critical measure of the performance and consistency of our measurements.
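The pooling of expert assessments described above can be sketched as a majority vote over the annotators' rasterized polygon masks, with pairwise IoU as a simple inter-observer agreement measure. The function names and the majority-vote rule are illustrative assumptions, not the device's documented aggregation method:

```python
import numpy as np

def consensus_mask(annotations: list[np.ndarray]) -> np.ndarray:
    """Majority-vote ground truth from several annotators' binary masks."""
    stack = np.stack([a.astype(bool) for a in annotations])
    # A pixel enters the ground truth if more than half the experts marked it.
    return stack.sum(axis=0) > (len(annotations) / 2)

def pairwise_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Inter-observer agreement between two annotators, expressed as IoU."""
    inter = np.count_nonzero(a & b)
    union = np.count_nonzero(a | b)
    return inter / union if union else 1.0

# Three hypothetical annotators outlining roughly the same 4x4-image region.
a1 = np.zeros((4, 4), dtype=bool); a1[:2, :2] = True
a2 = np.zeros((4, 4), dtype=bool); a2[:2, :3] = True
a3 = np.zeros((4, 4), dtype=bool); a3[:3, :2] = True
gt = consensus_mask([a1, a2, a3])
print(np.count_nonzero(gt))  # 4 pixels kept by at least 2 of 3 experts
```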

Algorithm development

The next phase involves the development of the algorithms, which will rely on the ground truth data collected during the previous stage. The outcomes generated by the algorithms will then be juxtaposed with the measured variability. This step is particularly crucial since tasks of this complexity, prone to inherent variability, necessitate comparison with the prevailing baseline or state-of-the-art standards. This comparative analysis is essential for validating the algorithm's performance accurately.

It is worth emphasizing that the convolutional neural networks we train assimilate knowledge from the collective expertise of the specialists; a significant subjective element is therefore inherent in this process, given the nuanced nature of the task.

When possible, the dataset used to develop each algorithm is split into training, validation, and test sets. However, when the sample size is limited, the data is split into training and validation only to ensure each set contains enough data.
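The splitting logic above can be sketched as follows. The 15%/15% proportions and the `min_per_set` cutoff are illustrative assumptions, not the actual values used during development:

```python
import numpy as np

def split_dataset(n_samples: int, min_per_set: int = 50, seed: int = 0) -> dict:
    """Shuffle sample indices and split into train/val/test, falling back
    to a train/val split when the held-out sets would be too small.

    `min_per_set` is an illustrative threshold, not the device's actual rule.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(0.15 * n_samples)
    n_val = int(0.15 * n_samples)
    if n_test < min_per_set:
        # Limited sample size: two-way split so each set stays large enough.
        n_val = int(0.2 * n_samples)
        return {"train": idx[n_val:], "val": idx[:n_val]}
    return {
        "train": idx[n_test + n_val:],
        "val": idx[n_test:n_test + n_val],
        "test": idx[:n_test],
    }

print(sorted(split_dataset(1000)))  # ['test', 'train', 'val']
print(sorted(split_dataset(100)))   # ['train', 'val'] (fallback)
```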

Success metrics

The efficacy of automatic quantification of clinical signs, which relies solely on visual data from 2D images, is intrinsically linked to the capability of accurately capturing these signs on camera. By carefully reviewing the evaluation methodologies in state-of-the-art medical solutions [1, 2], and taking into account that some clinical signs are more visually prominent than others, we tailored the choice of metrics and success thresholds for each task to align with its inherent level of difficulty.

We use the IoU and AUC metrics because they are widely accepted in diagnostic medicine as quantitative measures of model performance in medical imaging.

For quantifying hair density, wounds, maceration, necrosis, and bone extent, we have determined task-specific Intersection over Union (IoU) thresholds, listed in the table below, as the benchmark for matching expert performance. These thresholds reflect the complexity of each task and are based on research conducted with expert dermatologists and nurses, supported by scientific evidence from multiple experts. The IoU metric measures the overlap between the predicted and ground-truth regions, with scores ranging from 0 to 1.

For tasks that are comparatively less complex, we have adopted a more stringent criterion, setting an Area Under the Curve (AUC) threshold of 0.8. The AUC value ranges from 0 to 1, where an AUC of 1 represents a perfect model that can perfectly distinguish between the classes, and an AUC of 0.5 indicates a model with no predictive power (equivalent to random classification). Typically, AUC values above 0.8 are considered good, while values below 0.7 are generally unsatisfactory [3]. This threshold ensures a higher level of accuracy and reliability in our model's predictions.
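Both metrics can be computed directly from model outputs. The sketch below is a plain NumPy illustration (the AUC uses the rank-sum, i.e. Mann-Whitney U, formulation and does not handle tied scores), not the evaluation code actually used for the device:

```python
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over Union between predicted and ground-truth binary masks."""
    inter = np.count_nonzero(pred & truth)
    union = np.count_nonzero(pred | truth)
    return inter / union if union else 1.0

def auc(scores: np.ndarray, labels: np.ndarray) -> float:
    """Area under the ROC curve via the rank-sum formulation (no tie handling)."""
    order = np.argsort(scores)
    ranks = np.empty_like(order, dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)  # rank 1 = lowest score
    pos = labels.astype(bool)
    n_pos, n_neg = pos.sum(), (~pos).sum()
    # Sum of positive ranks, minus its minimum possible value, over n_pos * n_neg.
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

pred = np.array([[1, 1], [0, 0]], dtype=bool)
truth = np.array([[1, 0], [0, 0]], dtype=bool)
print(iou(pred, truth))  # 0.5
print(auc(np.array([0.1, 0.4, 0.35, 0.8]), np.array([0, 0, 1, 1])))  # 0.75
```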

| Goal | Metric |
| --- | --- |
| The automated quantification of erythema, induration and desquamation surface area achieves an expert consensus level of performance | Area Under the Curve (AUC) greater than 0.8 |
| The automated quantification of erythema, edema, oozing, excoriation, lichenification and dryness surface area achieves an expert consensus level of performance | Area Under the Curve (AUC) greater than 0.8 |
| The automated quantification of hair density achieves an expert consensus level of performance | Intersection over Union (IoU) greater than 0.50 |
| The automated quantification of wounds achieves an expert consensus level of performance | Intersection over Union (IoU) greater than 0.70 |
| The automated quantification of necrotic tissue achieves an expert consensus level of performance | Intersection over Union (IoU) greater than 0.50 |
| The automated quantification of bone and/or adjacent tissues achieves an expert consensus level of performance | Intersection over Union (IoU) greater than 0.30 |
| The automated quantification of slough or biofilm achieves an expert consensus level of performance | Intersection over Union (IoU) greater than 0.30 |
| The automated quantification of external material achieves an expert consensus level of performance | Intersection over Union (IoU) greater than 0.15 |
| The automated quantification of maceration achieves an expert consensus level of performance | Intersection over Union (IoU) greater than 0.15 |
| The automated quantification of granulated tissue achieves an expert consensus level of performance | Intersection over Union (IoU) greater than 0.15 |
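For automated verification during test runs, the IoU rows of the table can be encoded as a lookup. The `meets_success_criterion` helper is hypothetical; only the thresholds themselves are copied from the table above:

```python
# IoU success thresholds per clinical sign, copied from the success-metrics table.
IOU_THRESHOLDS = {
    "hair density": 0.50,
    "wounds": 0.70,
    "necrotic tissue": 0.50,
    "bone and/or adjacent tissues": 0.30,
    "slough or biofilm": 0.30,
    "external material": 0.15,
    "maceration": 0.15,
    "granulated tissue": 0.15,
}

def meets_success_criterion(sign: str, measured_iou: float) -> bool:
    """True when the measured IoU strictly exceeds the sign's threshold."""
    return measured_iou > IOU_THRESHOLDS[sign]

print(meets_success_criterion("wounds", 0.72))  # True
print(meets_success_criterion("wounds", 0.70))  # False (threshold is strict)
```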

[1] Hasan, M. K., Ahamad, M. A., Yap, C. H., & Yang, G. (2023). A survey, review, and future trends of skin lesion segmentation and classification. Computers in Biology and Medicine, 106624.

[2] Mirikharaji, Z., Abhishek, K., Bissoto, A., Barata, C., Avila, S., Valle, E., ... & Hamarneh, G. (2023). A survey on deep learning for skin lesion segmentation. Medical Image Analysis, 102863.

[3] Müller, D., Soto-Rey, I. & Kramer, F. Towards a guideline for evaluation metrics in medical image segmentation. BMC Res Notes 15, 210 (2022). https://doi.org/10.1186/s13104-022-06096-y

[4] White, N., Parsons, R., Collins, G. et al. Evidence of questionable research practices in clinical prediction models. BMC Med 21, 339 (2023). https://doi.org/10.1186/s12916-023-03048-6

Model bias

This topic is addressed in REQ_001 and REQ_002.

Model robustness

This topic is addressed in REQ_001 and REQ_002.

Previous related requirements

  • REQ_001
  • REQ_002

Signature meaning

The signatures for the approval process of this document can be found in the verified commits at the repository for the QMS. As a reference, the team members who are expected to participate in this document and their roles in the approval process, as defined in Annex I Responsibility Matrix of the GP-001, are:

  • Author: JD-004, JD-005, JD-009, JD-017
  • Approver: JD-003
All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, neither by himself nor by third parties, through whichever means, without obtaining the prior written permission of Legit.Health (AI LABS GROUP S.L.)