R-015-005 Investigator's Brochure Legit.Health_acne

Table of contents

Introduction
Identification of the Investigator's Brochure (IB)
Sponsor/Manufacturer
Investigational device information
Preclinical testing
Existing clinical performance data
- Summary of relevant previous clinical experience with the investigational device: (Information on clinical data generated by the manufacturer)
- Analysis of adverse device effects and any history of modification or recall
Investigational device risk management

Introduction

This Investigator's Brochure (IB) provides essential information regarding a medical device to support investigators in understanding its characteristics, intended use, and clinical application.

The purpose of this document is to compile relevant preclinical and clinical data, safety information, and regulatory considerations related to the device.

The device is a computational software-only medical device leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures. Its principal function is to provide a wide range of clinical data from the analysed images to assist healthcare practitioners in their clinical evaluations and allow healthcare provider organisations to gather data and improve their workflows.

Identification of the Investigator's Brochure (IB)

Title of the clinical investigation

Pilot study for the clinical validation of a medical device for the automatic severity assessment and remote monitoring of patients with acne.

Investigational device

Legit.Health Plus (hereinafter, the device)

IB Reference Number

Legit.Health_acne Investigator's Brochure

Protocol code

Legit.Health_acne_V6

Confidentiality statement

This Study Investigator's Brochure is the property of the manufacturer and is a confidential document. It must not be copied or distributed to other parties without prior written authorisation from the manufacturer.

Principal Investigator

José Luis López Estebaranz
Clínica DermoMedic
C. de Jorge Juan, 36
28001 Madrid
Tel: 915769054

Sponsor/Manufacturer

	Manufacturer data
Legal manufacturer name	AI Labs Group S.L.
Address	Street Gran Vía 1, BAT Tower, 48001, Bilbao, Bizkaia (Spain)
SRN	ES-MF-000025345
Person responsible for regulatory compliance	Alfonso Medela, Saray Ugidos
E-mail	office@legit.health
Phone	+34 638127476
Trademark	Legit.Health

Investigational device information

Summary of the literature and evaluation supporting the rationale for the design and intended use of the investigational device

The existing literature on acne and severity assessment highlights the need for accurate diagnostic tools, given the complexity of its presentation and the variability in its severity and impact. Acne is a common skin disorder caused by the dysfunction of sebaceous glands, leading to clogged pores and various types of lesions in areas rich in sebaceous glands, such as the face and forehead. Affecting up to 80% of adolescents, acne has significant psychological repercussions, often contributing to low self-esteem, anxiety, and depression, which in turn can lower the quality of life to levels comparable to chronic illnesses.

Current tools for assessing acne severity, such as lesion counting and the Investigator Global Assessment (IGA), face challenges related to subjectivity and time demands. For instance, while lesion counting is precise, it is labor-intensive and impractical for routine clinical use. Other methods, like the Global Acne Grading System (GAGS), are detailed but complex, making it difficult to ensure consistency both between observers and within the same observer. There is no universally standardised and widely accepted scoring method, leading to inconsistencies in clinical evaluations and limited comparability between studies.

Recent advances in image processing and AI-based systems have introduced automated tools that address these limitations. Studies by Chantharaphaichi et al. and Maroni et al. have shown promising results in automated acne detection and assessment. In this context, the manufacturer has developed ALADIN, an AI-powered system integrated into the device. Retrospective validation of ALADIN has demonstrated its effectiveness. The goal of this study is to clinically validate ALADIN, potentially offering a standardised, time-efficient, and reliable solution for assessing acne severity. This advancement could improve treatment timelines and outcomes for patients while promoting consistency in clinical practice. Additionally, it paves the way for future research on treatment efficacy and deeper insights into the pathophysiology of acne.

References

Williams HC, Dellavalle RP, Garner S. Acne vulgaris. Lancet. 2012 Jan 28;379(9813):361-72. doi: 10.1016/S0140-6736(11)60321-8.
Kircik LH. Androgens and acne: perspectives on clascoterone, the first topical androgen receptor antagonist. Expert Opin Pharmacother. 2021 Sep;22(13):1801-1806. doi: 10.1080/14656566.2021.1918100.
Tan JK, Bhate K. A global perspective on the epidemiology of acne. Br J Dermatol. 2015 Jul;172 Suppl 1:3-12. doi: 10.1111/bjd.13462.
Samuels DV, Rosenthal R, Lin R, Chaudhari S, Natsuaki MN. Acne vulgaris and risk of depression and anxiety: A meta-analytic review. J Am Acad Dermatol. 2020 Aug;83(2):532-541. doi: 10.1016/j.jaad.2020.02.040.
Agnew T, Furber G, Leach M, Segal L. A Comprehensive Critique and Review of Published Measures of Acne Severity. J Clin Aesthet Dermatol. 2016 Jul;9(7):40-52.
Chantharaphaichi T, Uyyanonvara B, Sinthanayothin C, Nishihara A. Automatic acne detection for medical treatment. 2015 6th International Conference of Information and Communication Technology for Embedded Systems (IC-ICTES). 2015:1-6.

Statement concerning the regulatory classification of the investigational device

Currently, the device is undergoing the accreditation process to obtain the European CE marking. Once this is achieved, the attached declaration included in this document will be signed. For now, it serves as an example of a declaration of conformity with national regulations.

The device complies with the applicable standards:

UNE-EN ISO 13485:2018 (EN ISO 13485:2016) Medical devices - Quality management systems - Requirements for regulatory purposes
UNE-EN 62304:2007/A1:2016 (EN 62304:2006/A1:2015) Medical device software - Software life-cycle processes
UNE-EN ISO 14971:2020 (EN ISO 14971:2019) Medical devices - Application of risk management to medical devices
UNE-EN ISO 15223-1:2022 (EN ISO 15223-1:2021) Medical devices - Symbols to be used with information to be supplied by the manufacturer - Part 1: General requirements
UNE-EN ISO 20417:2021 (EN ISO 20417:2021) Medical devices - Information to be supplied by the manufacturer
UNE-EN 62366-1:2015/A1:2020 (EN 62366-1:2015/A1:2020) Medical devices - Part 1: Application of usability engineering to medical devices
UNE-EN ISO 14155:2021 (EN ISO 14155:2020) Clinical investigation of medical devices for human subjects - Good clinical practice

There are no common specifications applicable to the device.

The device complies with the provisions of Regulation (EU) 2017/745 of the European Parliament and of the Council on medical devices. This declaration is issued under the exclusive responsibility of AI Labs Group S.L.

Classification: Class IIb (Rule 11)

The conformity assessment route is based on a quality management system and on assessment of technical documentation according to Annex IX (Chapters I and III) of the above-mentioned regulation.

Certificate ID: MDR 792790
Notified body: BSI (British Standards Institution) number 2797.

All documentation supporting this CE Declaration of Conformity is preserved in the document management system of the manufacturer, supported by the Quality System approval to ISO 13485 by BSI.

General description of the investigational device

The device is a medical product that operates solely as computational software, utilising advanced computer vision algorithms to analyse images of the epidermis, dermis, and associated skin structures. Its primary function is to generate a comprehensive range of clinical data from the analysed images, supporting healthcare professionals in their clinical assessments and enabling healthcare provider organisations to gather data and optimise their workflows.

The data produced by the device is intended to assist both healthcare professionals and organizations in their clinical decision-making processes, enhancing the efficiency and accuracy of care delivery.

It is important to note that the device is not designed to confirm a clinical diagnosis. Instead, its output serves as one component of a broader clinical evaluation. The device is specifically intended to be used when a healthcare professional seeks additional information to support their decision-making process.

As a software medical device, the manufacturing process does not involve traditional physical production but follows a structured software development lifecycle (SDLC) aligned with regulatory and quality standards, including ISO 13485:2016, IEC 62304:2006/A1:2015, and ISO 14971:2019. The applicable legislation includes:

Medical Device Regulation (MDR) 2017/745
UNE-EN ISO 13485:2018 (EN ISO 13485:2016) Medical devices. Quality management systems. Requirements for regulatory purposes.
UNE-EN 62304:2007/A1:2016 (EN 62304:2006/A1:2015) Medical device software. Software life cycle processes.
UNE-EN ISO 14971:2020 (EN ISO 14971:2019) Medical devices. Application of risk management to medical devices.
UNE-EN ISO 15223-1:2022 (EN ISO 15223-1:2021) Medical devices. Symbols to be used with information to be supplied by the manufacturer. Part 1: General requirements.
UNE-EN ISO 20417:2021 (EN ISO 20417:2021) Medical devices. Information to be supplied by the manufacturer.
UNE-EN 62366-1:2015/A1:2020 (EN 62366-1:2015/A1:2020) Medical devices. Part 1: Application of usability engineering to medical devices.

The development and maintenance of the software follow agile methodologies, ensuring continuous integration, verification, and validation. The process includes:

Requirement analysis and specification: Defined according to clinical needs and risk management principles.
Software design and architecture: Developed following modular and scalable design principles.
Implementation and coding: Conducted in a controlled environment with version control and secure coding practices.
Software verification and validation: Conducted through unit testing, integration testing, system validation, and clinical performance evaluation.
Deployment and release: Includes controlled distribution through validated channels, ensuring integrity and cybersecurity compliance.

The validation processes ensure that the device meets its intended purpose, complies with MDR (Regulation (EU) 2017/745), and follows good software engineering practices. Key validation activities include:

Software Verification and Testing
Clinical Validation
Risk Management and Usability Validation
Regulatory Compliance and Documentation

The manufacturing process adheres to rigorous software development and validation standards, ensuring safety, accuracy, and reliability. Continuous post-market surveillance and performance evaluation further ensure compliance with regulatory requirements and ongoing clinical validation.

Manufacturer's instructions for installation (installation maintenance, storage requirements, manipulation)

Due to its software nature, there is no maintenance required.

The expected operational lifetime of the device is 5 years, subject to regular software updates and to the lifecycle of the integrated components and platforms. The lifetime will be extended in equivalent spans as design and development continue and as maintenance and re-design activities are carried out.

This timeline accounts for the expected evolution of the underlying operating systems and tools, the progression of medical device technology, and the necessary update cycles to maintain security and operability.

Installation of the device is mostly described in the Instructions for Use, although the device offers various integration methods tailored to the specific use case and client requirements. For this study, researchers will access the device through its web app, ensuring a seamless and straightforward experience. Since the platform is web-based, researchers can log in from any browser, whether on a mobile device or a computer.

Each researcher will receive a unique set of credentials (a username and a corresponding password) allowing them to securely access their professional account on the device web app. Once logged in, they will be able to create patient profiles, upload lesion photographs, and input any relevant clinical information as needed.

For this study, no additional installation or complex setup will be required to use the device.

Sample of the label (for example sticker or copy, and instructions for use or reference to, and information on any training required).

Device name: Legit.Health Plus
European Medical Device Nomenclature (EMDN) coding: Z12040192 (General medicine diagnosis and monitoring instruments - Medical device software)
Global Medical Device Nomenclature (GMDN) coding: 65975
Risk Classification according to EU MDR 2017/745: Class IIb

Symbol	Meaning	Information
	Unique Device Identifier	(01)8437025550005(10)1.0.0.0(11)YYYYMMDD
	Version	(10) 1.0.0.0
	Manufacture date	(11) (YYYYMMDD)
	Manufacturer	AI Labs Group SL BAT Tower, Gran Vía 1, 48001, Bilbao, Biscay (Spain)

Symbol	Meaning
eIFU	Consult electronic instructions for use
	Caution
DRAFT	EU MDR 2017/745 CE marking (DRAFT)
	Medical Device

In case of observing an incorrect operation

If the software is observed to operate incorrectly, notify the manufacturer as soon as possible at support@legit.health. The manufacturer will proceed accordingly. Any serious incident related to the device must be reported both to the manufacturer and to the competent authority in the Member State where the user or patient is located.

Undesirable side-effects

No undesirable side-effects specifically related to the use of the software are known or foreseen.

Description of the intended clinical performance

Intended use

The device is intended to provide:

an interpretative distribution representation of possible International Classification of Diseases (ICD) categories that might be represented in the pixel content of the image
quantifiable data on the intensity, count and extent of clinical signs such as erythema, desquamation, and induration, among others

Quantification of intensity, count and extent of visible clinical signs

The device provides quantifiable data on the intensity, count and extent of clinical signs, including, but not limited to:

erythema,
desquamation,
induration,
crusting,
xerosis (dryness),
swelling (oedema),
oozing,
excoriation,
lichenification,
exudation,
wound depth,
wound border,
undermining,
hair loss,
necrotic tissue,
granulation tissue,
epithelialization,
nodule,
papule,
pustule,
cyst,
comedone,
abscess,
draining tunnel,
inflammatory lesion,
exposed wound, bone and/or adjacent tissues,
slough or biofilm,
maceration,
external material over the lesion,
hypopigmentation or depigmentation,
hyperpigmentation,
scar

Image-based recognition of visible ICD categories

The device is intended to provide an interpretative distribution representation of possible International Classification of Diseases (ICD) categories that might be represented in the pixel content of the image.

Device description

The device is a computational software-only medical device leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures. Its principal function is to provide a wide range of clinical data from the analyzed images to assist healthcare practitioners in their clinical evaluations and allow healthcare provider organisations to gather data and improve their workflows.

The generated data is intended to aid healthcare practitioners and organizations in their clinical decision-making process, thus enhancing the efficiency and accuracy of care delivery.

The device should never be used to confirm a clinical diagnosis. On the contrary, its result is one element of the overall clinical assessment. Indeed, the device is designed to be used when a healthcare practitioner chooses to obtain additional information to consider a decision.

Intended medical indication

The device is indicated for use on images of visible skin structure abnormalities to support the assessment of all diseases of the skin incorporating conditions affecting the epidermis, its appendages (hair, hair follicle, sebaceous glands, apocrine sweat gland apparatus, eccrine sweat gland apparatus and nails) and associated mucous membranes (conjunctival, oral and genital), the dermis, the cutaneous vasculature and the subcutaneous tissue (subcutis).

Intended patient population

The device is intended for use on images of skin from patients presenting visible skin structure abnormalities, across all age groups, skin types, and demographics.

Intended user

The medical device is intended for use by healthcare providers to aid in the assessment of skin structures.

User qualifications and competencies

This section outlines the qualifications and competencies required for users of the device to ensure its safe and effective use. It is assumed that all users already possess the baseline qualifications and competencies associated with their respective professional roles.

Healthcare professionals

No additional official qualifications are required for healthcare professionals (HCPs) to use the device. However, it is recommended that HCPs possess the following competencies to optimize device utilization:

Proficiency in capturing high-quality clinical images using smartphones or equivalent digital devices.
Basic understanding of the clinical context in which the device is applied.
Familiarity with interpreting digital health data as part of the clinical decision-making process.

The device may be used by any healthcare professional who, by virtue of their academic degree, professional license, or recognized qualification, is authorized to provide healthcare services. This includes, but is not limited to:

Medical Doctors (MD, MBBS, DO, Dr. med., or equivalent)
Registered Nurses (RN, BScN, MScN, Dipl. Pflegefachfrau/-mann, or equivalent)
Nurse Practitioners (NP, Advanced Nurse Practitioner, or equivalent)
Physician Assistants (PA, or equivalent roles such as Physician Associate in the UK/EU)
Dermatologists (board-certified, Facharzt für Dermatologie, or equivalent)
Other licensed or registered healthcare professionals as recognized by local, national, or European regulatory authorities

Each HCP must hold the academic title, degree, or professional registration that confers their status as a healthcare professional in their jurisdiction, whether in the United States, Europe, or other regions where the device is provided.

IT professionals

IT professionals are responsible for the technical integration, configuration, and maintenance of the medical device within the healthcare organization's information systems.

No specific official qualifications are mandated. Nevertheless, it is advisable that IT professionals involved in the deployment and support of the device have the following competencies:

Foundational knowledge of the HL7 FHIR (Fast Healthcare Interoperability Resources) standard and its application in healthcare data exchange.
Ability to interpret and manage the device's data outputs, including integration with electronic health record (EHR) systems.
Understanding of healthcare data privacy and security requirements relevant to medical device integration, including GDPR (Europe), HIPAA (US), and other applicable local regulations.
Experience with troubleshooting and supporting clinical software in a healthcare environment.
Familiarity with IT standards and best practices for healthcare, such as ISO/IEC 27001 (Information Security Management) and ISO 27799 (Health Informatics—Information Security Management in Health).

IT professionals may include, but are not limited to:

Health Informatics Specialists (MSc Health Informatics, or equivalent)
Clinical IT System Administrators
Healthcare Integration Engineers
IT Managers and Project Managers in healthcare settings
Software Engineers and Developers specializing in healthcare IT
Other IT professionals with relevant experience in healthcare environments, as recognized by local, national, or European authorities

Each IT professional should possess the relevant academic degree, professional certification, or demonstrable experience that qualifies them for their role in the healthcare organization, in accordance with the requirements of the United States, Europe, or other regions where the device is provided.

Use environment

The device is intended to be used in the setting of healthcare organisations and their IT departments, which commonly are situated inside hospitals or other clinical facilities.

The device is intended to be integrated into the healthcare organisation's system by IT professionals.

Operating principle

The device is a computational medical tool leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures.

Body structures

The device is intended for use on the epidermis, its appendages (hair, hair follicle, sebaceous glands, apocrine sweat gland apparatus, eccrine sweat gland apparatus and nails) and associated mucous membranes (conjunctival, oral and genital), the dermis, the cutaneous vasculature and the subcutaneous tissue (subcutis).

The device is intended for use on visible skin structures only. As such, it can only quantify clinical signs that are visible, and distribute the probabilities across ICD categories that are visible.

Explainability

For visual signs that can be quantified in terms of count and extent, the underlying models not only calculate a final value, such as the number of lesions, but also determine their locations within the image. Consequently, the output for these visual signs is accompanied by additional data, which varies depending on whether the quantification involves count or extent.

Count. When a visual sign is quantified by counting, the device generates bounding boxes for each detected entity. These bounding boxes are defined by their x and y coordinates, as well as their height and width in pixels.
Extent. When a visual sign is quantified by its extent, the device outputs a mask. This mask, which is the same size as the image, consists of 0's for pixels where the visual sign is absent and 1's for pixels where it is present.
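
As a minimal sketch of how an extent mask of this kind could be consumed, the Python snippet below computes the share of image pixels flagged as affected. The nested-list layout of the mask and the function name affected_fraction are illustrative assumptions, not part of the device's documented interface.

```python
def affected_fraction(mask):
    """Return the share of pixels marked 1 in a binary mask given as a list of rows."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    present = sum(sum(row) for row in mask)
    return present / total


# Hypothetical 3x4 mask: 1 marks pixels where the visual sign is present.
example_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
fraction = affected_fraction(example_mask)
print(f"{fraction:.1%} of the image is flagged")
```

Converting this pixel share into a body surface estimate would need extra information (for example, which body region the image covers) that the mask alone does not carry.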

The explainability output can be found with the explainabilityMedia key. Here is an example:

{
  "explainabilityMedia": {
    "explainabilityMedia": {
      "content": "base 64 image",
      "detections": [
        {
          "confidence": 98,
          "label": "nodule",
          "p1": {
            "x": 202,
            "y": 101
          },
          "p2": {
            "x": 252,
            "y": 154
          }
        },
        {
          "confidence": 92,
          "label": "pustule",
          "p1": {
            "x": 130,
            "y": 194
          },
          "p2": {
            "x": 179,
            "y": 245
          }
        }
      ]
    }
  }
}
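
A short Python sketch of how a client might read this payload is given below. It counts detections per label and derives each box's width and height, assuming that p1 and p2 are opposite corners of the bounding box expressed in pixels; that interpretation, and the variable names, are assumptions made for illustration.

```python
import json
from collections import Counter

# Trimmed copy of the example payload above; "content" is omitted for brevity.
payload = json.loads("""
{
  "explainabilityMedia": {
    "explainabilityMedia": {
      "detections": [
        {"confidence": 98, "label": "nodule",
         "p1": {"x": 202, "y": 101}, "p2": {"x": 252, "y": 154}},
        {"confidence": 92, "label": "pustule",
         "p1": {"x": 130, "y": 194}, "p2": {"x": 179, "y": 245}}
      ]
    }
  }
}
""")

detections = payload["explainabilityMedia"]["explainabilityMedia"]["detections"]

# Number of detected entities per clinical sign.
counts = Counter(d["label"] for d in detections)

# Assumption: p1 and p2 are opposite corners, so width and height are
# the absolute coordinate differences.
sizes = [
    (d["label"], abs(d["p2"]["x"] - d["p1"]["x"]), abs(d["p2"]["y"] - d["p1"]["y"]))
    for d in detections
]

for label, width, height in sizes:
    print(f"{label}: {width} x {height} px")
```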

Preclinical testing

Design calculations

The user receives quantifiable data on the intensity of clinical signs

Objective

This study aims to demonstrate that the quantification of clinical sign intensity provided by the algorithm performs at the level of expert dermatologists.

Acceptance Criteria

The algorithm must achieve a Relative Mean Absolute Error (RMAE) below 20%.
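
This document does not spell out how RMAE is computed. One common reading, assumed here, divides the mean absolute error by the width of the scoring scale so that the result can be expressed as a percentage. The Python sketch below follows that assumption; the function name and the 0 to 3 scale are hypothetical.

```python
def relative_mae(predicted, reference, scale_min, scale_max):
    """Mean absolute error divided by the scale width (assumed RMAE definition)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    mae = sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)
    return mae / (scale_max - scale_min)


# Hypothetical intensity scores on a 0-3 ordinal scale.
model_scores = [1, 2, 3, 0]
expert_consensus = [1, 3, 2, 0]
rmae = relative_mae(model_scores, expert_consensus, scale_min=0, scale_max=3)
meets_criterion = rmae < 0.20  # acceptance criterion: RMAE below 20%
print(f"RMAE = {rmae:.1%}, criterion met: {meets_criterion}")
```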

Materials & Methods

A dataset of 5,459 images depicting dermatological conditions was analysed by multiple dermatology experts to establish a gold standard. Each image was labeled with ordinal or categorical values, and the algorithm was trained based on the consensus among experts.

To ensure the dataset was sufficiently large, RMAE values were calculated and monitored. A stabilisation threshold of 0.02 in standard deviation was used to determine when additional data no longer significantly impacted performance. The final dataset was split into training, validation, and test sets using K-fold cross-validation. Expert agreement levels were assessed, yielding an RMAE of 13.8% and a Relative Standard Deviation (RSD) of 12.56%, confirming the dataset's adequacy.

Model Training & Evaluation

A set of multi-output deep learning classifiers was developed, each focused on a specific task. The models were based on EfficientNet-B0, a neural network architecture pre-trained on ImageNet. Training followed a transfer learning approach, initially freezing most layers and fine-tuning the entire model in later stages.

Tasks included estimating the intensity of various visual signs such as erythema, edema, desquamation, exudation and affected tissues. The models were evaluated using RMAE and balanced accuracy metrics.

Results of the test

The algorithms achieved strong performance, with an average RMAE of 13% for key clinical signs. All individual RMAE values remained below the 20% threshold, meeting the acceptance criteria. Performance highlights include:

Erythema, edema, oozing, excoriation, lichenification, and dryness: RMAE of 13%.
Induration, desquamation, and pustulation: RMAE consistently below 20%.
Exudation, edges, and affected tissues: Balanced accuracy of 64%, 74%, and 69%, respectively.
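
For readers less familiar with the second metric, balanced accuracy is the mean of the per-class recall values, which keeps a frequent class from dominating the score. The Python sketch below illustrates the calculation on made-up labels; the class names and data are hypothetical and unrelated to the study results.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == prediction:
            hits[truth] += 1
    if not totals:
        raise ValueError("y_true is empty")
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical, imbalanced example: four "none", one "mild", one "severe".
y_true = ["none", "none", "none", "none", "mild", "severe"]
y_pred = ["none", "none", "none", "mild", "mild", "mild"]

score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy {score:.3f} vs plain accuracy {plain_accuracy:.3f}")
```

In this toy case the missed "severe" class pulls balanced accuracy below plain accuracy, which is the behaviour that makes the metric useful for imbalanced clinical data.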

Conclusions

The automatic quantification of clinical sign intensity by the algorithm is comparable to expert dermatologists, ensuring high-quality, reliable data to support clinical decision-making.

The user receives quantifiable data on the count of clinical signs

Objective

To assess whether the quantifiable data on the count of clinical signs provided to the user is extracted with expert dermatologist-level performance.

Acceptance Criteria

The algorithm's Mean Absolute Error (MAE) for detecting nodules, abscesses, and draining tunnels must be lower than that of the annotators or within a variance of less than 10%.

Hive detection must achieve precision and recall rates above 50%.

Inflammatory lesion detection must achieve precision and recall rates above 70%.

Materials & Methods

Ground Truth Generation

A total of 2,012 images were used, categorised as follows: hidradenitis suppurativa (221), acne (1,457), and urticaria (334). Expert dermatologists specialising in each condition reviewed the images (six for hidradenitis and five for urticaria). The ACNE04 dataset was pre-labeled, so no additional annotators were needed.

For hidradenitis suppurativa, a four-stage aggregation algorithm was developed, slightly favoring the most experienced and best-performing specialists to unify multi-label annotations. For urticaria, a different method was applied: individual bounding boxes were converted into Gaussian distributions and merged based on annotator consensus. These methods ensured accurate ground truth labels for training object detection models.

Data Splitting

After ground truth labels were generated, the datasets were split into training and validation sets. Given the limited dataset size, a train/validation split was chosen over a train/validation/test setup, with K-fold cross-validation applied to improve reliability. The ACNE04 dataset followed the original patient-wise stratified split, while hidradenitis and urticaria images were manually reviewed to prevent data leakage.
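
The leakage concern can be illustrated with a small Python sketch of a patient-wise K-fold assignment, in which all images of one patient land in the same fold. It assumes a patient identifier is available for every image and does not attempt class stratification; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def patient_wise_folds(image_to_patient, k):
    """Assign images to k folds so that no patient appears in more than one fold."""
    by_patient = defaultdict(list)
    for image, patient in image_to_patient.items():
        by_patient[patient].append(image)

    folds = [[] for _ in range(k)]
    # Place patients with the most images first, always into the smallest fold,
    # to keep fold sizes roughly even.
    for patient in sorted(by_patient, key=lambda p: (-len(by_patient[p]), p)):
        smallest = min(range(k), key=lambda i: len(folds[i]))
        folds[smallest].extend(by_patient[patient])
    return folds


# Hypothetical mapping of image file to patient identifier.
images = {
    "img1": "A", "img2": "A",
    "img3": "B",
    "img4": "C", "img5": "C",
    "img6": "D",
}
folds = patient_wise_folds(images, k=3)

# Each fold serves once as the validation set; the remaining folds form the training set.
for index, validation in enumerate(folds):
    training = [img for j, fold in enumerate(folds) if j != index for img in fold]
    print(index, sorted(validation), sorted(training))
```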

Model Training

Three main object detection tasks were defined, each trained with the YOLOv5 architecture:

A model for detecting nodules, abscesses, and draining tunnels.
A model for detecting hives.
A model for detecting inflammatory lesions.

YOLOv5 was selected due to its strong balance between speed and accuracy. Transfer learning was applied, using pre-trained weights from ImageNet to enhance dermatology-related tasks. Once trained, these models can automatically detect and count lesions to compute severity scores (e.g., UAS and IHS4), categorising cases as “clear,” “mild,” “moderate,” or “severe.”
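
To illustrate how lesion counts can feed a severity score, the Python sketch below computes IHS4 from its published definition: nodules count once, abscesses twice, and draining tunnels four times, with a total of 3 or less graded mild, 4 to 10 moderate, and 11 or more severe. The published IHS4 cut-offs define only these three bands, so the sketch returns those; how the device maps its detections to the three counts is not described here, and the function name is hypothetical.

```python
def ihs4(nodules, abscesses, draining_tunnels):
    """Return the IHS4 score and its severity band from lesion counts."""
    if min(nodules, abscesses, draining_tunnels) < 0:
        raise ValueError("counts must be non-negative")
    score = nodules + 2 * abscesses + 4 * draining_tunnels
    if score <= 3:
        band = "mild"
    elif score <= 10:
        band = "moderate"
    else:
        band = "severe"
    return score, band


# Hypothetical counts, e.g. tallied from the labels of detected bounding boxes.
print(ihs4(nodules=2, abscesses=1, draining_tunnels=1))
```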

Results of the test

Nodule, Abscess, and Draining Tunnel Detection

The YOLOv5x model achieved MAE values of 2.16 (mild), 3.37 (moderate), and 5.26 (severe), closely matching dermatologist scores of 2.04, 3.01, and 4.88, respectively. The algorithm's variance (0.096) was significantly lower than the annotators' (1.57), well below the 10% deviation threshold.

Inflammatory Lesion Detection

The algorithm's MAE was 5.56, lower than the dermatologists' 7.5. Precision and recall exceeded the 70% threshold, with the best-performing model (YOLOv5m) achieving 73.5% precision and 74.17% recall.

Hive Detection

The best-performing model achieved an average precision of 68% and a recall of 57%. Mean Average Precision at IoU 0.5 (mAP@0.5) exceeded 0.60, confirming strong performance despite the task's complexity.
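
As an illustration of how precision and recall are typically obtained for object detection at an IoU threshold of 0.5, the Python sketch below greedily matches predicted boxes to ground-truth boxes. The greedy one-to-one matching rule, the (x1, y1, x2, y2) box layout and the sample boxes are assumptions for this example; it does not reproduce the evaluation code used in the study and does not compute mAP.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return intersection / (area_a + area_b - intersection)


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching; predictions are expected in descending confidence."""
    unmatched = list(ground_truth)
    true_positives = 0
    for prediction in predictions:
        best = max(unmatched, key=lambda gt: box_iou(prediction, gt), default=None)
        if best is not None and box_iou(prediction, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical boxes: one good match, two false positives, one missed lesion.
ground_truth = [(10, 10, 50, 50), (100, 100, 140, 140)]
predictions = [(12, 12, 52, 52), (200, 200, 240, 240), (300, 300, 320, 320)]
precision, recall = precision_recall(predictions, ground_truth)
print(f"precision {precision:.2f}, recall {recall:.2f}")
```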

Protocol Deviations

No additional annotation was required for acne images.

Conclusions

The quantifiable data on clinical sign counts provided to users matches dermatologist expertise. This ensures the quality and consistency of training datasets, offering healthcare professionals reliable information to support clinical assessments.

The user receives quantifiable data on the extent of clinical signs

Objective

To assess whether the quantifiable data on the surface area of clinical signs provided to the user is extracted with expert dermatologist-level performance.

Acceptance Criteria

Area Under the Curve (AUC) greater than 0.80
Intersection over Union (IoU) greater than 0.5
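
For segmentation outputs, IoU compares a predicted mask with a reference mask pixel by pixel. The Python sketch below shows the calculation on two small hypothetical masks stored as lists of rows; returning 1.0 when both masks are empty is a convention chosen for this example.

```python
def mask_iou(predicted, reference):
    """IoU of two equally sized binary masks given as lists of rows."""
    intersection = 0
    union = 0
    for predicted_row, reference_row in zip(predicted, reference):
        for p, r in zip(predicted_row, reference_row):
            if p and r:
                intersection += 1
            if p or r:
                union += 1
    # Convention for this sketch: two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical 3x3 masks: 1 marks pixels where the clinical sign is present.
predicted_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
reference_mask = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 1],
]
iou = mask_iou(predicted_mask, reference_mask)
print(f"IoU = {iou:.2f}, above 0.5: {iou > 0.5}")
```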

Materials & Methods

Ground Truth Generation

A total of 4,596 images were utilised, categorised as follows:

Atopic dermatitis and similar eczemas: 1,083 images
Psoriasis: 1,376 images
Scalp conditions (with or without hair loss): 1,826 images
Pressure ulcers: 311 images

Each dataset was annotated by specialised healthcare professionals: three dermatologists for psoriasis, three for atopic dermatitis, three for hair loss, and four expert nurses for pressure ulcers. Ground truth labels were established using mean and median pixel-wise statistics.

To evaluate dataset adequacy, a complexity assessment was conducted based on medical evidence and a variability study with experienced physicians. Annotators achieved IoU values exceeding 0.9, and for darker skin tones, values remained above 0.8. This strong agreement confirmed that the datasets were robust and suitable for training segmentation models.

Data on the extent of clinical signs outcomes

Atopic Dermatitis and Eczema (ASCORAD scoring system)

The metrics obtained in this test are as follows:

IoU values for erythema, edema, oozing, excoriation, lichenification, and dryness: 0.93 (exceeding the 0.8 requirement).
AUC for erythema, induration, and desquamation: 0.95 (excellent performance).
Hair Density Quantification IoU: 0.71, based on consensus from three annotators.

Pressure Ulcers

IoU values:

Wound: 0.76
Maceration: 0.60
Necrotic tissue: 0.57
Bone exposure: 0.79

All values were measured considering the consensus of expert annotations.

Protocol Deviations

No deviations from the initial protocol.

Conclusions

The quantifiable data on clinical sign surface area provided to users matches expert healthcare professional performance. This ensures high-quality training datasets with sufficient size and consistency, enabling healthcare practitioners to rely on precise and standardised information for clinical assessments.

The user receives an interpretative distribution representation of possible ICD categories represented in the pixels of the image

Objective

To verify that the algorithm accurately generates a distribution representation of possible ICD categories, ensuring that the extracted ICD-11 category features from the image pixels align with ground truth classifications.

The sum of confidence values must equal 100%.

Minimum requirements:

Top 1: 55%
Top 3: 70%
Top 5: 80%
AUC value must be at least 0.80.

Materials & Methods

Dataset

A dataset of 181,591 RGB images (average size: 756x1048 pixels) was compiled from multiple dermatology and skin-related datasets, ensuring diversity in age, sex, and skin tone representation. Key sources include:

Dermatology datasets: ACNE04, ASAN, Danderm, Derm7pt, ISIC, PAD-UFES-20, PH2, Severance, etc.

Skin structure datasets: 11kHands, Figaro1k, FUSeg, HGR.

Metadata on Fitzpatrick skin type (16,012 images):

Types 1-2: 48.43%

Types 3-4: 38.03%

Types 5-6: 13.54%

Additional metadata on sex (2,338 images) and age (3,614 images) confirms dataset diversity, though not evenly distributed due to natural population incidence rates of certain conditions.

Data Stratification & Augmentation

Patient-wise stratification where possible, class-wise split otherwise.

Specific augmentations applied:

Dermoscopic images: Rotation augmentations to ensure invariance.

All images: Pixel transformations (color jittering, histogram equalisation, noise, blur, etc.) to reduce bias.

No augmentations were applied to validation and test data.
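
The patient-wise stratification mentioned above keeps all images of one patient in the same partition, so that the test set contains no patient seen during training. The following is a minimal sketch under the assumption that each record is a `(patient_id, image_id)` pair; the function name, the 20% test fraction and the seed are illustrative choices, not the values used by the manufacturer.

```python
import random


def patient_wise_split(records, test_fraction=0.2, seed=0):
    """Split (patient_id, image_id) records so that no patient spans both sets."""
    patient_ids = sorted({patient_id for patient_id, _ in records})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    # Assumption: at least one patient is always held out for testing.
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_patients = set(patient_ids[:n_test])
    train = [r for r in records if r[0] not in test_patients]
    test = [r for r in records if r[0] in test_patients]
    return train, test


# Hypothetical records: five patients with two images each.
records = [(f"patient_{p}", f"image_{p}_{i}") for p in range(5) for i in range(2)]
train_set, test_set = patient_wise_split(records)
```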

Model Architecture & Training

A Vision Transformer (ViT) was implemented with a one-cycle learning rate policy for super-convergence, optimising training efficiency.
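
A one-cycle policy raises the learning rate to a peak and then anneals it over the rest of training. The sketch below shows one common form (linear warm-up followed by cosine annealing). The peak, start and final learning rates and the warm-up fraction are assumptions chosen for illustration; the source does not state the values or the exact schedule shape used.

```python
import math


def one_cycle_lr(step, total_steps, max_lr=1e-3, start_lr=1e-5,
                 final_lr=1e-7, warmup_fraction=0.3):
    """Learning rate at a given step: linear warm-up, then cosine annealing."""
    warmup_steps = int(total_steps * warmup_fraction)
    if warmup_steps > 0 and step < warmup_steps:
        progress = step / warmup_steps
        return start_lr + progress * (max_lr - start_lr)
    # Annealing phase: progress runs from 0 (peak) to 1 (last step).
    progress = (step - warmup_steps) / max(1, total_steps - 1 - warmup_steps)
    return final_lr + 0.5 * (max_lr - final_lr) * (1 + math.cos(math.pi * progress))


schedule = [one_cycle_lr(step, total_steps=100) for step in range(100)]
```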

ICD categories in clinical images findings

Confidence distribution: The softmax layer ensures the sum of all values equals 100%.
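
To illustrate why a softmax output always sums to 100%, the following minimal sketch converts raw model scores (logits) into percentages. The function name and the three example logits are assumptions for this example only.

```python
import math


def softmax_percentages(logits):
    """Convert raw scores into confidence percentages that sum to 100."""
    highest = max(logits)
    # Subtracting the maximum keeps the exponentials numerically stable.
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [100.0 * e / total for e in exps]


confidences = softmax_percentages([2.0, 1.0, 0.1])
```

Because every value is divided by the same total, the normalisation holds for any number of ICD categories.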

Performance metrics (Test Set):

Top 1: 74% (19 percentage points above threshold)

Top 3: 86% (16 percentage points above threshold)

Top 5: 90% (10 percentage points above threshold)

AUC (Malignancy Quantification): 0.96

AUC (Dermatological Condition Indicator): 0.99

AUC (Urgent Referral Indicator): 0.97

AUC (High-Priority Referral Indicator): 0.93
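
The Top-k figures above count a prediction as correct when the true category appears among the k highest-ranked categories. The sketch below shows that calculation on toy data and then compares the reported Top 1, Top 3 and Top 5 values with the minimum requirements stated earlier in this section. The function name and the toy predictions are illustrative assumptions.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label is among the k top-ranked predictions."""
    hits = sum(1 for ranked, truth in zip(ranked_predictions, true_labels)
               if truth in ranked[:k])
    return hits / len(true_labels)


# Toy example (illustrative only): each list is ranked from most to least likely.
ranked = [
    ["acne", "rosacea", "psoriasis"],
    ["psoriasis", "eczema", "acne"],
    ["eczema", "acne", "rosacea"],
    ["rosacea", "acne", "eczema"],
]
truths = ["acne", "eczema", "rosacea", "psoriasis"]
toy_top1 = top_k_accuracy(ranked, truths, k=1)
toy_top3 = top_k_accuracy(ranked, truths, k=3)

# Values taken from this section: minimum requirements and reported test results.
requirements = {1: 0.55, 3: 0.70, 5: 0.80}
reported = {1: 0.74, 3: 0.86, 5: 0.90}
margins_in_points = {k: round((reported[k] - requirements[k]) * 100)
                     for k in requirements}
```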

Bias Analysis

Evaluated on the Diverse Dermatology Images (DDI) dataset, following methods from Daneshjou et al.

Initial results showed performance drops for darker skin tones. However, after manually cropping the 656 images for better framing, AUC improved across all groups:

Fitzpatrick 1-2: 0.6255 → 0.6903

Fitzpatrick 3-4: 0.6537 → 0.8365

Fitzpatrick 5-6: 0.6417 → 0.6724

Overall: 0.6510 → 0.7627
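
As a minimal sketch of how a per-group AUC comparison like the one above can be computed, the code below estimates AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and applies it to each skin-type group separately. The function names and the toy records are assumptions; this is not the evaluation code used in the bias analysis.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def auc_by_group(records):
    """records: iterable of (group, label, score); returns {group: AUC}."""
    grouped = {}
    for group, label, score in records:
        labels, scores = grouped.setdefault(group, ([], []))
        labels.append(label)
        scores.append(score)
    return {g: roc_auc(labels, scores) for g, (labels, scores) in grouped.items()}


# Toy records (illustrative only): (Fitzpatrick group, malignant?, model score).
toy_records = [
    ("1-2", 1, 0.9), ("1-2", 0, 0.2), ("1-2", 1, 0.6), ("1-2", 0, 0.7),
    ("5-6", 1, 0.8), ("5-6", 0, 0.3),
]
group_auc = auc_by_group(toy_records)
```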

Grad-CAM was implemented for model interpretability, to identify the image regions the model focuses on in its classifications and to detect potential biases.

Protocol Deviations

No deviations from the initial protocol.

Conclusions

The algorithm successfully generates a high-performance distribution representation of possible ICD categories, ensuring healthcare professionals receive accurate and reliable diagnostic support. Continued dataset enhancement and bias mitigation strategies will further refine its effectiveness.

Validation of software relating to the function of the device

Validation and verification of the device have been carried out in accordance with the requirements established in the IEC 62304 standard. These activities have included the exhaustive review of software requirements, the implementation of unit, integration and system tests, as well as the final validation in conditions representative of real use. The results obtained demonstrate that the software meets the defined functional, performance and security requirements, ensuring its reliability and compliance with current regulations. Likewise, all activities carried out as part of the validation and verification process have been documented, guaranteeing complete traceability from the initial requirements to the results obtained.

Performance tests

Test: If something does not work, the API returns meaningful information about the error

These tests were carried out at the medical device's receiver module to verify that, when errors occur due to improperly formatted or invalid data in the request body, the API provides clear and meaningful error information. The goal is to improve the user experience by offering transparent guidance when a request fails.

These error messages are designed to be both descriptive and actionable. They not only convey the nature of the problem but also offer insights into the specific location within the request where the issue lies. This empowers developers and integrators to quickly pinpoint and rectify the root cause of the problem. By delivering informative error messages, we aim to expedite the troubleshooting process, reducing the time required to identify and address issues.

Moreover, these tests assess the compliance of error codes and status responses with industry standards, ensuring consistency and predictability in how errors are handled.
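
The behaviour described above, an error that names both the problem and its location in the request, can be sketched as follows. This is a hypothetical illustration: the field names `image` and `bodySite`, the `errors` structure and the 400 status code are assumptions, and the actual API contract of the device is not described in this document.

```python
def validate_request(body):
    """Return (status_code, payload) for a hypothetical request body."""
    errors = []
    image = body.get("image")
    if not isinstance(image, str) or not image:
        errors.append({
            "location": "body.image",  # where in the request the issue lies
            "message": "A non-empty Base64 string is required.",
        })
    if "bodySite" in body and not isinstance(body["bodySite"], dict):
        errors.append({
            "location": "body.bodySite",
            "message": "Expected an object.",
        })
    if errors:
        return 400, {"errors": errors}  # assumed client-error status
    return 200, {"errors": []}


status, payload = validate_request({"bodySite": "face"})
```

In this sketch every problem found is reported in a single response, so an integrator can correct all fields in one pass.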

Test: Notify the user image modality and if the image does not represent a skin structure

Objective

Evaluate the domain detection algorithm's ability to identify skin structures and image modality.

Acceptance Criteria

Achieve an AUC greater than 0.8.

Materials & Methods

Dataset: Over 100,000 dermatological images from ICD-11 categories, alongside non-dermatology datasets (ImageNet, MS-COCO, Cartoon Set, Textures) to ensure robustness against irrelevant images.
Model: Vision Transformer (ViT Small), trained using transfer learning.
Training Approach:
Initial training with frozen layers (except the final layer).
Full model fine-tuning after initial training.
Validation: Model predictions were converted to binary (valid vs. invalid images) while maintaining multi-class classification for additional detail.
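
The validation step above reduces a multi-class prediction to a valid/invalid decision while keeping the detailed class. A minimal sketch of that reduction follows. The class names (`clinical`, `dermoscopic`, `non_skin`) and the 0.5 threshold are assumptions made for illustration and are not taken from the device.

```python
# Assumption: classes that count as a valid image domain.
VALID_CLASSES = {"clinical", "dermoscopic"}


def to_binary(class_probabilities, threshold=0.5):
    """Collapse multi-class probabilities into a valid/invalid decision."""
    valid_score = sum(p for name, p in class_probabilities.items()
                      if name in VALID_CLASSES)
    predicted_class = max(class_probabilities, key=class_probabilities.get)
    return {
        "is_valid": valid_score >= threshold,
        "valid_score": valid_score,
        "predicted_class": predicted_class,  # multi-class detail is preserved
    }


skin_image = to_binary({"clinical": 0.6, "dermoscopic": 0.3, "non_skin": 0.1})
other_image = to_binary({"clinical": 0.1, "dermoscopic": 0.1, "non_skin": 0.8})
```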

Validation of device performance

The model achieved an outstanding AUC of 0.9957 on the validation set.

Protocol Deviations

None.

Conclusions

The algorithm demonstrates excellent performance in detecting skin structures and image modality, ensuring reliable image domain classification.

Test: Notify the user if the quality of the image is insufficient

Objective

Evaluate the algorithm's ability to detect poor-quality images and notify users accordingly.

Acceptance Criteria

Achieve a linear correlation (LC) greater than 0.7.

Materials & Methods

Dataset

934 dermatological images (clinical and dermoscopic) from various sources (smartphones, cameras, dermoscopes). Images were rated by 40 non-expert observers following ITU-T P.910 guidelines, assigning a Mean Opinion Score (MOS) on a 1-10 scale. Additional datasets (KonIQ-10k, SPAQ, Kadid-10k) provided diverse distortions for model robustness.

Annotation Protocol: Large observer groups ensured accurate MOS values, with unreliable annotators excluded based on deviation from the mean.

Model Training & Validation: EfficientNet-B0 convolutional neural network trained using transfer learning. Model performance assessed using Mean Absolute Error (MAE) and Pearson’s Linear Correlation Coefficient (LCC).
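
The two metrics named in the methods can be illustrated with a short sketch: the Mean Opinion Score is the average of the observers' ratings for one image, and model quality is then summarised by the mean absolute error and Pearson's correlation between predicted and observed MOS. The function names and the toy values are assumptions for this example.

```python
import math


def mean_opinion_score(ratings):
    """Average rating that a group of observers gave to one image."""
    return sum(ratings) / len(ratings)


def mean_absolute_error(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson(xs, ys):
    """Pearson's linear correlation coefficient between two sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Toy values (illustrative only): observed MOS per image and model predictions.
observed_mos = [2.0, 4.0, 6.0, 8.0]
predicted_mos = [2.5, 4.5, 6.5, 8.5]
mae = mean_absolute_error(observed_mos, predicted_mos)
lcc = pearson(observed_mos, predicted_mos)
```

The toy predictions are offset by a constant 0.5, which shows why both metrics are reported: the correlation is perfect while the absolute error is not zero.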

Notification of insufficient image quality

Best model achieved 0.80 LC on the test set. Performance varied based on data type:

General domain images (real distortions) + dermatological fine-tuning: 0.806 LC (best result).
General + dermatological images without fine-tuning: 0.737 LC.

Protocol Deviations

None.

Conclusions

The algorithm effectively detects low-quality images, ensuring reliable quality assessment and user notifications.

Test: The user specifies the body site of the skin structure

Objective

The objective is to enable users to submit information about the body site of the skin structure, thereby offering supplementary and valuable data for the clinical algorithm's analysis.

Acceptance criteria

Medical device accepts body sites in the request.

Material & methods

First, we established a taxonomy for encoding within the medical device. Subsequently, we devised a method that adheres to the FHIR standard, ensuring user-friendly interaction with the device. Users can incorporate the body site information under the bodySite key within the payload.
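
As a hedged illustration of the paragraph above, the sketch below builds a request payload carrying body site information under the `bodySite` key. FHIR represents coded values as a CodeableConcept with a `coding` list. Whether the device expects exactly this shape is an assumption, and the `image` key, the taxonomy URL and the code `face` are hypothetical placeholders rather than values from the device's taxonomy.

```python
import json


def build_payload(image_base64, body_site_code, body_site_display):
    """Assemble a hypothetical request payload with a FHIR-style bodySite."""
    return {
        "image": image_base64,  # hypothetical key for the Base64 image
        "bodySite": {
            "coding": [
                {
                    # Placeholder URL: the real taxonomy identifier is not given here.
                    "system": "https://example.org/body-site-taxonomy",
                    "code": body_site_code,
                    "display": body_site_display,
                }
            ]
        },
    }


payload = build_payload("aGVsbG8=", "face", "Face")
serialised = json.dumps(payload)
```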

Specification of body site of the skin structure. Results

The receiver accurately handles body site data and transmits it to the orchestrator.

Protocol deviations

There were no deviations from the initial protocol.

Conclusions

Users can interact with the device by submitting the body site where the lesion is located.

Test: We facilitate the integration of the device into the users' system

These tests were carried out with users to verify the effectiveness of the device documentation in facilitating system integration.

The objective of our testing is to determine if the provided documentation effectively assists users in achieving their goal of integrating the device and addressing any potential issues that may arise during the process. This process is clearly outlined in R-TF-012-007 Software usability plan_2023_001.

The results conclusively demonstrate that the documentation successfully accomplishes its intended purpose.

Test: The data that users send and receive follows the FHIR healthcare interoperability standard

This test was carried out at the user interface level to verify that the data exchanged between users and the device adheres to the FHIR (Fast Healthcare Interoperability Resources) standard. The objective is to ensure that all patient images and associated information, as well as preliminary reports generated by the device, are formatted and transmitted in compliance with the FHIR standard, validating the device's ability to integrate with other healthcare systems and electronic health records. Confirming this shows that the device can exchange patient data with other medical tools and systems, promoting data consistency and secure, standardised, structured data sharing within the broader healthcare ecosystem.

Test: The user authentication feature is functioning correctly

This execution batch includes all test cases related to user authentication, such as managing access tokens (generating tokens from valid user credentials and handling token expiration), and temporarily locking user accounts after multiple failed login attempts.

These test cases are being run for the first time on this version of the medical device.
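
The two behaviours covered by this batch, token expiration and temporary lockout after repeated failed logins, can be sketched as follows. The limit of three failures, the 300-second lock and the class and function names are assumptions for illustration; the device's actual limits are not stated in this document.

```python
class LoginGuard:
    """Temporarily locks an account after repeated failed login attempts."""

    def __init__(self, max_failures=3, lock_seconds=300):  # assumed limits
        self.max_failures = max_failures
        self.lock_seconds = lock_seconds
        self.failures = {}
        self.locked_until = {}

    def is_locked(self, user, now):
        return now < self.locked_until.get(user, 0)

    def record_failure(self, user, now):
        self.failures[user] = self.failures.get(user, 0) + 1
        if self.failures[user] >= self.max_failures:
            self.locked_until[user] = now + self.lock_seconds
            self.failures[user] = 0  # the count restarts once the lock is set

    def record_success(self, user):
        self.failures.pop(user, None)


def token_is_valid(issued_at, ttl_seconds, now):
    """A token is valid strictly before its expiry time."""
    return now < issued_at + ttl_seconds


guard = LoginGuard()
for second in (0, 1, 2):  # three failed attempts by a hypothetical user
    guard.record_failure("alice", now=second)
```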

Test: Ensure all API communications are conducted over HTTPS

This test run aims to validate the HTTP to HTTPS redirection functionality and ensure the use of valid SSL/TLS certificates for secure communication. The focus is on confirming that all HTTP requests are properly redirected to HTTPS without errors or security warnings and that the SSL/TLS certificates used are valid, not expired, and issued by a trusted Certificate Authority (CA). The API root endpoint will be tested for consistent redirection behavior. Additionally, the SSL/TLS certificates will be checked for validity, proper configuration, and certificate chain. The tests will be conducted in a staging environment mirroring the production setup, with any issues documented and recommendations provided.

These test cases are being run for the first time on this version of the medical device.
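
A redirection check of the kind described above can be expressed as a small predicate over the response to a plain HTTP request. The sketch below is an assumption-laden illustration: the accepted status codes, the same-host rule and the host `api.example.org` are choices made for this example, not the test plan's actual criteria.

```python
from urllib.parse import urlsplit

# Assumption: any of these statuses counts as a redirect for this check.
REDIRECT_STATUSES = {301, 302, 307, 308}


def redirects_to_https(request_url, status_code, location_header):
    """True if an http:// request was redirected to https:// on the same host."""
    if status_code not in REDIRECT_STATUSES or not location_header:
        return False
    source = urlsplit(request_url)
    target = urlsplit(location_header)
    return (source.scheme == "http"
            and target.scheme == "https"
            and target.hostname == source.hostname)


ok = redirects_to_https("http://api.example.org/", 301, "https://api.example.org/")
```

Certificate validity is a separate check. In Python, a context created with `ssl.create_default_context()` verifies the certificate chain and the hostname by default when a TLS connection is opened.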

Test: Ensure API compliance with Base64 image format and FHIR standard

This test run record documents the results of executing test plans designed to verify two main areas: confirming that the API only accepts images in Base64 format and validating that the data schemas of the request and response content adhere to the FHIR (Fast Healthcare Interoperability Resources) standard. The record includes detailed outcomes of the tests, highlighting any deviations from expected behavior, along with corresponding error messages and potential impacts on system functionality.

These test cases are being conducted for the first time on this version of the medical device.
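
A strict Base64 check of the kind this test verifies can be sketched with the Python standard library. The function name is an assumption, and the device may apply further checks (for example on the decoded image format) that are not shown here.

```python
import base64
import binascii


def is_valid_base64(payload):
    """True if payload is a non-empty, strictly valid Base64 string."""
    if not isinstance(payload, str) or not payload:
        return False
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


encoded_image = base64.b64encode(b"image bytes").decode("ascii")
accepted = is_valid_base64(encoded_image)
rejected = is_valid_base64("not base64!")
```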

Test: Verification of authorised user registration and body zone specification in device API

In this test run, we will focus on two primary verification processes. First, we'll ensure that only authorised users can register for the device API. This involves confirming that the registration process occurs within a non-publicly accessible, restricted environment, preventing unauthorised users from accessing the registration.

Second, we will verify that requests involving certain scoring systems correctly specify the body zone affected by the skin lesion. This step is crucial to ensure that the scoring systems receive the necessary information to accurately calculate the severity or impact of the skin lesion based on the specified body zone.

Test: Ensure API stability and cybersecurity of the medical device

This test run is planned to ensure our medical device API is both reliable and secure. It focuses on two main objectives. First, it aims to verify that the API is available 99% of the time by continuously monitoring the API's performance and uptime over a one-month period. We'll check for any instances of downtime or interruptions, noting how often they occur and how long they last.

Second, the test run includes a security evaluation using the Intruder.io tool. This security scan will identify any critical vulnerabilities in the API that could be exploited by malicious actors. Intruder.io will conduct a series of tests to detect weaknesses like outdated software, misconfigurations, or potential unauthorised access points. The results will help us assess the overall security of the API and determine any necessary steps to strengthen it.
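
For the 99% availability objective above, the arithmetic can be sketched as follows. The 30-day monitoring window and the outage durations are hypothetical values chosen for illustration; they are not measurements from the device.

```python
def availability_percent(total_minutes, outage_minutes):
    """Percentage of the monitoring window during which the API was up."""
    downtime = sum(outage_minutes)
    return 100.0 * (total_minutes - downtime) / total_minutes


def allowed_downtime(total_minutes, target_percent):
    """Maximum downtime (in minutes) compatible with the availability target."""
    return total_minutes * (100.0 - target_percent) / 100.0


MINUTES_IN_30_DAYS = 30 * 24 * 60   # assumed one-month window: 43,200 minutes
outages = [12, 45, 90]              # hypothetical outage durations in minutes

measured = availability_percent(MINUTES_IN_30_DAYS, outages)
budget = allowed_downtime(MINUTES_IN_30_DAYS, 99.0)
meets_target = measured >= 99.0
```

Under these assumptions, a 99% target over 30 days leaves a downtime budget of 432 minutes, or about 7.2 hours.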

Existing clinical performance data

Summary of relevant previous clinical experience with the investigational device: (Information on clinical data generated by the manufacturer)

Optimisation of clinical flow in patients with dermatological conditions using Artificial Intelligence.

Objective: To validate that the device optimises the clinical flow and patient care process, decreasing the time and cost of care per patient, through greater accuracy in medical diagnosis and determining the degree of malignancy or severity.
Secondary objectives:
- To demonstrate that the device improves the ability of healthcare professionals to detect malignant or suspected malignant pigmented lesions.
- To demonstrate that the device improves the ability and accuracy of healthcare professionals to measure the degree of involvement of patients with female androgenic alopecia.
- To demonstrate that the device improves the ability and accuracy of healthcare professionals to measure the degree of involvement of patients with acne.
- To automate the initial triage/assessment process in patients consulting for pigmented lesions.
- To evaluate the reduction in the use of healthcare resources by the centre by reducing the number of triage consultations and direct referrals to the appropriate consultation (aesthetic or dermatological).
- To evaluate the degree of usability of the device by the patient.
- To demonstrate that the device increases specialist satisfaction.
- To evaluate the reduction in the use of healthcare resources by reducing the number of triage consultations and directing the patient directly to the appropriate consultation, whether in the aesthetic or dermatological field.
Design: This was a prospective observational study with both longitudinal and retrospective case series. Prospectively, a minimum of 60 cases were included: 30 with pigmented lesions, 15 with androgenic alopecia, and 15 with inflammatory acne. Retrospectively, 60 patients with pigmented lesions, 15 with androgenic alopecia, and 15 with inflammatory acne were to be included. The study took five months to complete.
Results:
- For retrospective images, the medical device demonstrated an AUC of 0.76 in detecting malignancy of the lesions, compared to an AUC of 0.79 achieved by dermatologists.
- The medical device achieved a top-5 accuracy of 0.47 in diagnostic assessment, while dermatologists achieved a top-3 accuracy of 0.45. When the specific type of nevus was excluded from the diagnostic evaluation, the medical device achieved a higher top-5 accuracy of 0.78, compared to dermatologists' top-3 accuracy of 0.70.
- In the analysis of prospective images, the performance of dermatologists assisted by the medical device resulted in an AUC of 0.94. In terms of diagnostic performance, dermatologists assisted by the legacy medical device achieved a top-1 accuracy of 0.30.
- For androgenic alopecia, 49 retrospective images were collected in addition to the 13 previously obtained. The overall accuracy of the model was 47%, while the accuracy of the latest model optimised for female androgenic alopecia (FAA) reached 53%, as assessed by the researchers.
- During the course of the study, a significantly low recruitment rate was observed for patients within the acne cohort. This insufficient volume has prevented the achievement of a minimum sample size required to ensure meaningful statistical and clinical analysis. This would have compromised the statistical power of the study, the external validity (generalisability) of any findings and the overall scientific rigor of cohort comparisons. Therefore, to avoid drawing unreliable or biased conclusions from an underpowered sample, a scientifically sound decision was made to exclude the acne cohort from the final analysis.
Conclusions: The diagnostic capability of the device in distinguishing malignancy is on par with that of expert dermatologists, not only in teledermatology but also in in-person consultations. This confirms its reliability as a screening tool for malignant neoplasms based on ICD-11 categories, aiding in patient prioritisation by urgency and facilitating appropriate referrals to specialists or consultations. Furthermore, a strong correlation was observed in Ludwig scores, despite a decline in the prospective trial, which may be attributed to inconsistencies in alignment criteria. Regarding the exclusion of the acne cohort, this amendment does not compromise the primary objective or the scientific integrity of the remaining study. The decision was made based on sound scientific, ethical, and operational considerations and has been properly documented in accordance with applicable legislation, such as Good Clinical Practice (ICH-GCP), which allows for justified protocol amendments.

Pilot study for the clinical validation of an artificial intelligence algorithm to optimise the appropriateness of dermatology referrals.

Objective: To validate that the medical device is a valid tool for improving the adequacy of referrals to dermatology.
Secondary objectives:
- To validate that the device reduces costs in secondary care.
- To validate that the device reduces dermatology waiting lists.
- To validate that the device optimises clinical flow in Osakidetza.
Design: This is a prospective, observational, and analytical study of a longitudinal series of clinical cases. The study aims to include 400 dermatological lesions, with 113 patients currently recruited. The initial estimated duration was 4 months, but an extension has been requested to continue patient recruitment.
Current status: Patient recruitment is ongoing to reach the target sample size.
Results:
- In detecting malignancy, the device demonstrated a sensitivity of 100% and a specificity of 76%. In contrast, primary care physicians did not adequately identify malignant cases, showing a sensitivity of 25% but a high specificity of 96%.
- Dermatologists opted for in-person consultations for 71% of patients, while 29% of teledermatology cases were addressed and resolved directly.
- General practitioners believed that 86% of cases should be referred to a dermatologist, but dermatologists did not schedule in-person visits for 29% of teleconsultations. It is estimated that 15% of referrals could be avoided using tools like Legit.Health, and 30% of referred cases ultimately did not require in-person visits.
- The algorithm flagged 57% of cases marked for referral as low malignancy and correctly identified 43% of unaddressed cases as low-risk. For cases not referred, all were classified as low malignancy, which was confirmed, except for one case later diagnosed as psoriasis.
- Regarding cost reduction, approximately 30% of consultations initially considered necessary for referral did not result in an in-person dermatologist visit. Around 56% of cases that appeared to require an in-person visit were found to be benign or manageable in primary care.
- The average waiting time for dermatologist consultations was 10 days.
Conclusions: It is too early to draw definitive conclusions as the study is still recruiting patients. However, these preliminary results suggest that the use of the medical device could optimise costs, reduce waiting times, and prioritise urgent cases, assuming delays in appointments due to waiting lists.
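
For readers less familiar with the metrics reported above, sensitivity and specificity follow directly from the counts of a confusion matrix. The sketch below uses hypothetical counts chosen only to show the formulas; they are not the counts from this study, which are not given in this document.

```python
def sensitivity(true_positives, false_negatives):
    """Share of malignant cases that were flagged as malignant."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of non-malignant cases that were correctly not flagged."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts (illustrative only, not study data).
tp, fn, tn, fp = 8, 2, 45, 5
example_sensitivity = sensitivity(tp, fn)
example_specificity = specificity(tn, fp)
```

A tool tuned for triage typically favours sensitivity, accepting more false positives so that fewer malignant cases are missed.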

Clinical validation study of a Computer-aided diagnosis (CADx) system with artificial intelligence algorithms for early non-invasive detection of in vivo cutaneous melanoma.

Objective: To validate that the artificial intelligence algorithm developed by the manufacturer for the identification of cutaneous melanoma in images of lesions taken with a dermoscopic camera achieves the following values:
- AUC greater than 0.8.
- Sensitivity equal to or higher than 80%.
- Specificity equal to or higher than 70%.
Secondary objectives:
- To compare the performance of the artificial intelligence algorithm developed by the manufacturer with the performance of healthcare professionals of different specialisations:
  - Dermatologists.
  - Primary care physicians.
- To validate the usefulness and feasibility of the artificial intelligence algorithm developed by the manufacturer in adverse environments with severe technical limitations, such as lack of instrumentation or internet connection.
Design: This was an observational, cross-sectional case series study conducted to evaluate a diagnostic test. The proposed sample size for the study was 200 participants. By the end of the study, 105 patients were recruited. Although the sample size fell short of the target (200 subjects), the proportion of cutaneous melanoma cases was increased beyond the originally planned percentage (from 20% to 34%). The study duration was five years.
Results:
- Regarding melanoma detection, the medical device achieved a top-1 accuracy of 0.80, a top-3 sensitivity of 0.90, and a top-1 specificity of 0.80.
- For melanoma detection, the device showed an AUC of 0.84, which is considered excellent.
- In the skin recognition analysis, the device achieved a top-1 accuracy of 0.54, a top-3 accuracy of 0.75, and a top-5 accuracy of 0.84.
- Finally, the device demonstrated an AUC of 0.88 in malignancy detection.
Conclusions: The device demonstrates a high ability to predict malignancy and compelling image recognition performance for melanoma and other pigmented skin lesions, such as carcinomas, keratoses, or nevus, with results similar to those from internal validation tests. Regarding melanoma detection, the data collected in this study limit the robustness of the analysis due to class imbalances, challenging diagnoses, and inconsistent image quality. However, the results obtained remain convincing even under such challenging conditions.

Clinical Validation of a Computer-Aided Diagnosis (CAD) System Utilising Artificial Intelligence Algorithms for Continuous and Remote Monitoring of Patient Condition Severity in an Objective and Stable Manner.

Objective: The primary aim of this study is to ascertain the validity of the device, leveraging artificial intelligence developed by the manufacturer, in objectively and reliably tracking the progression of chronic dermatological conditions. This validation is deemed successful if the tool achieves a score of 8 or higher on the Clinical Utility Questionnaire (CUS).
Secondary objectives:
- Confirming that the utilisation of the device elicits a high level of patient satisfaction, particularly in its remote application.
- Demonstrating that the implementation of the device leads to a reduction in face-to-face consultations, thereby optimising healthcare resources and patient convenience.
- Validating the device's ability to consistently generate reliable condition monitoring, thereby establishing its trustworthiness as a monitoring system.
Design: This was a prospective, analytical observational study of a longitudinal series of clinical cases. A total of 160 patients were recruited over 19 months.
Results:
- Initially, 400 patients were considered for inclusion in the study. However, after screening based on predefined criteria, 240 individuals were excluded. Consequently, the final cohort included 160 patients who met the eligibility requirements.
- In the analysis of the Clinical Utility Questionnaire, the overall score, calculated by averaging normalised scores (0 to 100) across specialists and questions, was 71.39. For questions 2, 6, and 10, a score of 100 represents "yes," while 0 represents "no." For question 5, a response of "I have not reduced time" scores 0, while any other response scores 100.
- The Data Utility Questionnaire yielded an aggregate score of 87 ± 16, based on responses averaged across specialists and questions.
- The System Usability Questionnaire scored an average of 87.00, calculated by averaging responses across specialists and questions.
- The Patient Satisfaction Questionnaire achieved an overall score of 70.77, based on averages of patient responses across questions.
Conclusions: The device has proven to be highly effective, safe, and user-friendly for managing chronic dermatological conditions. Positive feedback from specialists and patients highlights its potential as a valuable clinical tool. The device demonstrates significant clinical relevance in dermatology, enabling objective monitoring in skin assessments. Additionally, it streamlines the diagnostic process, reducing the physician's workload. While the device offers substantial benefits, it is emphasised that it should complement, not replace, clinical judgment. Overall, the device promises to be a valuable tool for supporting dermatologists in clinical decision-making.

Project to enhance Dermatology E-Consultations in Primary Care Centres using Artificial Intelligence Tools.

Objective: To validate that the information provided by the device increases the true accuracy of healthcare professionals (HCPs) for the diagnosis of multiple dermatological conditions.
Secondary objectives:
- Reduce and correct the referral of patients with skin pathologies from primary care to dermatology.
- Individualise and improve the ongoing training of primary care physicians in dermatology.
- Offer healthcare adapted to technological innovations.
- Measure the satisfaction of primary care physicians with the device platform.
- Measure the satisfaction of dermatologists with the device platform.
Design: This was a prospective analytical observational study of a series of clinical cases. The study did not include an active or control group, as it focused on evaluating the device in a real-world clinical setting. Primary care physicians (PCPs) served as their own control group by first diagnosing without using the device, followed by confirming or revising their diagnosis using the device. The evaluation criteria included completing questionnaires, submitting photographs, and collecting patient-reported outcomes via the device. The study employed various methods, including data collection through questionnaires, analysis of photographs, and patient satisfaction surveys.
Results:
- Primary care physicians participated in this study. A total of 180 diagnostic reports were collected from 131 patients.
- For specific conditions like hidradenitis suppurativa (HS), the device helped identify two cases that PCPs initially missed, one of which was later confirmed by a dermatologist.
- In urticaria cases, the tool confirmed one diagnosis without additional suggestions.
- For psoriasis, eight cases were confirmed, and the tool suggested psoriasis in 23 cases, leading to 10 diagnoses by PCPs.
- Regarding skin cancer, specifically melanoma, five cases were confirmed through pathology, while dermatologists suspected melanoma in 10 cases.
- With the device, specialists achieved a sensitivity of 60% and a specificity of 91% for melanoma detection. Overall, the tool achieved an AUC of 0.84 for detecting malignant neoplasms (including melanoma and carcinomas).
- Primary care physicians and dermatologists completed a survey on the clinical utility of the device, with responses from eight PCPs and two dermatologists.
Conclusions: The use of the device improved the diagnosis of skin diseases by helping PCPs identify previously undiagnosed cases of hidradenitis suppurativa and psoriasis. For melanoma detection, PCPs using the tool achieved a sensitivity of 60% and a specificity of 91%. The malignancy index of the device reached an AUC of 0.84, demonstrating strong differentiation between malignant and benign cases.

Both PCPs and dermatologists reported high satisfaction with the tool's overall performance, ease of use, diagnostic support, and efficiency in patient management. The platform also received positive feedback for prioritising and managing urgent cases.

Although this study did not analyse diagnostic accuracy improvements among PCPs due to the low volume of dermatological conditions and PCP involvement in this area, future studies will aim to evaluate diagnostic precision improvements with the use of the device.

Non-Invasive Prospective Pilot in a Live Environment for the improvement of the diagnosis of Generalised Pustular Psoriasis

Objective: To validate that the information provided by the device increases the true accuracy of healthcare professionals (HCPs) in the diagnosis of generalised pustular psoriasis (GPP).
Secondary objectives: To validate that the information provided by the device increases the true accuracy of healthcare professionals (HCPs) in the diagnosis of other dermatological skin conditions, such as hidradenitis suppurativa.
Design: This was a prospective cross-sectional study. A total of 15 healthcare professionals were recruited to participate, with each reviewing 100 images. The study lasted four months.
Results:
- On average, diagnostic accuracy increased from 47.91% to 62.81%, representing a relative improvement of 31%. For primary care physicians (PCPs), the improvement was even more pronounced, with a relative increase of 40% in correct diagnoses.
- For generalised pustular psoriasis (GPP), diagnostic accuracy improved by 24.44% with the use of the medical device. This effect was even more substantial in primary care, with a 120% increase in correct GPP diagnoses.
- For other conditions, such as hidradenitis suppurativa (HS) and palmoplantar pustulosis, PCPs correctly diagnosed 12.43% more cases, while dermatologists showed similar improvements with the device.
- Specifically for palmoplantar pustulosis, PCPs demonstrated an outstanding 146% increase in accuracy, while dermatologists maintained their already high performance.
Conclusions: Although the data from dermatologists were not statistically significant for individual pathologies due to the small sample size (only four dermatologists compared to 11 PCPs), this was not a design flaw, as the study focused on primary care. The high level of expertise among dermatologists, particularly in conditions like hidradenitis suppurativa, combined with the average complexity of HS cases, may explain why the tool had less impact for them. However, this does not diminish the potential utility of the tool for other dermatologists.

Overall, the use of the device had a substantial impact, especially in primary care and for rare conditions like generalised pustular psoriasis, significantly improving diagnostic accuracy. This improvement can enhance referral adequacy and reduce the burden on the healthcare system by enabling more effective case management at the primary care level.
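
The results above mix absolute and relative changes, so the distinction is worth making explicit. The sketch below recomputes the relative improvement for the overall accuracy figures reported in this study (47.91% to 62.81%). The function names are illustrative.

```python
def absolute_gain(before, after):
    """Difference in percentage points."""
    return after - before


def relative_improvement(before, after):
    """Change expressed as a percentage of the starting value."""
    return 100.0 * (after - before) / before


# Overall diagnostic accuracy reported in this study, without and with the device.
before, after = 47.91, 62.81
gain_points = round(absolute_gain(before, after), 2)
gain_relative = round(relative_improvement(before, after), 1)
```

The gain of 14.9 percentage points corresponds to a relative improvement of about 31%, which matches the figure reported in the results.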

Non-invasive prospective Pilot in a Live Environment for the improvement of the diagnosis of skin pathologies in primary care

Objective: To validate that the information provided by the device increases the true accuracy of primary care physicians in the diagnosis of multiple dermatological conditions.
Secondary objectives:
- To validate what percentage of cases should be referred according to the HCP with the information provided by the device.
- To validate what percentage of cases could be handled remotely with the information provided by the device.
Design: This was an analytical, observational, and cross-sectional study designed to evaluate whether the use of the medical device by healthcare professionals helps improve diagnostic accuracy for various skin conditions. The study included a single group of participants, consisting of primary care physicians (PCPs) and dermatologists. There was no control group, as the same group of professionals acted as their own control, diagnosing the same images with and without the use of the device. The study lasted four months.
Results:
- Nine primary care physicians reviewed all the images in this study.
- Without using the device, healthcare professionals achieved a diagnostic accuracy of 72.96%. When using it, diagnostic accuracy increased to 82.22%. This improvement was observed across all target conditions except for basal cell carcinoma.
- Regarding referrals, 48.89% of cases did not require a referral.
- For remote consultations, 60.74% of cases were managed effectively without the need for an in-person visit.
Conclusions: The use of the device has proven to be an effective tool for enhancing diagnostic accuracy among primary care physicians, increasing their accuracy rate from 72.96% to 82.22%. While the improvements varied depending on the skin condition, notable gains were observed in cases of hidradenitis suppurativa, urticaria, and actinic keratosis. However, statistical significance was not achieved for all pathologies due to the limited sample size.

The tool also reduced unnecessary referrals, as 48.89% of cases were managed effectively without the need for a specialist consultation, in contrast to previous studies showing higher referral rates to dermatology. Additionally, the device facilitated the remote management of 60.74% of cases, indicating that diagnostic support tools like this can play a key role in promoting efficient remote consultations and optimising triage processes in primary care settings.
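
Because each professional diagnosed the same images with and without the device, the comparison is paired. The sketch below shows one common way to summarise paired binary outcomes and test the change with an exact McNemar test. It is an illustration only: the study summary does not state which statistical test was used, the readings in the example are hypothetical and not study data, and the function names are assumptions.

```python
from math import comb


def paired_summary(without_device, with_device):
    """Return accuracy in each arm plus the discordant pair counts."""
    if len(without_device) != len(with_device):
        raise ValueError("both arms must cover the same readings")
    n = len(without_device)
    pairs = list(zip(without_device, with_device))
    gained = sum(1 for before, after in pairs if not before and after)
    lost = sum(1 for before, after in pairs if before and not after)
    return sum(without_device) / n, sum(with_device) / n, gained, lost


def exact_mcnemar_p(gained: int, lost: int) -> float:
    """Two-sided exact McNemar p-value computed from the discordant pairs."""
    n = gained + lost
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(gained, lost) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# HYPOTHETICAL readings (True = correct diagnosis); not data from the study.
without = [True, False, False, True, False, True, False, True, False, False]
with_dev = [True, True, True, True, False, True, True, True, True, False]

acc_without, acc_with, gained, lost = paired_summary(without, with_dev)
p_value = exact_mcnemar_p(gained, lost)
print(acc_without, acc_with, gained, lost, p_value)
```

With these ten hypothetical readings, accuracy rises from 0.4 to 0.8, yet the exact p-value is 0.125 because only four pairs are discordant. This illustrates how a small sample can leave a large observed improvement short of statistical significance.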

Non-invasive prospective Pilot in a Live Environment for the Improvement of the diagnosis of skin pathologies in primary care and dermatology

Objective: To validate that the information provided by the device increases the true accuracy of healthcare professionals (HCPs) in the diagnosis of multiple dermatological conditions.
Secondary objectives:
- To validate what percentage of cases should be referred according to the HCP with the information provided by the device.
- To validate what percentage of cases could be handled remotely with the information provided by the device.
- To confirm that the use of the medical device is perceived by specialists as being of great clinical utility.
Design: This was a prospective observational study involving 16 healthcare professionals: 10 primary care physicians (PCPs) and 6 dermatologists. Of these, 12 completed the entire process, while the remaining 4 reviewed only some of the images (28, 15, 9, and 1 images, respectively). The study lasted four months.
Results:
- The use of the medical device improved diagnostic accuracy from 68.08% to 88.78%. Among PCPs, accuracy increased from 62.90% to 89.92%, while dermatologists' accuracy improved from 76.47% to 86.93%. Significant improvements were observed for conditions such as tinea, granuloma annulare, and seborrheic keratosis.
- Regarding referrals, 58.1% of cases did not require a referral. This percentage was slightly higher for PCPs (60.89%) and slightly lower for dermatologists (53.59%).
- Experts agreed that acne, herpes, and tinea could be effectively managed remotely, while conditions such as melanoma and nevi required in-person care.
- Additionally, 87% of healthcare professionals considered the tool efficient, reducing consultation times to under 10 minutes.
Conclusions: The device has proven to be a valuable tool for enhancing diagnostic accuracy among both PCPs and dermatologists. The system was particularly effective in improving the diagnosis and management of conditions such as tinea, granuloma annulare, and seborrheic keratosis, while enabling more precise diagnostics across a wide range of skin conditions. Its use significantly reduced the need for referrals, allowing a large proportion of cases to be managed remotely and thereby alleviating the burden on specialised care.

Feedback from healthcare professionals highlighted the tool's utility and efficiency, particularly in supporting remote consultations and streamlining patient management. These findings suggest that the use of the device can play a crucial role in improving diagnostic workflows and optimising healthcare resources in dermatology.
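
The pooled and subgroup percentages reported for this study can be checked against each other. The sketch below assumes that each pooled figure is a reading-weighted average of the PCP and dermatologist figures. The summary does not state this, nor does it give the number of readings per group, so the implied share is an inference and not a reported result. The function name is illustrative.

```python
def implied_share(pooled: float, group_a: float, group_b: float) -> float:
    """Share w of group A such that pooled = w * group_a + (1 - w) * group_b."""
    if group_a == group_b:
        raise ValueError("the share is undefined when both groups are equal")
    return (pooled - group_b) / (group_a - group_b)


# (pooled, PCPs, dermatologists), as reported in the results above.
reported = {
    "accuracy without device": (68.08, 62.90, 76.47),
    "accuracy with device": (88.78, 89.92, 86.93),
    "cases not referred": (58.1, 60.89, 53.59),
}

shares = {name: implied_share(*values) for name, values in reported.items()}
for name, share in shares.items():
    print(f"{name}: implied PCP share of readings = {share:.3f}")
```

All three metrics imply nearly the same PCP share, roughly 0.62, which suggests the pooled and subgroup figures are internally consistent under the stated assumption.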

Analysis of adverse device effects and any history of modification or recall

No adverse reactions to the medical device are known, nor has it been withdrawn for this or any other reason.

Investigational device risk management

Known and foreseeable risks

The risk analysis for the device has been conducted in accordance with the requirements of ISO 14971:2019, ensuring that all associated risks have been identified, assessed, and appropriately mitigated to guarantee safety and regulatory compliance.

After evaluating the risks associated with the device's use, we have identified certain residual risks. The following table summarises these residual risks and the recommended measures for addressing each:

Situation	Recommended course of action
Incorrect clinical information: the care provider receives into their system data that is erroneous	The device must always be used under the supervision of an HCP, who should confirm or validate the output of the device considering the medical history of the patient and any other symptoms they may be experiencing, especially those that are not visible or have not been supplied to the device.
Incorrect diagnosis or follow up: the medical device outputs a wrong result to the HCP	The device must always be used under the supervision of an HCP, who should confirm or validate the output of the device considering the medical history of the patient and any other symptoms they may be experiencing, especially those that are not visible or have not been supplied to the device. Also, we encourage you to review the metadata returned by the device about the output, such as explainability media and other metrics.
Image artefacts/resolution: the medical device receives an input that does not have sufficient quality in a way that affects its performance	The Instructions for Use contain extensive guidance on how to take pictures in a section called `How to take pictures`. We also offer training to users to improve the imaging process so that it is optimal for the device's operation; feel free to request such training from your closest sales representative. Also, we encourage you to pay attention to the information regarding image quality that the device outputs alongside the clinical information.
Data transmission failure from care provider's system: the care provider's system cannot connect to the device to send data	The Instructions for Use contain extensive guidance on how to integrate the device into the care provider's system in a section called `Installation manual`.
Data input failure: the medical device cannot receive data from care providers	The Instructions for Use contain extensive guidance on how to integrate the device into the care provider's system in a section called `Installation manual`.
Data accessibility failure: the care provider cannot receive data from the medical device	The Instructions for Use contain extensive guidance on how to integrate the device into the care provider's system in a section called `Installation manual`.
Data transmission failure: the medical device cannot send data to care providers	The Instructions for Use contain extensive guidance on how to integrate the device into the care provider's system in a section called `Installation manual`.
Inadequate lighting conditions during image capture: the medical device receives an input that does not have sufficient quality	This is similar to risk ID 9, but with a very simple solution: use the flash. If you cannot use the flash and the image is still dark, move to a different environment with better lighting. Also, we encourage you to pay attention to the information regarding image quality that the device outputs alongside the clinical information.

Undesirable effects

There are no known or anticipated undesirable side effects specifically related to the use of the software.

Summary of the benefit-risk analysis including identification of residual risks

The benefit-risk analysis of the device has been conducted by considering the data from the risk assessment and the available evidence on the expected clinical benefits. After applying mitigation strategies, the identified residual risks are deemed acceptable when compared to the advantages the device offers to both patients and healthcare professionals.

This positive balance is based on the device's ability to provide accurate and detailed information that supports the diagnosis of skin diseases, enhancing clinical decision-making and, ultimately, patient care. Furthermore, the design and implemented control measures ensure that risks are minimised to a reasonable level, guaranteeing safety and compliance with applicable regulatory requirements.

Contra-indications

We advise the user not to use the device in the following situations:

Skin structures located at a distance greater than 1 cm from the eye, beyond the optimal range for examination.
Skin areas that are obscured from view, situated within skin folds or concealed in other manners, making them inaccessible for camera examination.
Regions of the skin showcasing scars or fibrosis, indicative of past injuries or trauma.
Skin structures exhibiting extensive damage, characterised by severe ulcerations or active bleeding.
Skin structures contaminated with foreign substances, including but not limited to tattoos and creams.
Skin structures situated at anatomically special sites, such as underneath the nails, requiring special attention.
Portions of skin that are densely covered with hair, potentially obstructing the view and hindering examination.

Warnings for the investigational device

If you notice any malfunction of the device, please inform us as soon as possible. You can use the email address support@legit.health. As the manufacturer, we will take the necessary steps accordingly. Any serious incident must be reported to us as well as to the national competent authority in your country.

It is worth noting that, based on our experience, no adverse side effects specifically related to the use of the software are known or anticipated.

Risk management

The risk analysis for the product has been conducted in accordance with the standard UNE-EN ISO 14971:2020 (Medical devices. Application of risk management to medical devices) and following our procedure GP-013 Risk Management. The entire process prioritises patient and user safety. Wherever possible, inherent risks have been eliminated through improvements in the design of the device or software, such as algorithm optimisation to reduce the likelihood of errors. Additionally, technical safety controls have been implemented, including image quality validations prior to processing, and detailed information has been provided in the Instructions for Use to mitigate risks that cannot be entirely eliminated. Controls have also been established to manage residual risks, such as continuous performance monitoring of the device through post-market surveillance activities.

The risk management plan can be found in the record R-TF-013-001 Risk Management Plan within the product's technical file. The risk analysis, benefit-risk analysis, and identification of residual risks are documented in the records R-TF-013-002 Risk Management Record and R-TF-013-003 Risk Management Report of the corresponding technical file.

Furthermore, contraindications, warnings, precautions, and potential expected adverse events for the investigational product are detailed in the product's Instructions for Use.

Signature meaning

The signatures for the approval process of this document can be found in the verified commits at the repository for the QMS. As a reference, the team members who are expected to participate in this document and their roles in the approval process, as defined in Annex I Responsibility Matrix of the GP-001, are:

Author: Team members involved
Reviewer: JD-003, JD-004
Approver: JD-001
