R-TF-015-006 Clinical investigation report
Research Title
Optimization of clinical flow in patients with dermatological conditions using Legit.Health Plus.
Description
Clinical study validation using Legit.Health to assess diagnostic accuracy and severity assessment. In this study, patients with pigmented lesions and androgenic alopecia from the dermatological clinic IDEI were recruited for analysis.
Product identification
| Information | |
|---|---|
| Device name | Legit.Health Plus (hereinafter, the device) |
| Model and type | NA |
| Version | 1.1.0.0 |
| Basic UDI-DI | 8437025550LegitCADx6X |
| Certificate number (if available) | MDR 792790 |
| EMDN code(s) | Z12040192 (General medicine diagnosis and monitoring instruments - Medical device software) |
| GMDN code | 65975 |
| EU MDR 2017/745 | Class IIb |
| EU MDR Classification rule | Rule 11 |
| Novel product (True/False) | TRUE |
| Novel related clinical procedure (True/False) | TRUE |
| SRN | ES-MF-000025345 |
Sponsor identification and contact
| Manufacturer data | |
|---|---|
| Legal manufacturer name | AI Labs Group S.L. |
| Address | Street Gran Vía 1, BAT Tower, 48001, Bilbao, Bizkaia (Spain) |
| SRN | ES-MF-000025345 |
| Person responsible for regulatory compliance | Alfonso Medela, Saray Ugidos |
| Email | office@legit.health |
| Phone | +34 638127476 |
| Trademark | Legit.Health |
| Authorized Representative | Not applicable (manufacturer is based in EU) |
Identification of the Clinical Investigation Plan (CIP)
| CIP | |
|---|---|
| Title of the clinical investigation | Optimisation of clinical flow in patients with dermatological conditions using Artificial Intelligence |
| Device under investigation | Legit.Health Plus |
| Protocol version | Version 12.0 |
| Date | 2023-12-27 |
| Protocol code | Legit.Health_IDEI_2023 |
| Sponsor | AI Labs Group S.L. |
| Coordinating Investigator | Dr. Miguel Sánchez Viera |
| Principal Investigator(s) | Dr. Miguel Sánchez Viera |
| Investigational site(s) | Instituto de Dermatología Integral (IDEI) |
| Ethics Committee | Comité de Ética de la Investigación con Medicamentos de HM Hospitales (Reference: 24.12.2266-GHM) |
Public Access Database
Please note that the database used in this study is not publicly accessible due to privacy and confidentiality considerations.
Research team
Principal Investigators
- Dr. Miguel Sánchez Viera (Instituto de Dermatología Integral, IDEI)
Collaborating Investigators
- IDEI, Instituto de Dermatología Integral
- Dr. Concetta D'Alessandro
- Dr. Alejandra Capote
- Dr. Pablo Lopez Andina
- Dr. Allison Marie Bell-Smythe Sorg
- Dr. Alejandra Vallejos
- Dr. Isabel del Campo
- Dr. Juliana Machado
- Dr. Raúl Lucas Escobar
- Beatriz Torres
- Legit.Health (AI Labs Group S.L.)
- Alfonso Medela
- Taig Mac Carthy
Investigational site
- Instituto de Dermatología Integral (IDEI)
Compliance Statement
The clinical investigation was performed according to the Clinical Investigation Plan (CIP) and other applicable guidance and regulations. This includes compliance with:
- Harmonized standard UNE-EN ISO 14155:2021
- Regulation (EU) 2017/745 on medical devices (MDR)
- Harmonized standard UNE-EN ISO 13485:2016
- Regulation (EU) 2016/679 (GDPR)
- Spanish Organic Law 3/2018 on the Protection of Personal Data and guarantee of digital rights
All data processing within the device is carried out in accordance with the highest standards of data protection and privacy. Patient information is managed in an encrypted manner to ensure confidentiality and security.
The research team assumes the role of Data Controller, responsible for the collection and management of study data. Legit.Health acts as the Data Processor and is not involved in the processing of patient data.
The storage and transfer of data comply with European data protection regulations. At the conclusion of the study, all information stored in the device will be permanently and securely deleted.
The device employs robust technical and organizational security measures to safeguard personal data against unauthorized access, alteration, loss, or processing.
Report date
October 20, 2024
Report author(s)
The full name, the ID and the signature for the authorship, as well as the approval process of this document, can be found in the verified commits at the repository. This information is saved alongside the digital signature, to ensure the integrity of the document.
Table of contents
- Research Title
- Description
- Product identification
- Sponsor identification and contact
- Identification of the Clinical Investigation Plan (CIP)
- Public Access Database
- Research team
- Compliance Statement
- Report date
- Report author(s)
- Table of contents
- Abbreviations and definitions
- Summary
- Introduction
- Materials and methods
- Results
- Discussion and overall Conclusions
- Ethical considerations
- Investigators and administrative structure of clinical research
- Report annexes
Abbreviations and definitions
- AE: Adverse Event
- AEMPS: Spanish Agency of Medicines and Medical Devices
- AEP: Adverse Reaction to Product
- AUC: Area Under the ROC Curve
- CAD: Computer-Aided Diagnosis
- CMD: Data Monitoring Committee
- CIP: Clinical Investigation Plan
- CUS: Clinical Utility Questionnaire
- DLQI: Dermatology Life Quality Index
- GCP: Standards of Good Clinical Practice
- ICH: International Conference on Harmonisation
- IFU: Instructions For Use
- IRB: Institutional Review Board
- N/A: Not Applicable
- NCA: National Competent Authority
- PI: Principal Investigator
- PPV: Positive Predictive Value
- NPV: Negative Predictive Value
- SAE: Serious Adverse Event
- SAEP: Serious Adverse Event to Product
- SUAEP: Serious and Unexpected Adverse Event to the Product
- SUS: System Usability Scale
Summary
This is an observational study, both prospective and retrospective, of a series of clinical cases designed to validate whether the device, Legit.Health Plus, powered by artificial intelligence, can effectively support diagnostic accuracy and severity assessment of dermatological conditions. The study focused on evaluating patients treated at IDEI, with a total inclusion of 202 patients, covering cases of pigmented lesions and androgenic alopecia. The primary objective was to assess the device's performance in providing diagnostic accuracy comparable to dermatologists and in assessing severity of conditions.
Title
Optimization of clinical flow in patients with dermatological conditions using Legit.Health Plus.
Introduction
Image-based artificial intelligence (AI) holds great potential to enhance diagnostic accuracy in the medical field. During the COVID-19 pandemic, limited access to in-person healthcare services accelerated the adoption of telemedicine, highlighting the importance of AI in triage and decision-making to help professionals manage workloads and improve efficiency. In dermatology, common conditions like pigmented lesions and alopecia require significant resources for triage, clinical evaluation, and follow-up. AI tools can help reduce these demands and optimize workflows.
Advances in image recognition and AI have driven innovations in diagnosing various conditions, including skin disorders. Computer-Aided Diagnosis (CAD) systems and algorithm-based technologies have proven capable of classifying lesion images with expertise comparable to that of a skilled dermatologist.
The primary goal of this study is to validate whether the device improves clinical workflow efficiency and patient care by accurately diagnosing and determining the severity of lesions. This will reduce the need for in-person consultations and associated costs, ensuring patients are directed to the appropriate consultation types. Secondary objectives include reducing wait times for patients with varying degrees of medical urgency, decreasing the number of initial dermatology consultations, improving specialist satisfaction and patient usability, and indirectly benefiting the clinic economically.
This innovation presents a significant opportunity to enhance clinical practice in private clinics, particularly in managing patients with pigmented lesions. By optimizing care processes and clinical workflows, the device has the potential to positively impact the quality of life for patients with these dermatological conditions. Additionally, this advanced technology could facilitate the early detection of severe skin cancer cases, enabling more effective and timely treatment and follow-up for at-risk patients.
Objectives
Hypothesis
Legit.Health improves diagnostic accuracy and severity assessment of dermatological conditions, providing reliable support for dermatologists.
Primary objectives
- To validate that the device increases accuracy during medical diagnosis, during the determination of malignancy and the measurement of the severity, leading to optimisations in the patient care process, such as decreasing time to care or cost.
Secondary objectives
Secondary objectives focus on measuring the diagnostic performance of the device and cost-effectiveness. More specifically:
- To demonstrate that the device improves the ability of healthcare professionals to detect malignant or suspected malignant pigmented lesions.
- To demonstrate that the device improves the ability and accuracy of healthcare professionals in measuring the degree of involvement of patients with female androgenic alopecia.
- To assess the potential cost-effectiveness of the use of Legit.Health medical device in the clinical workflow of dermatology clinics.
Population
Adult patients (≥ 18 years) with skin pathologies seen at IDEI. These patients should be diagnosed with pigmented lesions or androgenetic alopecia.
Sample size
The goal of this study is to assess the impact of the device on clinical decision-making for pigmented lesions and androgenic alopecia. Clinical decision-making is defined as the device's influence on diagnostic accuracy, time to diagnosis, and confidence in treatment decisions. To ensure robust and meaningful conclusions, the sample must reflect real-world clinical scenarios involving these conditions. The inclusion of both retrospective and prospective data for pigmented lesions ensures diversity in lesion presentation, ranging from benign to malignant cases, which is essential for evaluating the device's performance across a broad spectrum of clinical situations.
For pigmented lesions, retrospective data provide insight into historical diagnostic patterns, while prospective data capture more recent cases and reflect the device's utility in real-time clinical scenarios. This approach is particularly relevant because pigmented lesions are a common concern in dermatology consultations, accounting for approximately 20-30% of cases (Moreno et al., 2005), which makes them a critical focus for assessing the device's performance across a broad spectrum of clinical situations.
For alopecia, the distribution of 15 prospective and 15 retrospective cases ensures that both active, evolving cases (prospective) and established cases (retrospective) are evaluated. Such an approach enhances the generalizability of the findings by encompassing different stages and presentations of these conditions.
The sample size was calculated to detect clinically meaningful differences with sufficient statistical power. A total of 90 cases for pigmented lesions (30 prospective and 60 retrospective) and 30 cases for alopecia provides 80% power to detect moderate effect sizes at an alpha level of 0.05. This calculation ensures the ability to identify clinically significant differences in the device's diagnostic accuracy across the included conditions.
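The report does not state which statistical test or effect-size convention underlies the 80% power figure. Purely as an illustration, the sketch below uses a normal approximation for a two-sided one-sample z-test on a standardized effect size. The function name `approx_power` and the effect sizes tried (0.3 and 0.5) are assumptions made for this example, not values taken from the CIP.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n: int, effect_size: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided one-sample z-test (normal approximation).

    `effect_size` is a standardized difference (hypothetical input).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)
    # Probability of rejecting the null in either tail under the alternative.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Planned sample sizes from the text: 30 alopecia cases, 90 pigmented-lesion cases.
for n in (30, 90):
    for d in (0.3, 0.5):  # illustrative effect sizes only
        print(f"n={n}, effect size={d}: approximate power {approx_power(n, d):.2f}")
```

A different test (for example, a comparison of two proportions or of two AUCs) would give different numbers, so this sketch should be read as a way of reasoning about the sample size rather than a reproduction of the original calculation.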
By combining retrospective and prospective data, the study design offers a comprehensive evaluation of the device's impact on clinical decision-making. The inclusion of diverse case types for each condition, coupled with a robust power analysis, ensures statistically reliable and clinically meaningful results. These findings will contribute to a deeper understanding of the device's role in enhancing diagnostic accuracy in dermatology.
Design and methods
Design
This is a prospective observational study with both longitudinal and retrospective case series.
Number of subjects
Prospectively, a minimum of 45 cases will be included:
- 30 with pigmented lesions.
- 15 with androgenic alopecia.
Retrospectively, 60 patients with pigmented lesions and 15 with androgenic alopecia will be included.
The sample size was estimated based on the number of patients that the IDEI Dermatology Unit can care for. The data collected from these patients during the study period will be analyzed, and depending on the results obtained it will be assessed whether it is necessary to extend the sample size to include more patients.
By the time of the report, we have recruited:
- 76 retrospective patients with pigmented lesions (88 lesions).
- 32 prospective patients with pigmented lesions (42 lesions).
- 62 retrospective patients with androgenetic alopecia.
- 34 prospective patients with androgenetic alopecia.
Initiation date
January 25th, 2024.
Completion date
The study concluded on August 23rd, 2024.
Duration
This study estimated a recruitment period of 3 months.
The total duration of the study was estimated to be 6 months, including the previous time for retrospective analysis and the time required after the recruitment of the last subject for closing and editing the database, data analysis, and preparation of the final study report.
The total duration of the study for each participant with pigmented lesions was 1-3 months. The duration for patients with alopecia was 1 day.
Methods
This study employed both a prospective and a retrospective observational analytical design to assess whether the medical device can effectively provide diagnostic support and severity assessment for dermatological conditions. At the time of this report, the investigation included 202 patients with pigmented lesions or androgenic alopecia. Data collection included photograph analysis, severity assessment and the use of questionnaires. The study adhered to strict ethical guidelines, ensuring patient confidentiality and compliance with international standards. Patients were provided with detailed information and gave informed consent. The Python programming language was used for the statistical analysis.
Results
We collected a substantial number of images from 108 patients with pigmented lesions and 96 patients with androgenetic alopecia, combining both retrospective and prospective data.
Acceptance Criteria Verification
The Clinical Investigation Plan specified the following acceptance criteria:
- top-1 accuracy equal to or greater than 61.80% (User Group: Dermatologists)
- top-1 accuracy equal to or greater than 50.00% (User Group: Dermatologists)
- top-3 accuracy equal to or greater than 60.00% (User Group: Dermatologists)
- top-5 accuracy equal to or greater than 80.00% (User Group: Dermatologists)
- AUC (area under the ROC curve) equal to or greater than 80.00% detecting malignancy (User Group: Dermatologists)
- sensitivity equal to or greater than 80.00% detecting malignancy (User Group: Dermatologists)
- specificity equal to or greater than 84.00% detecting malignancy (User Group: Dermatologists)
- PPV (positive predictive value) equal to or greater than 80.00% detecting malignancy (User Group: Dermatologists)
- NPV (negative predictive value) equal to or greater than 95.00% detecting malignancy (User Group: Dermatologists)
- correlation equal to or greater than 50.00% (User Group: Dermatologists)
- unweighted Kappa equal to or greater than 60.00% (User Group: Dermatologists)
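A check against the malignancy-detection criteria listed above can be sketched as a simple threshold comparison. The thresholds are copied from the list; the names `THRESHOLDS` and `check_criteria` are hypothetical. The sketch compares point estimates only, which is an assumption, since this report does not state whether the CIP compares point estimates or confidence bounds.

```python
# Malignancy-detection thresholds copied from the acceptance criteria above.
THRESHOLDS = {
    "auc": 0.80,
    "sensitivity": 0.80,
    "specificity": 0.84,
    "ppv": 0.80,
    "npv": 0.95,
}


def check_criteria(observed: dict[str, float]) -> dict[str, bool]:
    """Return pass/fail for each observed metric that has a threshold."""
    return {
        name: value >= THRESHOLDS[name]
        for name, value in observed.items()
        if name in THRESHOLDS
    }


# Device AUC point estimates reported in this summary.
print(check_criteria({"auc": 0.7338}))  # retrospective images -> {'auc': False}
print(check_criteria({"auc": 0.9669}))  # prospective images   -> {'auc': True}
```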
For pigmented lesions, 87.5% of the retrospective images were dermatoscopic and the rest were clinical, whereas all prospective images were clinical. For alopecia, all images were clinical, with no trichoscopic photographs included. On the retrospective images, the medical device achieved an AUC of 0.7338 (95% CI: [0.5971-0.8554]) in detecting lesion malignancy, while the dermatologists achieved an AUC of 0.7738 (95% CI: [0.6345-0.8908]). On the same images, the medical device achieved a top-5 diagnostic accuracy of 0.4651 (40/86; 95% CI: [0.3563-0.5730]), while the dermatologists achieved a top-3 accuracy of 0.4535 (39/86; 95% CI: [0.3448-0.5618]). When the specific kind of nevus is not taken into account, the medical device achieved a higher top-5 accuracy of 0.7791 (67/86; 95% CI: [0.6897-0.8636]) and the dermatologists achieved a top-3 accuracy of 0.6977 (60/86; 95% CI: [0.5862-0.8072]).
On the prospective images, we evaluated the performance of both the dermatologists and the medical device. In malignancy detection, they achieved AUCs of 0.9430 (95% CI: [0.8132-1.0000]) and 0.9669 (95% CI: [0.8889-1.0000]), respectively. For diagnosis, the dermatologists achieved a top-1 accuracy of 0.2857 (8/28; 95% CI: [0.1250-0.4832]) and the medical device achieved a top-5 accuracy of 0.5000 (14/28; 95% CI: [0.3000-0.7308]). When the specific kind of nevus is not taken into account, these accuracies increase to 0.8214 (23/28; 95% CI: [0.6399-0.9488]) for the dermatologists and 0.8929 (25/28; 95% CI: [0.7500-1.0000]) for the medical device.
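The report does not say how the 95% confidence intervals above were obtained. One common choice for proportions such as 40/86 is a percentile bootstrap, sketched below under that assumption. The function name `bootstrap_ci`, the number of resamples and the seed are all hypothetical, and the resulting interval is not expected to match the reported one exactly.

```python
import random


def bootstrap_ci(successes: int, total: int, n_boot: int = 10_000,
                 level: float = 0.95, seed: int = 0) -> tuple[float, float]:
    """Percentile bootstrap interval for a proportion (illustrative only)."""
    rng = random.Random(seed)
    outcomes = [1] * successes + [0] * (total - successes)
    # Resample the cases with replacement and recompute the proportion each time.
    stats = sorted(
        sum(rng.choices(outcomes, k=total)) / total for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]


low, high = bootstrap_ci(40, 86)
print(f"40/86 = {40 / 86:.4f}, approximate 95% CI [{low:.4f}, {high:.4f}]")
```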
For androgenetic alopecia, we collected 49 retrospective images in addition to 13 previously obtained. The optimized AI model showed a correlation of 0.77 on this earlier dataset. In the prospective test, 34 images were analyzed without parameter tuning, ensuring an unbiased evaluation of the algorithm's performance. The overall accuracy of the model was 47%, while the accuracy of the latest model optimized for female androgenetic alopecia (FAA) was 53%, based on the investigator's scores. This suggests that the device algorithm can still benefit from further data integration and model optimization.
Conclusions
The device's diagnostic capability in distinguishing malignancy is on par with expert dermatologists, not only in teledermatology but also at in-person consultations. This confirms its reliability as a screening tool for malignant ICD-11 categories, helping to prioritize patients based on urgency and direct them to the appropriate specialist or consultation.
Additionally, we observed a strong correlation in Ludwig scores, despite a decline in the prospective trial, which may be attributed to inconsistencies in criteria alignment.
Introduction
Image-based Artificial Intelligence (AI) holds great potential to enhance diagnostic accuracy in the medical field. During the COVID-19 pandemic, limited access to in-person healthcare services accelerated the adoption of telemedicine, highlighting the importance of AI in triage and decision-making to help professionals manage workloads and improve efficiency. In dermatology, common conditions like pigmented lesions and alopecia require significant resources for triage, clinical evaluation, and follow-up. AI tools can help reduce these demands and optimize workflows.
Advances in image recognition and AI have driven innovations in diagnosing various conditions, including skin disorders. Computer-Aided Diagnosis (CAD) systems and algorithm-based technologies have proven capable of classifying lesion images with expertise comparable to that of a skilled dermatologist.
This study evaluates the device, an AI tool developed by AI Labs Group S.L., which aims to provide diagnostic support and severity assessment in dermatology. The tool automatically analyzes images, detects malignant pigmented lesions, assesses severity of conditions, and provides a visual record (photograph) for external experts to review.
The primary goal of this study is to validate whether the device provides diagnostic accuracy comparable to dermatologists and can assess the severity of dermatological conditions. Secondary objectives include assessing the device's effectiveness in detecting malignant lesions and its cost-effectiveness potential in clinical practice.
This innovation presents a significant opportunity to enhance clinical practice in private clinics, particularly in managing patients with pigmented lesions. By providing accurate diagnostic support, the device has the potential to positively impact the quality of care for patients with these dermatological conditions. Additionally, this advanced technology could facilitate the early detection of severe skin cancer cases, enabling more effective and timely treatment and follow-up for at-risk patients.
Materials and methods
Product Description
This section contains a short summary of the device. A complete description of the intended purpose, including device description, can be found in the record Legit.Health Plus description and specifications.
Product description
The device is a computational software-only medical device leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures. Its principal function is to provide a wide range of clinical data from the analyzed images to assist healthcare practitioners in their clinical evaluations and allow healthcare provider organisations to gather data and improve their workflows.
The generated data is intended to aid healthcare practitioners and organizations in their clinical decision-making process, thus enhancing the efficiency and accuracy of care delivery.
The device should never be used to confirm a clinical diagnosis. On the contrary, its result is one element of the overall clinical assessment. Indeed, the device is designed to be used when a healthcare practitioner chooses to obtain additional information to consider a decision.
Intended purpose
The device is a computational software-only medical device intended to support health care providers in the assessment of skin structures, enhancing efficiency and accuracy of care delivery, by providing:
- quantification of intensity, count, extent of visible clinical signs
- interpretative distribution representation of possible International Classification of Diseases (ICD) categories.
Intended previous uses
No specific intended use was designated in prior stages of development.
Product changes during clinical research
The device maintained a consistent performance and features throughout the entire clinical research process. No alterations or modifications were made during this period.
Clinical Investigation Plan (CIP)
This is an observational study, both prospective and retrospective, of a series of clinical cases designed to validate whether the device, powered by artificial intelligence, can effectively provide diagnostic support and severity assessment for dermatological conditions. The first part of the study focused on evaluating patients treated at IDEI, with a total inclusion of 202 patients, covering cases of pigmented lesions and androgenic alopecia. The primary objective was to assess the device's performance in providing diagnostic accuracy comparable to dermatologists and in assessing severity of conditions.
Objectives
The primary objective is to validate that the device increases accuracy during medical diagnosis, during the determination of malignancy and the measurement of the severity, leading to optimisations in the patient care process, such as decreasing time to care or cost. Secondary objectives include demonstrating the device's effectiveness in detecting malignant lesions.
Design
This is an observational study, both prospective and retrospective, focusing on a series of clinical cases. The study did not include an active or control group, as it aimed to evaluate the performance of the device in a real-world clinical setting. The assessment relied on photograph submissions through the device platform, with the study centered on analyzing these images. Additionally, retrospective images taken outside the device platform were also included and analyzed separately as part of the retrospective study.
Ethical considerations
This study adhered to international Good Clinical Practice (GCP) guidelines, the Declaration of Helsinki in its latest amendment, and applicable international and national regulations. As applicable, approval from the relevant Ethics Committee was obtained prior to the initiation of the study. When applicable, modifications to the protocol were reviewed and approved by the Principal Investigator (PI) and subsequently evaluated by the Ethics Committee before subjects were enrolled under a modified protocol.
This study was conducted in compliance with European Regulation 2016/679, of 27 April, concerning the protection of natural persons with regard to the processing of personal data and the free movement of such data (General Data Protection Regulation, GDPR), and Organic Law 3/2018, of 5 December, on the Protection of Personal Data and the guarantee of digital rights. In accordance with these regulations, no data enabling the personal identification of participants was collected, and all information was managed securely in an encrypted format.
Participants were informed both orally and in writing about all relevant aspects of the study, with the information being tailored to their level of understanding. They were provided with a copy of the informed consent form and the accompanying patient information sheet. Adequate time was given to patients to ask questions and fully comprehend the details of the study before providing their consent.
The PI was responsible for the preparation of the informed consent form, ensuring it included all elements required by the International Conference on Harmonisation (ICH), adhered to current regulatory guidelines, and complied with the ethical principles of GCP and the Declaration of Helsinki.
The original signed informed consent forms were securely stored in a restricted access area under the custody of the PI. These documents remained at the research site at all times. Participants were provided with a copy of their signed consent form for their records.
Data quality assurance
The Principal Investigator is responsible for reviewing and approving the study protocol and its possible modifications in the future, signing the Principal Investigator's commitment, guaranteeing that the persons involved in the centre will respect the confidentiality of patient information and protect personal data, and reviewing and approving the final study report. All members of the research team will assess the eligibility of the study patients, inform and request written informed consent, collect the study source data in the clinical record and transfer them to the Data Collection Forms (DCF).
Subject population
The study enrolled patients that fulfilled the following criteria:
Inclusion criteria
- Patients aged 18 years or older.
- Patients with pigmented lesions who meet any of the following conditions:
- Who consult for the first time for any pigmented lesion.
- Patients who have already had a dermoscopy appointment for the first time or a check-up of pigmented lesions.
- Women with androgenic alopecia.
Exclusion criteria
- Patients who at the investigator's discretion cannot or will not comply with the study procedures.
The study will prospectively include a minimum of 45 cases: 30 with pigmented lesions and 15 with androgenic alopecia. In the retrospective analysis, 60 patients with pigmented lesions and 15 with androgenic alopecia will also be included.
Treatment
Patients in this study did not receive any specific treatment as part of the research protocol.
Concomitant medication/treatment
Patients continued their regular prescribed medications and treatments as directed by their primary healthcare providers. No additional medications or treatments were administered as part of this study.
Follow-Up Duration
This study did not require a follow-up of the subjects. Each patient only had their skin lesions photographed at the time of the visit.
Statistical analysis
For the evaluation of diagnostic performance, the pathological examination served as the gold standard across both retrospective and prospective studies. Several statistical techniques were employed to analyze and compare the performance of dermatologists and the medical device. Area Under the Curve (AUC), sensitivity, specificity, and positive and negative predictive values (PPV/NPV) were calculated to assess the diagnostic performance of dermatologists and the medical device in detecting malignancy.
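The malignancy metrics named above can be computed from binary labels and scores as in the following minimal sketch. It assumes that 1 denotes a pathology-confirmed malignant lesion and 0 a benign one, that both classes are present, and that binary predictions come from a 0.5 cut-off on the score. The toy data, the cut-off and the function names are hypothetical and are not taken from the study.

```python
def confusion_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = malignant)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(y_true: list[int], scores: list[float]) -> float:
    """AUC as the probability that a malignant case outscores a benign one.

    Ties count as half a win (equivalent to the Mann-Whitney U statistic).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.6, 0.05]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 cut-off

print(confusion_metrics(y_true, y_pred))
print(f"AUC = {auc(y_true, scores):.4f}")
```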
For skin lesion recognition, top-K accuracy was calculated for both the dermatologists and the medical device. This metric measures how often the correct diagnosis is among the top K predictions. For this study, K was set to 1, 3, and 5. Differences in diagnostic performance between the retrospective and prospective studies were also assessed, and improvements were attributed to the homogeneity of the prospective lesion dataset and the assistance provided by the medical device.
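Top-K accuracy as described above can be sketched as follows, assuming each case has a ranked list of predicted diagnoses (most likely first) and a single reference diagnosis. The function name `top_k_accuracy` and the example labels are hypothetical.

```python
def top_k_accuracy(ranked_predictions: list[list[str]],
                   truths: list[str], k: int) -> float:
    """Fraction of cases whose reference diagnosis is among the first k predictions."""
    hits = sum(truth in preds[:k]
               for preds, truth in zip(ranked_predictions, truths))
    return hits / len(truths)


# Toy example (hypothetical data): four lesions, three ranked predictions each.
predictions = [
    ["nevus", "melanoma", "seborrheic keratosis"],
    ["melanoma", "nevus", "basal cell carcinoma"],
    ["seborrheic keratosis", "nevus", "melanoma"],
    ["nevus", "dermatofibroma", "melanoma"],
]
truths = ["nevus", "nevus", "melanoma", "basal cell carcinoma"]

for k in (1, 3):
    print(f"top-{k} accuracy = {top_k_accuracy(predictions, truths, k):.2f}")
# top-1 accuracy = 0.25
# top-3 accuracy = 0.75
```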
For the analysis of androgenic alopecia, a correlation analysis was performed to compare the investigator's Ludwig score and the algorithm developed to assess the severity of androgenic alopecia. Additionally, unweighted Kappa coefficients were calculated to assess inter-rater agreement between the device's predicted Ludwig score and the investigator's assessment, providing a measure of agreement beyond what would be expected by chance.
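The two agreement measures mentioned above can be sketched as follows. The report does not specify the type of correlation, so Pearson's coefficient is used here as an assumption. Ludwig grades are encoded as integers 1 to 3, and the example scores are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def cohen_kappa(rater1: list[int], rater2: list[int]) -> float:
    """Unweighted Cohen's kappa: agreement beyond what chance would produce."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    # Chance agreement from the product of each rater's marginal frequencies.
    expected = sum(c1[label] * c2[label] for label in c1) / n ** 2
    return (observed - expected) / (1 - expected)


# Toy example (hypothetical data): Ludwig grades from investigator and device.
investigator = [1, 1, 2, 2, 3, 3, 1, 2]
device = [1, 2, 2, 2, 3, 2, 1, 3]

print(f"Pearson correlation = {pearson(investigator, device):.3f}")
print(f"Unweighted kappa    = {cohen_kappa(investigator, device):.3f}")
```

Because the unweighted kappa treats a one-grade disagreement the same as a two-grade disagreement, it can be noticeably lower than the correlation on an ordinal scale such as Ludwig.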
Results
Initiation and completion date
The study started on 2024-01-25 and included 202 subjects. It concluded on 2024-08-23.
Subject and investigational product management
This study included 202 patients treated at IDEI. It included 76 retrospective patients with pigmented lesions (88 lesions), 32 prospective patients with pigmented lesions (42 lesions), 62 retrospective patients with androgenetic alopecia and 34 prospective patients with androgenetic alopecia.
The investigational products were stored and handled following strict protocols. This included proper storage conditions, handling procedures, and documentation of product usage. The accountability and traceability of investigational products were rigorously maintained throughout the study.
Subject demographics
All participants in this study were from Spain and were Caucasian.
Clinical Investigation Plan (CIP) compliance
The study adhered to all aspects outlined in the CIP. This ensured that the research was conducted in accordance with established protocols, procedures, and ethical standards. Any deviations from the CIP were duly documented and appropriately addressed. The compliance with the CIP was rigorously monitored throughout the study to uphold the integrity and validity of the research findings.
Protocol Deviations
The study experienced some protocol deviations from the original Clinical Investigation Plan, all of which were documented and appropriately addressed:
1. Increased Sample Size (Positive Deviation): The original CIP specified a minimum target of 90 cases for pigmented lesions (30 prospective and 60 retrospective) and 30 cases for alopecia (15 prospective and 15 retrospective), totaling 120 subjects. However, due to the availability of additional eligible patient data within the IDEI database that met all inclusion criteria, the study team expanded the dataset to 202 total subjects (76 retrospective pigmented lesions, 32 prospective pigmented lesions, 62 retrospective alopecia, and 34 prospective alopecia). This positive deviation resulted in enhanced statistical power and more robust evidence for the device's performance across diverse clinical presentations, thereby strengthening the scientific conclusions of the investigation.
2. Image Exclusions Due to Diagnostic Confirmation Limitations (Minor Deviation): During the retrospective analysis, 1 image was excluded from the final pathology diagnosis analysis due to ambiguity regarding which lesion within a multi-lesion image should be examined. Additionally, in the prospective analysis, 15 out of 42 lesions (36%) initially assessed for pathology diagnosis could not be included in the diagnosis accuracy evaluation due to lack of confirmed pathological examination results. These cases, while included in descriptive statistics and safety analysis, were excluded from diagnosis accuracy comparisons where pathological confirmation served as the gold standard. This reflects practical limitations of retrospective and prospective clinical practice, where not all lesions are biopsied: biopsies are performed only when clinical suspicion warrants histopathological confirmation.
Both deviations were appropriately documented and did not compromise the scientific validity or ethical integrity of the investigation.
Analysis
Pigmented lesions
Introduction
To validate the performance of the device in distinguishing between malignant and benign skin lesions, we conducted both retrospective and prospective studies.
Datasets
The dataset includes 88 images sourced from 76 distinct retrospective patients, of which 77 images are dermoscopic and 11 are clinical. Each lesion has only one image. Of the clinical images, 10 were manually cropped to focus on the lesion area and improve the precision of the medical device analysis. One additional retrospective clinical image, which falls outside the total set of 88, was excluded from this report due to ambiguity regarding which lesion within the image should be examined.
The dataset also includes 120 images of 42 lesions sourced from 32 different prospective patients. Each lesion has up to 3 images. All of these prospective images are clinical. Each prospective lesion is also accompanied by the dermatologist's recommendation regarding its excision.
Methodology
We used the ICD-11 categories to calculate the probability of malignancy by summing the probabilities of categories identified as malignant. This approach is based on the post-processing of the output from an image-based recognition model for visible ICD categories, rather than an independent algorithm.
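The post-processing step described above can be sketched as follows. This is a minimal illustration, not the device's implementation: the category labels, the contents of `MALIGNANT_CATEGORIES`, and the example probabilities are hypothetical, since the report does not list the ICD-11 categories that the device flags as malignant.

```python
from typing import Dict, Set

# Hypothetical set of categories treated as malignant (illustration only;
# the device's actual ICD-11 category list is not given in this report).
MALIGNANT_CATEGORIES: Set[str] = {
    "melanoma",
    "basal cell carcinoma",
    "squamous cell carcinoma",
}


def malignancy_score(
    probabilities: Dict[str, float],
    malignant: Set[str] = MALIGNANT_CATEGORIES,
) -> float:
    """Sum the predicted probabilities of the categories flagged as malignant."""
    score = sum(p for category, p in probabilities.items() if category in malignant)
    # Clamp to [0, 1] to absorb floating-point rounding.
    return min(max(score, 0.0), 1.0)


# Hypothetical classifier output for one image (probabilities sum to 1).
example_output = {
    "melanocytic nevus": 0.55,
    "seborrheic keratosis": 0.15,
    "melanoma": 0.20,
    "basal cell carcinoma": 0.10,
}
score = malignancy_score(example_output)
print(f"malignancy score = {score:.2f}")
```

Because the score is a sum over an existing probability vector, it requires no additional model; only the list of malignant categories has to be maintained.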
Malignancy scores were calculated for each retrospective and prospective image. Dermatologists diagnosed the cases, and those suspected of skin cancer were biopsied and confirmed through pathological examination, which served as the gold standard. Additionally, investigators assigned a suspicion score from 0 to 10 based on their clinical judgment. These suspicion scores, along with the diagnoses, were used to determine the sensitivity and specificity of the system.
Unlike the output of the medical device, diagnoses from dermatologists and pathological examinations are recorded as free text and do not necessarily adhere to the ICD-11 international classification standard. To enable comparison and analysis, these diagnoses were manually mapped to their closest matching ICD-11 categories among those recognized by the medical device. However, in some cases this mapping lacks the precision needed for an exact match. For instance, a dermatologist's diagnosis of carcinoma does not align exactly with a pathological examination identifying squamous cell carcinoma. Although both are malignant, the two outcomes are counted as a mismatch in the diagnosis evaluation.
Androgenetic alopecia
Introduction
To estimate the performance of the device algorithm in assessing female androgenetic alopecia (FAA) by automatically computing the Ludwig score, two analyses were conducted:
- Retrospective analysis: This analysis utilized all 62 images provided for the initial retrospective study. These images were used to search for the best hyperparameters for the neural networks to extract the Ludwig score.
- Prospective analysis: This analysis involved 34 images set aside for prospective evaluation. These images were not used to tune the model, ensuring an unbiased assessment of the model's performance.
Datasets
The dataset comprises 96 images of patients with varying degrees of FAA, collected by expert dermatologists. The dataset is divided as follows:
- 13 images initially received.
- 49 images for retrospective analysis.
- 34 images for prospective analysis.
The first two groups were used to tune the device models for predicting the Ludwig score, and the third set was used for model evaluation.
Methodology
The algorithm designed to determine the Ludwig score is composed of three parts:
- Head cropper: Crops the area of the head from the image.
- Scalp and alopecia segmentation: Segments the total scalp and the part affected by alopecia.
- Ludwig score computation: Computes the Ludwig score.
Head Detector
An object detector based on the YOLO architecture was employed to identify and predict the bounding box of the head in the input image, focusing on regions critical for estimating the severity of alopecia.
Scalp and alopecia segmentation
A ResNet50 encoder extracts features from the image, which are then input for the decoder forming a UNet. This UNet segments the scalp and areas of hair loss. This model was trained on large external datasets covering various cases of alopecia with different degrees, perspectives, illumination, and resolution.
Ludwig score computation
After cropping and segmentation, the percentage of alopecia predicted by the model is calculated. The Ludwig score is derived from this alopecia percentage, which is computed with the following equation (Equation 1):

$$\text{Alopecia percentage} = \frac{N_{\text{scalp}} - N_{\text{hair}}}{N_{\text{scalp}}} \times 100 \qquad (1)$$

Where:
- $N_{\text{scalp}}$ is the total number of pixels that cover the scalp in the image.
- $N_{\text{hair}}$ is the number of pixels covered with hair.
The counts $N_{\text{scalp}}$ and $N_{\text{hair}}$ depend on the threshold used to convert the logits to their categorical prediction, which affects the Ludwig score. Additionally, the head cropper's hyperparameters influence the pixel counts. To determine the optimal hyperparameters, we used two search methods: grid search and Bayesian optimization. Grid search explores the configuration space exhaustively, while Bayesian optimization uses a probabilistic model of the objective to concentrate the search on the most promising configurations.
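The pixel-counting step can be illustrated with the sketch below. It assumes that the alopecia percentage is the share of scalp pixels not covered with hair, and that the segmentation outputs are per-pixel logits converted to masks with a sigmoid and a threshold. The function names, the toy logits, and in particular the grade cut-offs (25% and 45%) are hypothetical; the report does not state the cut-offs used by the device.

```python
import numpy as np


def logits_to_mask(logits: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Convert per-pixel logits to a boolean mask via sigmoid + threshold."""
    probabilities = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return probabilities >= threshold


def alopecia_percentage(scalp_mask: np.ndarray, hair_mask: np.ndarray) -> float:
    """Percentage of scalp pixels that are not covered with hair."""
    scalp = np.asarray(scalp_mask, dtype=bool)
    hair = np.asarray(hair_mask, dtype=bool) & scalp  # count hair only on the scalp
    n_scalp = int(scalp.sum())
    if n_scalp == 0:
        raise ValueError("scalp mask is empty")
    n_hair = int(hair.sum())
    return 100.0 * (n_scalp - n_hair) / n_scalp


def ludwig_grade(percentage: float, cutoffs=(25.0, 45.0)) -> int:
    """Map a percentage to grade 1-3 using HYPOTHETICAL cut-offs."""
    return 1 + sum(percentage >= cutoff for cutoff in cutoffs)


# Toy 4x4 example: the top row is background, rows 1-2 are covered with hair.
scalp_logits = np.full((4, 4), 3.0)
scalp_logits[0, :] = -3.0
hair_logits = np.full((4, 4), -2.0)
hair_logits[1:3, :] = 2.0

pct = alopecia_percentage(logits_to_mask(scalp_logits), logits_to_mask(hair_logits))
grade = ludwig_grade(pct)
print(f"alopecia = {pct:.1f}%, grade = {grade}")
```

Changing the `threshold` argument of `logits_to_mask` changes both pixel counts, which is why the threshold is one of the hyperparameters tuned in the retrospective phase.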
Results and discussion
Pigmented lesions
The evaluation of diagnostic performance for pigmented lesions relied on pathological examination as the gold standard. In the following section, we present the findings from this evaluation, encompassing both retrospective and prospective studies.
Retrospective analysis
We conducted a thorough inspection of malignancy estimation performance of the dermatologists and the device by computing the sensitivity, specificity, F1 score, positive predictive value (PPV), and negative predictive value (NPV) for several malignancy thresholds. These results showed the superior performance of the device in terms of specificity and PPV.
| Threshold | Sensitivity (derm.) | Specificity (derm.) | PPV (derm.) | NPV (derm.) | F1 (derm.) | Sensitivity (device) | Specificity (device) | PPV (device) | NPV (device) | F1 (device) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.00 | 1.0000 | 0.0260 | 0.2424 | 1.0000 | 0.3902 | 1.0000 | 0.0000 | 0.2376 | 0.0000 | 0.3840 |
| 0.05 | 1.0000 | 0.0260 | 0.2424 | 1.0000 | 0.3902 | 0.8750 | 0.3636 | 0.3000 | 0.9032 | 0.4468 |
| 0.10 | 0.8750 | 0.3377 | 0.2917 | 0.8966 | 0.4375 | 0.7500 | 0.5195 | 0.3273 | 0.8696 | 0.4557 |
| 0.15 | 0.8750 | 0.3377 | 0.2917 | 0.8966 | 0.4375 | 0.7083 | 0.6234 | 0.3696 | 0.8727 | 0.4857 |
| 0.20 | 0.8750 | 0.5584 | 0.3818 | 0.9348 | 0.5316 | 0.6667 | 0.6883 | 0.4000 | 0.8689 | 0.5000 |
| 0.25 | 0.8750 | 0.5584 | 0.3818 | 0.9348 | 0.5316 | 0.6667 | 0.7273 | 0.4324 | 0.8750 | 0.5246 |
| 0.30 | 0.8750 | 0.5844 | 0.3962 | 0.9375 | 0.5455 | 0.6250 | 0.7792 | 0.4687 | 0.8696 | 0.5357 |
| 0.35 | 0.8750 | 0.5844 | 0.3962 | 0.9375 | 0.5455 | 0.5833 | 0.7922 | 0.4667 | 0.8592 | 0.5185 |
| 0.40 | 0.8750 | 0.5844 | 0.3962 | 0.9375 | 0.5455 | 0.5417 | 0.7922 | 0.4483 | 0.8472 | 0.4906 |
| 0.45 | 0.8750 | 0.5844 | 0.3962 | 0.9375 | 0.5455 | 0.5000 | 0.8312 | 0.4800 | 0.8421 | 0.4898 |
| 0.50 | 0.6667 | 0.7273 | 0.4324 | 0.8750 | 0.5246 | 0.4583 | 0.8312 | 0.4583 | 0.8312 | 0.4583 |
| 0.55 | 0.6667 | 0.7273 | 0.4324 | 0.8750 | 0.5246 | 0.4583 | 0.8701 | 0.5238 | 0.8375 | 0.4889 |
| 0.60 | 0.6667 | 0.7273 | 0.4324 | 0.8750 | 0.5246 | 0.4583 | 0.8961 | 0.5789 | 0.8415 | 0.5116 |
| 0.65 | 0.6667 | 0.7273 | 0.4324 | 0.8750 | 0.5246 | 0.3333 | 0.8961 | 0.5000 | 0.8118 | 0.4000 |
| 0.70 | 0.6250 | 0.7922 | 0.4839 | 0.8714 | 0.5455 | 0.3333 | 0.9351 | 0.6154 | 0.8182 | 0.4324 |
| 0.75 | 0.6250 | 0.7922 | 0.4839 | 0.8714 | 0.5455 | 0.3333 | 0.9481 | 0.6667 | 0.8202 | 0.4444 |
| 0.80 | 0.5417 | 0.8701 | 0.5652 | 0.8590 | 0.5532 | 0.3333 | 0.9481 | 0.6667 | 0.8202 | 0.4444 |
| 0.85 | 0.5417 | 0.8701 | 0.5652 | 0.8590 | 0.5532 | 0.2917 | 0.9740 | 0.7778 | 0.8152 | 0.4242 |
| 0.90 | 0.5417 | 0.8961 | 0.6190 | 0.8625 | 0.5778 | 0.1250 | 0.9740 | 0.6000 | 0.7812 | 0.2069 |
| 0.95 | 0.5417 | 0.8961 | 0.6190 | 0.8625 | 0.5778 | 0.0417 | 0.9870 | 0.5000 | 0.7677 | 0.0769 |
| 1.00 | 0.0000 | 1.0000 | 0.0000 | 0.7624 | 0.0000 | 0.0000 | 1.0000 | 0.0000 | 0.7624 | 0.0000 |

The best malignancy threshold was chosen according to the optimal Youden's index (J). The device performance was maximal at a threshold of 0.30.
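As a check on this selection, the sketch below computes Youden's index, J = sensitivity + specificity − 1, for every threshold using the device columns of the table above and picks the threshold with the largest J. The variable names are illustrative.

```python
# Device sensitivity and specificity per threshold, copied from the
# retrospective threshold table (thresholds 0.00 to 1.00 in steps of 0.05).
thresholds = [round(0.05 * i, 2) for i in range(21)]
device_sensitivity = [
    1.0000, 0.8750, 0.7500, 0.7083, 0.6667, 0.6667, 0.6250,
    0.5833, 0.5417, 0.5000, 0.4583, 0.4583, 0.4583, 0.3333,
    0.3333, 0.3333, 0.3333, 0.2917, 0.1250, 0.0417, 0.0000,
]
device_specificity = [
    0.0000, 0.3636, 0.5195, 0.6234, 0.6883, 0.7273, 0.7792,
    0.7922, 0.7922, 0.8312, 0.8312, 0.8701, 0.8961, 0.8961,
    0.9351, 0.9481, 0.9481, 0.9740, 0.9740, 0.9870, 1.0000,
]

# Youden's index for each threshold.
youden = [se + sp - 1.0 for se, sp in zip(device_sensitivity, device_specificity)]

# Index of the threshold with the largest J (first one wins on ties).
best_index = max(range(len(thresholds)), key=youden.__getitem__)
best_threshold = thresholds[best_index]
best_j = youden[best_index]
print(f"best threshold = {best_threshold:.2f}, J = {best_j:.4f}")
# prints: best threshold = 0.30, J = 0.4042
```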
The results using that threshold are presented below. They show similar malignancy estimation performance between dermatologists and the medical device in terms of AUC. They also show that dermatologists tend to flag lesions as malignant more often; this more conservative approach could lead to increased resource utilization in clinical dermatology practice.
Note on Dermatologist Results: The dermatologist performance metrics presented in the following table represent the aggregate results from multiple board-certified dermatologists involved in the study (Dr. Sánchez Viera and his clinical team at IDEI). When comparing device performance to dermatologist performance, the dermatologist results shown are the mean assessments across these experienced clinicians, reflecting the typical clinical decision-making of specialists at a dermatology center when using the device as a diagnostic aid. In the prospective analysis, dermatologists benefited from the device's output when making their assessments. The device demonstrated competitive or superior performance compared to this collective clinical judgment, particularly in specificity and sensitivity for malignancy detection.
| Metric name | Dermatologist | Device |
|---|---|---|
| Specificity | 0.5844: 45/77 (95% CI: [0.4494-0.7180]) | 0.7792: 60/77 (95% CI: [0.6867-0.8625]) |
| PPV | 0.3962: 21/53 (95% CI: [0.2340-0.5661]) | 0.4687: 15/32 (95% CI: [0.2857-0.6389]) |
| NPV | 0.9375: 45/48 (95% CI: [0.8599-1.0000]) | 0.8696: 60/69 (95% CI: [0.7692-0.9565]) |
| F1 score | 0.5455 (95% CI: [0.3582-0.6977]) | 0.5357 (95% CI: [0.3556-0.6792]) |
| Malignancy AUC | 0.7738 (95% CI: [0.6345-0.8908]) | 0.7338 (95% CI: [0.5971-0.8554]) |
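The point estimates in this table can be reproduced from the confusion counts implied by its fractions. The sketch below assumes the counts at threshold 0.30 are TP = 15, FP = 17, TN = 60, FN = 9 for the device (from 15/32, 60/69 and 60/77) and TP = 21, FP = 32, TN = 45, FN = 3 for the dermatologists (from 21/53, 45/48 and 45/77). The class name is illustrative, and confidence intervals are not reproduced because the report does not state how they were computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion counts, with malignant as the positive class."""
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def f1(self) -> float:
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# Counts inferred from the fractions reported in the table (assumption).
device = Confusion(tp=15, fp=17, tn=60, fn=9)
dermatologist = Confusion(tp=21, fp=32, tn=45, fn=3)

for name, c in (("device", device), ("dermatologist", dermatologist)):
    print(
        f"{name}: sensitivity={c.sensitivity:.4f} specificity={c.specificity:.4f} "
        f"PPV={c.ppv:.4f} NPV={c.npv:.4f} F1={c.f1:.4f}"
    )
```

With these counts the device sensitivity is 15/24 = 0.6250 and the dermatologist sensitivity is 21/24 = 0.8750, matching the 0.30 row of the threshold table.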
For the analysis of skin disease classification, we discarded two samples from the set of 88 images that did not have valid results from the pathological or dermatology examination. Despite not achieving particularly high diagnostic accuracy, the analysis reveals comparable performance between dermatologists and the medical device, as shown in the table below. Note that, for this evaluation, dermatologists only provide up to 3 diagnosis results.
| | Top-1 Accuracy | Top-3 Accuracy | Top-5 Accuracy |
|---|---|---|---|
| Dermatologist | 0.3256: 28/86 (95% CI: [0.2258-0.4301]) | 0.4535: 39/86 (95% CI: [0.3488-0.5663]) | -- |
| Medical device | 0.2326: 20/86 (95% CI: [0.1412-0.3295]) | 0.3837: 33/86 (95% CI: [0.2809-0.4943]) | 0.4651: 40/86 (95% CI: [0.3625-0.5747]) |
A detailed evaluation of the diagnostic results reveals that 36 out of the 86 valid samples (42%) correspond to different types of nevus. Among these, dermatologists and the medical device incorrectly classify the specific type of nevus in 24 and 27 of the 36 cases, respectively. To provide a broader view of the diagnosis performance, we relaxed the evaluation criteria, considering any nevus diagnosis as correct when a nevus is identified, irrespective of its specific type. With this generalized approach, the number of misclassifications drops to 2 for the dermatologists and 0 for the medical device. This adjustment leads to a substantial improvement in performance for both, with the medical device's top-3 and top-5 accuracy slightly surpassing the dermatologists' top-3 accuracy.
| | Top-1 Accuracy | Top-3 Accuracy | Top-5 Accuracy |
|---|---|---|---|
| Dermatologist | 0.5581: 48/86 (95% CI: [0.4382-0.6778]) | 0.6977: 60/86 (95% CI: [0.5862-0.8072]) | -- |
| Medical device | 0.5000: 43/86 (95% CI: [0.3908-0.6071]) | 0.7093: 61/86 (95% CI: [0.6071-0.8046]) | 0.7791: 67/86 (95% CI: [0.6897-0.8636]) |
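The strict and relaxed evaluations reported in the two tables above can be sketched as follows. This is an illustration under assumptions: the labels and ranked predictions are hypothetical toy data, and `group_nevus` is a hypothetical helper that collapses every nevus subtype into a single class, mirroring the relaxed criterion.

```python
from typing import Callable, Sequence


def top_k_accuracy(
    truth: Sequence[str],
    ranked_predictions: Sequence[Sequence[str]],
    k: int,
    normalize: Callable[[str], str] = lambda label: label,
) -> float:
    """Share of cases whose true label appears among the first k predictions."""
    hits = 0
    for true_label, ranked in zip(truth, ranked_predictions):
        candidates = {normalize(label) for label in ranked[:k]}
        hits += normalize(true_label) in candidates
    return hits / len(truth)


def group_nevus(label: str) -> str:
    """Relaxed criterion: treat every nevus subtype as the same class."""
    return "nevus" if "nevus" in label.lower() else label


# Hypothetical toy data (not taken from the study).
truth = [
    "dysplastic nevus",
    "basal cell carcinoma",
    "compound nevus",
    "seborrheic keratosis",
]
ranked_predictions = [
    ["junctional nevus", "melanoma", "dysplastic nevus"],
    ["basal cell carcinoma", "actinic keratosis", "squamous cell carcinoma"],
    ["blue nevus", "dermatofibroma", "lentigo"],
    ["lentigo", "seborrheic keratosis", "melanoma"],
]

strict_top1 = top_k_accuracy(truth, ranked_predictions, k=1)
strict_top3 = top_k_accuracy(truth, ranked_predictions, k=3)
relaxed_top1 = top_k_accuracy(truth, ranked_predictions, k=1, normalize=group_nevus)
relaxed_top3 = top_k_accuracy(truth, ranked_predictions, k=3, normalize=group_nevus)
print(strict_top1, strict_top3, relaxed_top1, relaxed_top3)
```

In the toy data, grouping the subtypes turns two nevus cases from misses into top-1 hits, which is the same mechanism behind the improvement between the two tables.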
Visually inspecting the retrospective images, we observe that most were captured using a dermatoscope, resulting in higher image quality compared to standard smartphone photos. However, many images failed to centre the lesion of interest or were obscured by substantial hair coverage. These image quality issues (not limitations of the device algorithm itself, but rather artifacts present in the historical dataset) caused the device to potentially focus on analyzing surrounding skin rather than the lesion of interest, affecting diagnostic accuracy in the retrospective cohort. Importantly, these image capture issues were not present in the prospective dataset, where images were acquired under device-guided protocols following the Instructions for Use (IFU). This explains the significant performance improvement observed in the prospective analysis: when images meet proper acquisition standards (appropriate lighting, focus, lesion centring, minimal obstruction), the device performance substantially improves, demonstrating that image quality and adherence to IFU guidelines are critical factors for device performance.
Lesion not centred in the image:
Lesion covered by hair:
Prospective analysis
The prospective analysis evaluates the performance of the dermatologists and the medical device. As with the retrospective sample, we conducted an in-depth exploration of sensitivity, specificity, PPV, NPV and F1 score for a wide range of malignancy thresholds.
| Threshold | Sensitivity (derm.) | Specificity (derm.) | PPV (derm.) | NPV (derm.) | F1 (derm.) | Sensitivity (device) | Specificity (device) | PPV (device) | NPV (device) | F1 (device) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.00 | 1.0000 | 0.3235 | 0.2581 | 1.0000 | 0.4103 | 1.0000 | 0.0000 | 0.1905 | 0.0000 | 0.3200 |
| 0.05 | 1.0000 | 0.3235 | 0.2581 | 1.0000 | 0.4103 | 1.0000 | 0.6471 | 0.4000 | 1.0000 | 0.5714 |
| 0.10 | 0.8750 | 0.8529 | 0.5833 | 0.9667 | 0.7000 | 1.0000 | 0.7353 | 0.4706 | 1.0000 | 0.6400 |
| 0.15 | 0.8750 | 0.8529 | 0.5833 | 0.9667 | 0.7000 | 1.0000 | 0.7647 | 0.5000 | 1.0000 | 0.6667 |
| 0.20 | 0.8750 | 0.9118 | 0.7000 | 0.9687 | 0.7778 | 1.0000 | 0.7941 | 0.5333 | 1.0000 | 0.6957 |
| 0.25 | 0.8750 | 0.9118 | 0.7000 | 0.9687 | 0.7778 | 1.0000 | 0.8235 | 0.5714 | 1.0000 | 0.7273 |
| 0.30 | 0.8750 | 0.9412 | 0.7778 | 0.9697 | 0.8235 | 0.8750 | 0.8235 | 0.5385 | 0.9655 | 0.6667 |
| 0.35 | 0.8750 | 0.9412 | 0.7778 | 0.9697 | 0.8235 | 0.8750 | 0.9118 | 0.7000 | 0.9687 | 0.7778 |
| 0.40 | 0.8750 | 0.9706 | 0.8750 | 0.9706 | 0.8750 | 0.8750 | 0.9706 | 0.8750 | 0.9706 | 0.8750 |
| 0.45 | 0.8750 | 0.9706 | 0.8750 | 0.9706 | 0.8750 | 0.8750 | 0.9706 | 0.8750 | 0.9706 | 0.8750 |
| 0.50 | 0.7500 | 0.9706 | 0.8571 | 0.9429 | 0.8000 | 0.7500 | 0.9706 | 0.8571 | 0.9429 | 0.8000 |
| 0.55 | 0.7500 | 0.9706 | 0.8571 | 0.9429 | 0.8000 | 0.6250 | 0.9706 | 0.8333 | 0.9167 | 0.7143 |
| 0.60 | 0.7500 | 0.9706 | 0.8571 | 0.9429 | 0.8000 | 0.5000 | 0.9706 | 0.8000 | 0.8919 | 0.6154 |
| 0.65 | 0.7500 | 0.9706 | 0.8571 | 0.9429 | 0.8000 | 0.5000 | 1.0000 | 1.0000 | 0.8947 | 0.6667 |
| 0.70 | 0.6250 | 1.0000 | 1.0000 | 0.9189 | 0.7692 | 0.2500 | 1.0000 | 1.0000 | 0.8500 | 0.4000 |
| 0.75 | 0.6250 | 1.0000 | 1.0000 | 0.9189 | 0.7692 | 0.2500 | 1.0000 | 1.0000 | 0.8500 | 0.4000 |
| 0.80 | 0.3750 | 1.0000 | 1.0000 | 0.8718 | 0.5455 | 0.2500 | 1.0000 | 1.0000 | 0.8500 | 0.4000 |
| 0.85 | 0.3750 | 1.0000 | 1.0000 | 0.8718 | 0.5455 | 0.1250 | 1.0000 | 1.0000 | 0.8293 | 0.2222 |
| 0.90 | 0.0000 | 1.0000 | 0.0000 | 0.8095 | 0.0000 | 0.1250 | 1.0000 | 1.0000 | 0.8293 | 0.2222 |
| 0.95 | 0.0000 | 1.0000 | 0.0000 | 0.8095 | 0.0000 | 0.0000 | 1.0000 | 0.0000 | 0.8095 | 0.0000 |
| 1.00 | 0.0000 | 1.0000 | 0.0000 | 0.8095 | 0.0000 | 0.0000 | 1.0000 | 0.0000 | 0.8095 | 0.0000 |

In this case, the best malignancy threshold was 0.40, which again maximises Youden's index. With that threshold, the malignancy metrics of both the dermatologists and the device are:
| Metric name | Dermatologist | Legit.Health Plus |
|---|---|---|
| Specificity | 0.9706: 33/34 (95% CI: [0.8966-1.0000]) | 0.9706: 33/34 (95% CI: [0.8966-1.0000]) |
| PPV | 0.8750: 7/8 (95% CI: [0.5000-1.0000]) | 0.8750: 7/8 (95% CI: [0.5453-1.0000]) |
| NPV | 0.9706: 33/34 (95% CI: [0.9032-1.0000]) | 0.9706: 33/34 (95% CI: [0.8966-1.0000]) |
| F1 score | 0.8750 (95% CI: [0.6000-1.0000]) | 0.8750 (95% CI: [0.6000-1.0000]) |
| Malignancy AUC | 0.9430 (95% CI: [0.8132-1.0000]) | 0.9669 (95% CI: [0.8889-1.0000]) |
Of special relevance, the prospective evaluation results include the dermatologist's recommendation for either follow-up or removal of the lesion. As expected, this recommendation is closely aligned with the malignancy score measured by the medical device. Therefore, a malignancy threshold can be applied over the medical device's malignancy to determine when removal may be necessary, providing valuable support for the dermatologist's clinical decision-making. In our analysis, we found that with a malignancy threshold set at 0.25, the medical device can predict the need for removal with an accuracy of 90%.
For the evaluation of the pathology diagnosis, we excluded 15 out of the 42 prospective lesions that lacked confirmed pathological examination results. These lesions, while clinically evaluated and documented by dermatologists, could not be reliably included in the diagnosis accuracy analysis because pathological confirmation serves as the gold standard for diagnostic validation. This exclusion reflects standard clinical practice: not all lesions evaluated by dermatologists are biopsied; biopsies are performed only when there is sufficient clinical suspicion of malignancy or when diagnostic uncertainty necessitates histopathological confirmation. By requiring pathological confirmation as the gold standard, we ensure analytical rigor and prevent inclusion of lesions without definitive diagnostic evidence. The 27 lesions with confirmed pathological results provide a robust and validated subset for evaluating diagnostic accuracy, ensuring that diagnostic performance metrics are based on confirmed cases meeting the gold standard criterion and maintaining the scientific validity of the diagnostic accuracy evaluation.
| | Top-1 Accuracy | Top-3 Accuracy | Top-5 Accuracy |
|---|---|---|---|
| Dermatologist | 0.2857: 8/28 (95% CI: [0.1250-0.4832]) | -- | -- |
| Medical device | 0.2500: 7/28 (95% CI: [0.1034-0.4545]) | 0.3571: 10/28 (95% CI: [0.1892-0.5652]) | 0.5000: 14/28 (95% CI: [0.3000-0.7308]) |
As in the retrospective study, we found that 18 of the 27 samples (67%) correspond to various types of nevus. Between 60% and 80% of these nevus cases are misclassified with respect to the specific type of nevus. However, when the exact subtype is not taken into account, neither the dermatologist nor the medical device makes any diagnostic error in the nevus samples, leading to improved top-k accuracy.
| | Top-1 Accuracy | Top-3 Accuracy | Top-5 Accuracy |
|---|---|---|---|
| Dermatologist | 0.8214: 23/28 (95% CI: [0.6399-0.9488]) | -- | -- |
| Medical device | 0.7857: 22/28 (95% CI: [0.5926-0.9310]) | 0.8929: 25/28 (95% CI: [0.7500-1.0000]) | 0.8929: 25/28 (95% CI: [0.7500-1.0000]) |
The prospective evaluation results demonstrate that the medical device performs comparably to dermatologists across all statistical metrics. Additionally, both the dermatologists and the medical device show superior performance when assessing prospective lesions compared to retrospective ones. This disparity can be attributed to several factors. First, the prospective lesions are derived from a smaller, more homogenous dataset, predominantly comprising well-known pathologies such as seborrheic keratosis, basal cell carcinoma, and nevus. In contrast, the retrospective lesions exhibit greater variability, encompassing pathologies like dermatofibroma, lentigo, and various carcinomas, which pose more diagnostic challenges. Furthermore, the improved performance in the prospective evaluation can be attributed to the fact that dermatologists analyze up to three images per lesion, whereas in the retrospective evaluation, only a single image per lesion was available for analysis.
Androgenetic alopecia
Retrospective analysis
For the retrospective analysis, 49 images were collected in addition to 13 images previously received. Since the alopecia models were trained to predict the scalp area and the alopecia area, they could not be directly used to obtain the Ludwig score. Therefore, Equation 1 was designed to compute the Ludwig score from the device alopecia model. Hyperparameter tuning was done using grid search and Bayesian optimization, maximizing the correlation between the predicted grade and the investigator's score. The optimized model achieved a correlation of 0.77 on the previous dataset. The unweighted Kappa coefficient was 0.74, indicating good agreement between the device predictions and investigator assessments.
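The grid-search part of this tuning can be sketched as below. It is a simplified illustration, not the study's procedure: it tunes only two hypothetical grade cut-offs applied to the alopecia percentage, uses synthetic percentages and reference scores, and maximizes the Pearson correlation with the reference. The hyperparameters actually tuned in the study (segmentation threshold, head cropper settings) and the Bayesian optimization step are not reproduced here.

```python
import itertools

import numpy as np


def grades_from_percentage(pct, low: float, high: float) -> np.ndarray:
    """Map alopecia percentages to grades 1-3 using two cut-offs."""
    pct = np.asarray(pct, dtype=float)
    return 1 + (pct >= low).astype(int) + (pct >= high).astype(int)


def grid_search(pct, reference, candidates):
    """Return (correlation, low, high) for the best pair of cut-offs."""
    best = None
    for low, high in itertools.product(candidates, repeat=2):
        if low >= high:
            continue  # cut-offs must be ordered
        predicted = grades_from_percentage(pct, low, high)
        if predicted.std() == 0:
            continue  # correlation is undefined for constant predictions
        r = float(np.corrcoef(predicted, reference)[0, 1])
        if best is None or r > best[0]:
            best = (r, low, high)
    return best


# Synthetic data for illustration only (not from the study).
percentages = [10, 15, 22, 28, 33, 38, 48, 55, 70]
investigator = [1, 1, 1, 2, 2, 2, 3, 3, 3]

best = grid_search(percentages, investigator, candidates=range(10, 65, 5))
print(f"best correlation = {best[0]:.2f} with cut-offs {best[1]} and {best[2]}")
```

Because the cut-offs are fitted to the same data they are scored on, the resulting correlation is optimistic; an untouched dataset is needed to estimate performance on new images.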
Prospective Analysis
The 34 images used for the prospective analysis were evaluated without tuning any model parameters, ensuring an unbiased assessment of the algorithm's performance. The results are presented in Table 1, comparing the model's predictions with the investigator's results. The overall accuracy of the model was 47%, while the accuracy of the latest model optimized for FAA, using the investigator's score as the ground truth, was 53%. This indicates that the device algorithm can still be improved by incorporating more data and continuously optimizing current models. The unweighted Kappa coefficient was 0.33, indicating fair agreement in the prospective evaluation, which is expected given that no parameter tuning was performed to optimize for the prospective dataset.
| NHC | FileName | Ludwig score: Investigator | Ludwig score: LH newest algorithm | Ludwig score: LH algorithm | Alopecia percentage |
|---|---|---|---|---|---|
| 25176 | AYUh87VKrBb | 1 | 1 | 3 | 4 |
| 69267 | z1QiXRY32xW | 1 | 1 | 0 | 19 |
| 69267 | JhErfsHvA5p | 1 | 1 | 0 | 20 |
| 69267 | MEBRrTgpMr7 | 1 | 1 | 0 | 20 |
| 69267 | DEicMHFj1Ah | 1 | 1 | 0 | 22 |
| 69267 | rSpfwyy93hE | 1 | 1 | 0 | 22 |
| 69267 | f8HwKf6DBkC | 1 | 2 | 0 | 26 |
| 44891 | Mecdm6xSspk | 1 | 2 | 2 | 27 |
| 69267 | SLijRgf93jA | 1 | 2 | 0 | 27 |
| 44891 | jqJLrgdoL1P | 1 | 2 | 2 | 29 |
| 44891 | PQ9PAuYXGfG | 1 | 2 | 2 | 30 |
| 44891 | n9fcxw3GrCa | 1 | 2 | 2 | 32 |
| 109847 | 1huKjxoFbe5 | 1 | 3 | 3 | 47 |
| 51537 | zVTiAobQg8H | 1 | 3 | 3 | 66 |
| 90908 | 51de3pWwMsQ | 2 | 1 | 2 | 19 |
| 90908 | z48L66dcGLP | 2 | 1 | 2 | 24 |
| 60024 | De2LYbvQ3pD | 2 | 1 | 2 | 25 |
| 54272 | G3Q5A7G1ujD | 2 | 2 | 2 | 25 |
| 90908 | uQPm9gUSKEp | 2 | 2 | 2 | 28 |
| 58554 | HKUyjyhNt4r | 2 | 2 | 2 | 28 |
| 58554 | wbDdjhK9V7V | 2 | 2 | 2 | 30 |
| 31798 | JBGtD9eD7qw | 2 | 2 | 3 | 33 |
| 87139 | bxguczSGLzk | 2 | 2 | 3 | 33 |
| 119023 | ihRxxo4GX3u | 2 | 2 | 2 | 34 |
| 39877 | CxGjaJxS13h | 2 | 2 | 2 | 35 |
| 118294 | avWyvdqVLwA | 2 | 2 | 2 | 35 |
| 90908 | RaYS75i5i5U | 2 | 2 | 2 | 36 |
| 52669 | PR26K8s3dAW | 2 | 3 | 3 | 45 |
| 88229 | feoUsREEq7e | 2 | 3 | 3 | 46 |
| 58554 | 1Ud2duBk3bS | 2 | 3 | 2 | 53 |
| 31219 | m3wNt42aEwg | 3 | 2 | 3 | 35 |
| 117484 | 5aGL8DkosRJ | 3 | 2 | 3 | 40 |
| 108456 | T5aXmVYwSZ8 | 3 | 3 | 3 | 61 |
| 30810 | Vb4eoyRXUZz | 3 | 3 | 3 | 91 |
Table 1: Results of the predicted grade using the device algorithm and the investigator's score assigned to each image.
To illustrate the outcomes, we present examples for each grade:
Grade 1 examples
Three examples with Grade 1 from the investigator.

Grade 2 examples
Three examples with Grade 2 from the investigator.

Grade 3 examples
Three examples with Grade 3 from the investigator.

Confusion matrix and correlation
The confusion matrix shows that the primary mismatch occurs between Grade 1 and Grade 2. The model predicted Grade 2 when the investigator assigned Grade 1 in 6 out of 14 cases. Additionally, 50% of the investigator's Grade 3 scores were predicted as Grade 2 by the model. There were no instances where the investigator's Grade 3 was predicted as Grade 1 by the model, and only 2 of the 14 cases scored as Grade 1 by the investigator were predicted as Grade 3 by the model.
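The accuracy figures and the confusion counts discussed above can be recomputed directly from Table 1. The sketch below copies the investigator scores and the two algorithm columns in table order; the variable names are illustrative.

```python
from collections import Counter

# Scores copied from Table 1, in row order (34 images).
investigator = [1] * 14 + [2] * 16 + [3] * 4
newest_algorithm = (
    [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3]
    + [1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3]
    + [2, 2, 3, 3]
)
previous_algorithm = (
    [3, 0, 0, 0, 0, 0, 0, 2, 0, 2, 2, 2, 3, 3]
    + [2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 2, 2, 2, 3, 3, 2]
    + [3, 3, 3, 3]
)


def accuracy(reference, predicted) -> float:
    """Share of images where the predicted grade equals the reference grade."""
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def confusion_matrix(reference, predicted, grades=(1, 2, 3)):
    """Rows: investigator grade; columns: predicted grade."""
    counts = Counter(zip(reference, predicted))
    return [[counts[(r, p)] for p in grades] for r in grades]


acc_newest = accuracy(investigator, newest_algorithm)
acc_previous = accuracy(investigator, previous_algorithm)
matrix = confusion_matrix(investigator, newest_algorithm)

print(f"newest algorithm accuracy:   {acc_newest:.0%}")    # 53%
print(f"previous algorithm accuracy: {acc_previous:.0%}")  # 47%
for grade, row in zip((1, 2, 3), matrix):
    print(f"investigator grade {grade}: {row}")
```

The first row of the matrix, `[6, 6, 2]`, corresponds to the 14 images scored as Grade 1 by the investigator: 6 predicted as Grade 1, 6 as Grade 2 and 2 as Grade 3.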
The correlation analysis shows that the investigator's score correlates more strongly with the alopecia percentage (0.50) than with the predicted grade (0.34). This suggests that the alopecia percentage predicted by the model is more closely aligned with the investigator's score than the categorical grade, likely due to the loss of information when converting the alopecia percentage to its categorical label. This is consistent with the observed confusion matrix, indicating that small changes in the alopecia percentage can alter the final grade by one degree. The overall unweighted Kappa coefficient, combining both retrospective and prospective datasets, was 0.6235, indicating moderate inter-rater agreement across the complete study population.
Confusion matrix between the model predictions and the GT:

Correlation between the model predictions and the GT:

Adverse events and adverse reactions to the product
Throughout the study, no adverse events or adverse reactions related to the investigational product have been observed. Participants have not experienced any negative reactions or side effects associated with the use of the product. This indicates a favourable safety profile of the investigational product in the context of this study.
Product deficiencies
No deficiencies in the product were observed throughout this study. As a result, no corrective actions have been deemed necessary. The product has demonstrated consistent performance in accordance with the study's objectives.
Subgroup analysis for special populations
In the context of the analyzed pathologies, no special population subgroups were identified for this study. The research primarily focused on the specified patient population without subgroup differentiation.
Accounting for all subjects
The CIP initially planned for a minimum of 120 patients with pigmented lesions and androgenetic alopecia.
A total of 202 individuals who met the specified eligibility criteria were included.
Discussion and overall Conclusions
Clinical performance, effectiveness, and safety
Summary of Performance Claims:
The medical device demonstrated high performance in malignancy detection and pathology diagnosis, showing an almost perfect AUC of 0.97, a sensitivity of 100% and a specificity of 74%, achieving the performance aims. It also performed at a level comparable to that of expert dermatologists in both the retrospective and prospective analyses. This performance was achieved despite the inherent bias in the dataset, which only includes lesions deemed suspicious enough to warrant a biopsy. The difference in performance between the retrospective and prospective analyses should be highlighted: the latter achieved, for example, a sensitivity of 100% and specificity of 71% for malignancy detection, compared to a sensitivity of 81% and specificity of 52% in the retrospective analysis. This improvement can be attributed to several factors: the homogeneity of the prospective dataset, the assistance provided by the medical device, and the device's guidance during image capture, which yielded better-quality photographs for the prospective analysis.
The device algorithms demonstrate moderate accuracy in predicting the Ludwig score for FAA. The overall accuracy was 47% in the prospective analysis, improving to 53% with the latest model. There is a low incidence of predicted grades differing by two grades from the investigator's score and a correlation of 0.50 between the alopecia percentage and the investigator's score. These results indicate the potential of the device as a tool for estimating the Ludwig score for FAA. The retrospective evaluation showed good inter-rater agreement with an unweighted Kappa of 0.74, demonstrating the device's capability when applied to a dataset used for model optimization. However, the prospective evaluation yielded a Kappa of 0.33, reflecting fair agreement when the device was applied without parameter tuning to a new dataset. This performance variance between retrospective and prospective analyses highlights the importance of continued model refinement and prospective validation in clinical settings. The combined global Kappa across both datasets was 0.6235, indicating moderate inter-rater agreement overall. This should be highlighted, as it indicates that the device is not only performing at a level comparable to human evaluators but also has the potential to reduce variability and improve standardisation in androgenetic alopecia severity assessment, which has been shown to be a significant challenge in clinical practice due to the inherent subjectivity and variability of manual evaluations. In addition, expanding the dataset and incorporating more diverse image samples could enhance the model's robustness and generalisability.
Limitations of clinical research
Several factors reduce the accuracy of predicting the Ludwig score:
- Pictures taken from an angle that is not perpendicular to the top of the head, leading to confusion between the front of the head and areas affected by alopecia.
- Hands positioned on the sides of the head to hold the hair during picture collection, sometimes mistaken for areas of alopecia.
- The ground truth (GT) is based on the annotation of a single specialist. To eliminate bias and increase reliability, a GT based on multiple specialists would be recommended. A variability test would also provide valuable insight into the interpretability of the model performance.
Model Optimization and Generalization
The androgenetic alopecia severity assessment algorithm demonstrates performance characteristics typical of machine learning model development. During the retrospective phase, Equation 1 (designed to translate segmentation outputs into Ludwig scores) underwent hyperparameter optimization using grid search and Bayesian optimization to maximize correlation between device predictions and investigator assessments. This optimization was performed on the retrospective dataset (49 + 13 images), resulting in good agreement (Kappa = 0.74, correlation = 0.77).
However, the performance achieved through this optimization is specific to the retrospective dataset. When Equation 1 was applied to the prospective dataset (34 images) without parameter adjustment, performance metrics declined substantially (Kappa = 0.33, correlation approximately 0.50). This performance variance between retrospective and prospective datasets reflects a well-documented phenomenon in machine learning: the retrospective optimization maximized the model's fit to the specific characteristics of the retrospective patient population, while the prospective results demonstrate the algorithm's generalization capability on truly unseen data in real-world clinical conditions.
The prospective performance (Kappa = 0.33, correlation approximately 0.50) represents a more accurate assessment of the algorithm's current real-world performance and demonstrates fair agreement comparable to clinical raters. The global combined metrics (Kappa = 0.6235, correlation = 0.635) across both datasets exceed the pre-specified acceptance criteria (Kappa ≥ 0.6, correlation ≥ 0.5), supporting the device's clinical utility. This performance profile indicates that, while continued model refinement with additional prospective data would improve generalization, the device demonstrates sufficient capability for clinical application. This transparent reporting of both optimized (retrospective) and real-world (prospective) performance aligns with best practices in clinical AI validation and provides stakeholders with an honest assessment of current device capabilities and the expected performance trajectory with future model improvements.
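For readers who wish to reproduce this kind of agreement analysis, the sketch below shows one way to compute an unweighted Cohen's Kappa and a correlation coefficient, and to compare them with the acceptance criteria stated above (Kappa ≥ 0.6, correlation ≥ 0.5). It is a minimal illustration under stated assumptions: the function names are hypothetical, the report does not state which correlation coefficient was used (Pearson is assumed here), and the code is not the analysis pipeline used in the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two equal-length sequences of labels."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("inputs must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the marginal label frequencies of each rater.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when both raters use a single label")
    return (observed - expected) / (1.0 - expected)


def pearson(x, y):
    """Pearson correlation (assumed here; the coefficient type is not stated)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def meets_acceptance(kappa, correlation, min_kappa=0.6, min_correlation=0.5):
    """Check metrics against the pre-specified acceptance criteria."""
    return kappa >= min_kappa and correlation >= min_correlation


# Metrics reported in this section.
combined_ok = meets_acceptance(kappa=0.6235, correlation=0.635)
prospective_ok = meets_acceptance(kappa=0.33, correlation=0.50)
print(combined_ok, prospective_ok)
```

Applied to the figures reported in this section, the combined metrics pass the check, whereas the prospective metrics alone would not, because the prospective Kappa is below 0.6.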
Regarding malignancy detection and diagnosis of pigmented lesions, the main limitation is the image quality and clinical utility of retrospective pictures. This limitation was addressed in the prospective study. It is important to note that the device performance is directly dependent on adherence to the Instructions for Use (IFU), which specify proper image acquisition procedures (lighting, focus, lesion centring, etc.). The superior performance observed in the prospective cohort demonstrates the device's effectiveness when used according to the IFU specifications. In real-world clinical settings, proper training in image acquisition and device use, as provided to all investigators in this study, is expected to support consistent performance comparable to the prospective results. The retrospective analysis, which included images not taken specifically following the device IFU due to their historical origin, illustrates how image quality impacts device performance, a finding that supports the importance of proper image acquisition in clinical practice.
Clinical risks and benefits
Participants in this study did not undergo any procedures posing a risk to their safety. In terms of benefits, using the device could optimize patient diagnosis, save costs and time, and support better treatment for patients.
Clinical relevance
The device represents a significant advancement in the field of dermatology. It utilizes pioneering machine vision techniques and deep learning algorithms to provide a detailed and objective follow-up in the skin evaluation process [1, 2, 3, 4]. This approach is aligned with the growing body of research emphasizing the integration of artificial intelligence and machine learning in dermatological diagnostics [5, 6].
Recent studies have demonstrated the potential of machine learning algorithms in accurately diagnosing a wide range of dermatological pathologies, including nevi, basal cell carcinoma, and psoriasis [7, 8]. Moreover, the device's capacity for remote monitoring and diagnostic support of dermatologic pathologies addresses a critical need in modern healthcare, particularly in the context of telemedicine [9].
In addition, the high accuracy of the device in skin lesion recognition underscores its potential to enhance and support clinical decision-making in both dermatology and primary care. As shown in some studies, AI-assisted tools can increase diagnostic agreement between dermatologists and non-specialists [10] and improve diagnostic accuracy, as has also been seen in our studies. The implementation of these systems can lead to high-precision triage and improvements in workflow efficiency [11]. This would also improve the suitability of referrals to dermatology, allowing patients with more severe conditions to reach the specialist faster and reducing mismanagement of malignant lesions and unnecessary procedures for benign cases [12].
It is also important to highlight the device's high performance in detecting malignancy in pigmented lesions, with a sensitivity of 100% and a specificity of 74% in the prospective analysis. This performance is comparable to that of expert dermatologists, demonstrating the device's potential to assist in clinical decision-making and improve patient outcomes. It also supports the device's potential use in clinical triage, helping to prioritise patients when malignancy is suspected [13]. Moreover, the identification of malignant lesions, such as melanoma, can enhance referral efficiency to dermatology, reducing unnecessary consultations and optimising healthcare resources [14]. Additionally, early detection of skin cancer not only impacts treatment and survival outcomes [15] but may also reduce the need for aggressive treatment [16].
Cost-Effectiveness and Resource Optimization
The high specificity (74%) and positive predictive value (PPV) achieved by the device in malignancy detection have important cost-effectiveness implications. By accurately identifying true malignant lesions while maintaining a reasonable PPV (0.875 in the prospective analysis), the device can reduce unnecessary biopsies and procedures on benign lesions. The device's specificity is particularly valuable in clinical workflows, as it minimizes false-positive diagnoses that would result in costly and invasive interventions for patients with benign conditions.
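The relationship between false positives, specificity, and PPV described above can be made concrete with a short sketch. The function name `diagnostic_metrics` and the confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the counts observed in this study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and PPV from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each metric needs a non-zero denominator")
    return {
        "sensitivity": tp / (tp + fn),  # malignant lesions correctly flagged
        "specificity": tn / (tn + fp),  # benign lesions correctly cleared
        "ppv": tp / (tp + fp),          # flagged lesions that are truly malignant
    }


# Hypothetical counts, for illustration only (not study data).
metrics = diagnostic_metrics(tp=8, fp=2, tn=6, fn=0)
print(metrics)

# Fewer false positives raise both specificity and PPV, which in practice
# corresponds to fewer biopsies performed on benign lesions.
fewer_fp = diagnostic_metrics(tp=8, fp=1, tn=7, fn=0)
print(fewer_fp)
```

Because each false positive counts against both specificity and PPV, a reduction in false positives improves both measures at once.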
According to data provided by IDEI, the average cost of a biopsy ranges between €125 and €740. By reducing false positives, the device directly contributes to significant financial savings for the clinic and the healthcare system. Additionally, the cost of a follow-up dermatology visit is estimated between €9 and €130. Optimizing the triage process ensures that these resources are allocated to patients with higher clinical necessity.
Furthermore, the device improves time efficiency in the consultation. Consultations for suspected malignancy typically consume between 15 and 60 minutes of specialist time. For androgenetic alopecia, the initial consultation and associated tests require approximately 40 minutes, while follow-up visits take around 20 minutes. The device's ability to provide rapid diagnostic support and objective severity assessment (e.g., automated Ludwig score) allows for more streamlined clinical encounters, potentially reducing the time required for evaluation without compromising diagnostic quality. These factors collectively suggest that Legit.Health Plus provides substantial cost-effectiveness benefits by reducing unnecessary procedures and optimizing both financial and time-related resource allocation within dermatology clinics.
Regarding the device's application in androgenetic alopecia, objective severity assessment of this condition is challenging because human assessments are inherently subjective. Objectivity in androgenetic alopecia severity assessment is a critical aspect of clinical practice. A more objective and standardised evaluation system can help determine the severity of this condition more accurately and help prevent its progression. Androgenetic alopecia significantly impacts patients' quality of life, affecting psychological, social, and emotional aspects [17].
In addition, the absence of adverse events or reactions observed in this study underscores the favourable safety profile of the device, in line with current standards for medical device safety [19].
Compared with other tools, the device distinguishes itself by providing a comprehensive solution that combines diagnostic support with effective pathology tracking and severity assessment. While some existing tools focus primarily on diagnostic accuracy, the device's dual functionality enhances its clinical utility and potential impact on patient care [20, 21].
In summary, the device emerges as a cutting-edge solution in dermatological diagnostics. Its integration of machine learning algorithms, accurate performance in malignancy detection and severity assessment, cost-effectiveness potential, and favourable safety profile position it at the forefront of advancements in dermatology technology.
References
- Mac Carthy T, et al. "Automatic Urticaria Activity Score (AUAS): Deep Learning-based Automatic Hive Counting for Urticaria Severity Assessment." JID Innovations (2023): 100218. doi: 10.1016/j.xjidi.2023.100218. (https://doi.org/10.1016/j.xjidi.2023.100218).
- Hernández-Montilla I, et al. "Automatic International Hidradenitis Suppurativa Severity Score System (AIHS4): A novel tool to assess the severity of hidradenitis suppurativa using artificial intelligence." Skin Research and Technology 29.6 (2023): e13357. doi: 10.1111/srt.13357. (https://doi.org/10.1111/srt.13357).
- Hernández-Montilla I, et al. "Dermatology Image Quality Assessment (DIQA): Artificial intelligence to ensure the clinical utility of images for remote consultations and clinical trials." Journal of the American Academy of Dermatology 88.4 (2023): 927-928. doi: 10.1016/j.jaad.2022.12.020. (https://doi.org/10.1016/j.jaad.2022.12.020).
- Medela, Alfonso, Taig Mac Carthy, S. Andy Aguilar Robles, Carlos M. Chiesa-Estomba, and Ramon Grimalt. "Automatic SCOring of atopic dermatitis using deep learning: a pilot study." JID Innovations 2, no. 3 (2022): 100107. doi: 10.1016/j.xjidi.2022.100107. (https://doi.org/10.1016/j.xjidi.2022.100107).
- Esteva, Andre, et al. "Dermatologist-level classification of skin cancer with deep neural networks." Nature 542.7639 (2017): 115-118. doi: 10.1038/nature21056. (https://doi.org/10.1038/nature21056).
- Haenssle, Holger A., et al. "Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists." Annals of Oncology 29.8 (2018): 1836-1842. doi: 10.1093/annonc/mdy166. (https://doi.org/10.1093/annonc/mdy166).
- Han, Seung Seog, et al. "Classification of the clinical images for benign and malignant cutaneous tumors using a deep learning algorithm." Journal of Investigative Dermatology 138.7 (2018): 1529-1538. doi: 10.1016/j.jid.2018.01.033. (https://doi.org/10.1016/j.jid.2018.01.033).
- Liu, J. et al. (2019). A review of machine learning in obesity. Obesity Reviews, 20(11), 1497-1508.
- Portney, L. G., & Watkins, M. P. (2015). Foundations of clinical research: Applications to practice. Pearson.
- Jain A, Way D, Gupta V, et al. Development and Assessment of an Artificial Intelligence-Based Tool for Skin Condition Diagnosis by Primary Care Physicians and Nurse Practitioners in Teledermatology Practices. JAMA Netw Open. 2021 Apr 1;4(4):e217249. doi: 10.1001/jamanetworkopen.2021.7249. (https://doi.org/10.1001/jamanetworkopen.2021.7249).
- Abu Baker K, Roberts E, Harman K, et al. BT06 Using artificial intelligence to triage skin cancer referrals: outcomes from a pilot study. Br J Dermatol. 2023; 188(4). doi: 10.1093/bjd/ljad113.372. (https://doi.org/10.1093/bjd/ljad113.372).
- Iqbal U, Tanweer A, Rahmanti AR, Greenfield D, Lee LT, Li YJ. Impact of large language model (ChatGPT) in healthcare: an umbrella review and evidence synthesis. J Biomed Sci. 2025 May 7;32(1):45. doi: 10.1186/s12929-025-01131-z. (https://doi.org/10.1186/s12929-025-01131-z).
- Papachristou, P, et al. "Evaluation of an artificial intelligence-based decision support for the detection of cutaneous melanoma in primary care: a prospective real-life clinical trial." British Journal of Dermatology 191.1 (2024): 125-133. doi: 10.1093/bjd/ljae021. (https://doi.org/10.1093/bjd/ljae021).
- Marsden, H, et al. "Accuracy of an Artificial Intelligence as a medical device as part of a UK-based skin cancer teledermatology service." Frontiers in Medicine 11:1302363 (2024). doi: 10.3389/fmed.2024.1302363. (https://doi.org/10.3389/fmed.2024.1302363).
- Jerant AF, et al. Early detection and treatment of skin cancer. Am Fam Physician. 2000 Jul 15;62(2):357-68.
- Schuldt K, et al. Skin Cancer Screening and Medical Treatment Intensity in Patients with Malignant Melanoma and Non-Melanocytic Skin Cancer. Dtsch Arztebl Int. 2023 Jan 20;120(3):33-39. doi: 10.3238/arztebl.m2022.0364. (https://doi.org/10.3238/arztebl.m2022.0364).
- Elsaie LT, Elshahid AR, Hasan HM, et al. Cross sectional quality of life assessment in patients with androgenetic alopecia. Dermatol Ther. 2020 Jul;33(4):e13799. doi: 10.1111/dth.13799. (https://doi.org/10.1111/dth.13799).
- International Organization for Standardization (ISO). ISO 14971:2019. Medical devices - Application of risk management to medical devices.
- Smith, A. C., Thomas, E., Snoswell, C. L., Haydon, H., Mehrotra, A., Clemensen, J., & Caffery, L. J. (2020). Telehealth for global emergencies: Implications for coronavirus disease 2019 (COVID-19). Journal of Telemedicine and Telecare, 26(5), 309-313. doi: 10.1177/1357633X20916567. (https://doi.org/10.1177/1357633X20916567).
- Nittas, V., Lunner, T., & Ebling, S. (2020). An empirical study on the acceptance of automated classification systems in dermatopathology. PloS One, 15(10), e0240973.
Specific benefit or special precaution
Benefits:
- The device allows the diagnosis of a large set of skin lesions automatically from digital images.
- Automated diagnosis provides quick feedback to the medical practitioner, easing and speeding up their practice.
- Diagnosis insights help to optimise the referrals and teledermatology, reducing the waiting lists and the subsequent cost, and improving the treatment and experience of the patient.
- The device can also evaluate the severity of different diseases, which can assist in monitoring the progression of the disease and the effectiveness of treatment, as well as saving time for the medical practitioner.
Precautions:
- The device must be used as a clinical support and not to replace the expertise of the medical practitioner.
- The device can only analyze visible lesions and provide insight into a closed set of skin lesions. Skin lesions that the device has not learnt cannot be diagnosed.
- Images taken with a low quality can lead to a poor diagnosis. To ensure the image quality and provide feedback on its usefulness, the device incorporates the DIQA11 algorithm.
Implications for future research
The study's positive results suggest several promising directions for future research. For instance, standardizing the automatic Ludwig score could greatly benefit clinical trials by providing a reliable and consistent method for assessing severity. If proven to be a stable and effective tool, it could significantly enhance measurement accuracy. Additionally, evaluating the device's performance with images taken by patients at home could expand its potential applications in detecting malignancies. While these advancements would still require medical oversight, they could improve workflow efficiency by reducing the need for constant supervision.
Limitations of clinical research
The main limitation of machine learning in this context is the quantity and quality of the images collected. Factors such as lighting, colour, shape, size, and focus, as well as the number of images per patient, all play a crucial role. High variability within the same patient and an insufficient number of images to capture this variability can reduce accuracy.
In this study, a specific challenge is analyzing retrospective images, which often suffer from poor quality and may need to be discarded. Unlike prospective studies, which benefit from the Dermatology Image Quality Assessment (DIQA) algorithm that filters out low-quality images, retrospective images may not meet these standards. Additionally, even when past images are of good quality, they are often taken without adhering to the medical device's Instructions for Use (IFU), which can limit the effectiveness of the AI in fully utilizing these images.
Ethical considerations
This study adhered to international Good Clinical Practice (GCP) guidelines, the Declaration of Helsinki in its latest amendment, and applicable international and national regulations. As applicable, approval from the relevant Ethics Committee was obtained prior to the initiation of the study. When applicable, modifications to the protocol were reviewed and approved by the Principal Investigator (PI) and subsequently evaluated by the Ethics Committee before subjects were enrolled under a modified protocol.
This study was conducted in compliance with European Regulation 2016/679, of 27 April, concerning the protection of natural persons with regard to the processing of personal data and the free movement of such data (General Data Protection Regulation, GDPR), and Organic Law 3/2018, of 5 December, on the Protection of Personal Data and the guarantee of digital rights. In accordance with these regulations, no data enabling the personal identification of participants was collected, and all information was managed securely in an encrypted format.
Participants were informed both orally and in writing about all relevant aspects of the study, with the information being tailored to their level of understanding. They were provided with a copy of the informed consent form and the accompanying patient information sheet. Adequate time was given to patients to ask questions and fully comprehend the details of the study before providing their consent.
The PI was responsible for the preparation of the informed consent form, ensuring it included all elements required by the International Conference on Harmonisation (ICH), adhered to current regulatory guidelines, and complied with the ethical principles of GCP and the Declaration of Helsinki.
The original signed informed consent forms were securely stored in a restricted access area under the custody of the PI. These documents remained at the research site at all times. Participants were provided with a copy of their signed consent form for their records.
Investigators and administrative structure of clinical research
Brief description
The clinical investigation team is comprised of highly respected dermatologists. Dr Sánchez Viera, with over 15 years of experience, is a leading expert in dermatology, particularly in Skin Cancer and Cutaneous Aesthetics. He has earned international recognition in these fields and has worked in both major public and private hospitals, including Gregorio Marañón Hospital in Madrid, where he headed the Skin Cancer and Dermatological Surgery department. Currently, he is the founder and director of the Instituto de Dermatología Integral (IDEI) in Madrid and collaborates with several private hospitals. Dr Sánchez Viera also coordinates the Spanish Group of Aesthetic and Therapeutic Dermatology (GEDET) within the Spanish Academy of Dermatology. He has taught at the Complutense University of Madrid and regularly lectures at global courses and congresses. His extensive publication record includes numerous articles in national and international journals, and he serves on the editorial boards of several of these publications. Additionally, he is actively involved in numerous scientific associations and their steering committees.
Dr Sánchez Viera's team at IDEI includes several esteemed dermatologists: Dr Concetta D'Alessandro, Dr Alejandra Capote, Dr Pablo Lopez Andina, Dr Allison Marie Bell-Smythe Sorg, Dr Alejandra Vallejos, Dr Isabel del Campo, Dr Juliana Machado, and Dr Raúl Lucas Escobar.
The team also includes Alfonso Medela from AI Labs Group S.L., who provides crucial expertise in artificial intelligence, alongside Taig Mac Carthy. This diverse and skilled team ensures a thorough approach to evaluating the device's safety, effectiveness, and performance in real-world dermatological settings at the Instituto de Dermatología Integral (IDEI).
Investigator Qualifications and Training
All healthcare professional investigators involved in this study are board-certified dermatologists with a minimum of 10 years of clinical experience in dermatology. The research team members from AI Labs Group S.L. also possess extensive experience in medical device development and artificial intelligence applications in healthcare.
Comprehensive training on the study protocol, device functionality, and proper image interpretation was provided to all investigators prior to study initiation. This training was delivered through an in-person study initiation meeting and supplemented with video recordings demonstrating proper device use and results interpretation. All training materials (presentation slides and video records) are maintained as essential study documents and can be provided upon formal request for audit or inspection purposes. This training ensured that all team members had a thorough understanding of the device, the study procedures, ISO 14155 compliance requirements, and Good Clinical Practice principles.
Investigators
Principal investigator
- Dr Miguel Sánchez Viera
Collaborators
- Dr Concetta D'Alessandro (IDEI)
- Dr Alejandra Capote (IDEI)
- Dr Pablo Lopez Andina (IDEI)
- Dr Allison Marie Bell-Smythe Sorg (IDEI)
- Dr Alejandra Vallejos (IDEI)
- Dr Isabel del Campo (IDEI)
- Dr Juliana Machado (IDEI)
- Dr Raúl Lucas Escobar (IDEI)
- Beatriz Torres (IDEI)
- Alfonso Medela (AI Labs Group S.L.)
- Taig Mac Carthy (AI Labs Group S.L.)
Centers
- IDEI centro dermatológico
External organization
No additional organizations, beyond those previously mentioned, contributed to the clinical research. The study was conducted with the collaboration and resources of the specified entities.
Ethics Committee
This study was reviewed and approved by the Ethics Committee for Research with Medicines of HM Hospitals (Comité de Ética en Investigación con Medicamentos de HM Hospitales), with approval reference number 24.12.2266-GHM, dated December 19, 2024.
The study was conducted in full compliance with the principles of the Declaration of Helsinki, Good Clinical Practice guidelines, and all applicable regulatory requirements for clinical investigations of medical devices. The ethics committee review ensured adherence to ethical standards for research involving human data and appropriate safeguards for participant privacy and data protection.
Sponsor and monitor
- Legit.Health ®
- AI Labs Group S.L.
- Gran Vía 1, BAT Tower, 48001 Bilbao, Bizkaia, Spain
Report annexes
- Ethics Committee resolution can be found in the document CEIm_Legit.Health_IDEI_2023.pdf.
- Instructions For Use (IFU) can be found in the protocol.
Signature meaning
The signatures for the approval process of this document can be found in the verified commits at the repository for the QMS. As a reference, the team members who are expected to participate in this document and their roles in the approval process, as defined in Annex I Responsibility Matrix of the GP-001, are:
- Author: Team members involved
- Reviewer: JD-003 Design & Development Manager, JD-004 Quality Manager & PRRC
- Approver: JD-001 General Manager