R-TF-015-006 Clinical investigation report
Research Title
Simulated-use multi-reader multi-case (MRMC) investigation of Legit.Health Plus (hereinafter "the device") as a diagnostic decision-support tool: effect on healthcare practitioners' top-1 diagnostic accuracy on a curated image set representative of Fitzpatrick phototype V and VI presentations.
Under MDCG 2020-6 Appendix III, this investigation constitutes Rank 11 evidence (simulated-use reader study on retrospective images); it is not clinical data on real patients within the meaning of MDR Article 2(48). Per MDCG 2020-1 §4.4, it contributes Pillar 3 Clinical Performance supporting evidence at Rank 11 — measuring the clinician's diagnostic decision-making when using the device's Top-5 prioritised differential view on Fitzpatrick V and VI presentations, positioned below the Rank 2–4 prospective real-patient studies that carry the primary Pillar 3 weight. The positioning of this investigation within the Clinical Evaluation Plan and Clinical Evaluation Report, and the bridging to real-world evidence required to support clinical-outcome claims, is addressed at CER level.
Product Identification
| Information | |
|---|---|
| Device name | Legit.Health Plus (hereinafter, the device) |
| Model and type | NA |
| Version | 1.1.0.0 |
| Basic UDI-DI | 8437025550LegitCADx6X |
| Certificate number (if available) | MDR 000000 (Pending) |
| EMDN code(s) | Z12040192 (General medicine diagnosis and monitoring instruments - Medical device software) |
| GMDN code | 65975 |
| EU MDR 2017/745 | Class IIb |
| EU MDR Classification rule | Rule 11 |
| Novel product (True/False) | TRUE |
| Novel related clinical procedure (True/False) | TRUE |
| SRN | ES-MF-000025345 |
Throughout this document, references to "the device" refer to the investigational product identified above.
Device version under investigation and bridging to the CE-marked release
The investigation was conducted using device version v1.1.0.0, which is the only version placed on the market and the version to which the present technical documentation applies. No intermediate development build was used during the conduct of the investigation, and no changes to the algorithm, model checkpoint, user interface, indications or claims occurred between the version evaluated in this investigation and the version submitted for CE marking. The bridging between the investigation-version and the CE-marked release is therefore an identity bridge; no clinical-relevance assessment of inter-version differences is required. The device-version statement has been reviewed and signed off by the PRRC.
Sponsor Identification and Contact
| Manufacturer data | |
|---|---|
| Legal manufacturer name | AI Labs Group S.L. |
| Address | Street Gran Vía 1, BAT Tower, 48001, Bilbao, Bizkaia (Spain) |
| SRN | ES-MF-000025345 |
| Person responsible for regulatory compliance | Alfonso Medela, Saray Ugidos |
| Email | office@legit.health |
| Phone | +34 638127476 |
| Trademark | Legit.Health |
| Authorized Representative | Not applicable (manufacturer is based in EU) |
Identification of the Clinical Investigation Plan (CIP)
| CIP | |
|---|---|
| Title of the clinical investigation | Multi-Reader Multi-Case Study for Evaluating the Diagnostic Performance of Healthcare Professionals Assisted by Legit.Health Plus on Fitzpatrick Phototype V–VI Skin Presentations |
| Device under investigation | Legit.Health Plus |
| Protocol version | Version 1.0 |
| Date | 2026-01-21 |
| Protocol code | LEGIT.HEALTH_MAN_2025 |
| Sponsor | AI Labs Group S.L. |
| Coordinating Investigator | Dr. Antonio Martorell Calatayud |
| Principal Investigator(s) | Dr. Antonio Martorell Calatayud |
| Investigational site(s) | This study is conducted remotely through a centralized web-based platform. |
| Ethics Committee | This study does not require Ethics Committee approval because it is observational and non-interventional. All data used consists of fully anonymized images sourced from public dermatology atlases and databases, containing no information permitting patient identification. As such, the research meets the criteria for exemption from ethics committee review under applicable regulatory frameworks. |
Trial Registrations
This investigation is not registered in ClinicalTrials.gov or the EMA RWD Catalogue (EUPAS). Registration is not required because this investigation does not meet the definition of a clinical investigation under MDR Article 2(45): no patients are involved, no clinical interventions are performed and no clinical decisions are influenced by the investigation. A voluntary post-hoc registration may be performed for transparency purposes.
Public Access Database
A public-facing results summary is not published through a trial registry for this investigation (see §Trial Registrations). The underlying image set and reader-level raw data are not publicly shared due to privacy and confidentiality considerations. The de-identified dataset backing the numerical tables in this report is held by the manufacturer within the QMS and is available for audit on request.
Research Team
Principal investigator
- Dr. Antonio Martorell Calatayud
Collaborators
- Medical staff: participating healthcare professionals are identified in this report by anonymised reader codes (R-01, R-02, …). The code-to-identity master list, together with reader CVs, certification evidence and signed participation agreements, is maintained by the Principal Investigator under restricted access and is available for audit on request.
- Manufacturer
- Mr. Taig Mac Carthy (Regulatory and Quality, manufacturer)
- Mr. Alfonso Medela (Chief Scientific Officer, manufacturer)
Centre
This investigation was conducted remotely through a centralised web-based platform operated by the manufacturer. Healthcare professionals acting as readers accessed the platform using individual user credentials (username and password); all assessments were recorded on the platform and access logs are maintained to ensure full traceability of reader interactions.
Compliance Statement
The clinical investigation was conducted according to the Clinical Investigation Plan (CIP) and other applicable guidance and regulations. This includes compliance with:
- The ethical principles originating from the World Medical Association's Declaration of Helsinki
- Harmonized standard UNE-EN ISO 14155:2020
- Regulation (EU) 2017/745 on medical devices (MDR), including the applicable General Safety and Performance Requirements (GSPR) as outlined in Annex I, and the requirements of Annex XV (Chapter I and Chapter II, Section 3)
- Harmonized standard UNE-EN ISO 13485:2016
- MDCG 2024-3 for its structural and content expectations, MDCG 2021-8 concerning application requirements, and MDCG 2020-10/1 Rev 1 for safety reporting timelines and definitions
- Regulation (EU) 2016/679 (GDPR)
- Spanish Organic Law 3/2018 on the Protection of Personal Data and guarantee of digital rights.
All data processing within the device is carried out in accordance with the highest standards of data protection and privacy. Patient information is managed in an encrypted manner to ensure confidentiality and security.
The research team assumes the role of Data Controller, responsible for the collection and management of study data. Legit.Health acts as the Data Processor and is not involved in the processing of patient data.
The storage and transfer of data comply with European data protection regulations. At the conclusion of the study, all information stored in the device will be permanently and securely deleted.
The device employs robust technical and organizational security measures to safeguard personal data against unauthorized access, alteration, loss, or processing.
Report Date
The signed report date is recorded on the signature block at the end of this report.
Report author(s)
The full name, the ID and the signature for the authorship, as well as the approval process of this document, can be found in the verified commits at the repository. This information is saved alongside the digital signature, to ensure the integrity of the document.
Table of contents
- Research Title
- Product Identification
- Sponsor Identification and Contact
- Identification of the Clinical Investigation Plan (CIP)
- Public Access Database
- Research Team
- Compliance Statement
- Report Date
- Report author(s)
- Table of contents
- Abbreviations and Definitions
- Summary
- Introduction
- Material and methods
- Results
- Initiation and completion
- Study population characteristics
- Study case set
- CIP Compliance and Deviations
- Primary Analysis
- Acceptance Criteria Verification
- Referral assessment (Stage 3)
- Device malignancy performance
- Adverse events and adverse reactions to the product
- Product deficiencies
- Subgroup analysis for special populations
- Discussion and Overall Conclusions
- Investigators and Administrative Structure of Clinical Research
- Report Annexes
Abbreviations and Definitions
- AE: Adverse Event
- AEMPS: Spanish Agency of Medicines and Medical Devices
- AEP: Adverse Reaction to Product
- AUC: Area Under the ROC Curve
- CAD: Computer-Aided Diagnosis
- CMD: Data Monitoring Committee
- CIP: Clinical Investigation Plan
- CUS: Clinical Utility Questionnaire
- DLQI: Dermatology Life Quality Index
- GCP: Good Clinical Practice
- ICH: International Council for Harmonisation (formerly International Conference on Harmonisation)
- IFU: Instructions For Use
- IRB: Institutional Review Board
- N/A: Not Applicable
- NCA: National Competent Authority
- PI: Principal Investigator
- PPV: Positive Predictive Value
- NPV: Negative Predictive Value
- SAE: Serious Adverse Event
- SAEP: Serious Adverse Event to Product
- SUAEP: Serious and Unexpected Adverse Event to the Product
- SUS: System Usability Scale
Summary
Title
Simulated-use multi-reader multi-case (MRMC) investigation of the device: effect on healthcare practitioners' top-1 diagnostic accuracy on a curated image set representative of Fitzpatrick phototype V and VI presentations.
Nature and positioning of the evidence
This is a simulated-use multi-reader multi-case (MRMC) investigation performed entirely on retrospective, fully anonymised dermatological images sourced from public dermatology atlases. No patients were newly recruited, no patient-identifiable data was processed and no therapeutic or diagnostic intervention was performed on any patient as a consequence of the investigation.
Under MDCG 2020-6 Appendix III, this kind of investigation constitutes Rank 11 evidence (simulated-use reader study on retrospective images); it is not clinical data on real patients within the meaning of MDR Article 2(48). Per MDCG 2020-1 §4.4 it contributes Pillar 3 Clinical Performance supporting evidence at Rank 11 — measuring the clinician's diagnostic decision-making when using the device's Top-5 prioritised differential view on curated phototype V and VI images — positioned below the Rank 2–4 prospective real-patient studies that carry the primary Pillar 3 weight. Extrapolation to real-world consulting populations, patient-outcome claims, time-to-correct-therapy claims, disease-burden claims or healthcare-economics claims is outside the scope of this report and is handled — with the appropriate real-world evidence — in the Clinical Evaluation Report (R-TF-015-003).
Introduction
Discrepancies between diagnoses made by primary-care practitioners and dermatologists are documented in the SotA literature (summarised in R-TF-015-011) with concordance rates between 57% and 65.52%; together with the limited availability of specialist dermatologists, this concordance gap motivates the development of diagnostic decision-support tools for HCPs. The source MRMC investigations of the device (SAN_2024, BI_2024, PH_2024) used public-atlas image sets that predominantly represented Fitzpatrick phototypes I–IV; Fitzpatrick phototypes V and VI were under-represented in those image sets. This investigation evaluates, under simulated-use conditions on a curated image set representative of Fitzpatrick V–VI presentations of the same dermatological conditions, whether the device changes HCP top-1 diagnostic accuracy relative to unaided reading of the same images.
Objectives
Primary objective
- To validate that the information provided by the device increases the top-1 diagnostic accuracy of healthcare professionals (HCPs) in the diagnosis of multiple dermatological conditions presented on Fitzpatrick phototype V–VI skin on the curated image set.
Secondary objectives
- To characterise, as an exploratory descriptive, the proportion of cases for which the reader, with the device's output available, considers that specialist referral is appropriate.
- To report, as an exploratory descriptive, device-level malignancy-detection performance (ROC AUC) on the curated image set against the atlas-labelled ground truth.
No confirmatory malignancy-specific diagnostic-accuracy claim or malignancy-referral-sensitivity claim is made on the basis of this investigation. The 10 malignant cases in the image set (7 melanoma and 3 basal cell carcinoma) are sufficient for descriptive reporting only; confirmatory malignancy performance is addressed by the dedicated NMSC clinical investigation cited in the Clinical Evaluation Report and by the Post-Market Clinical Follow-up Plan.
Acceptance criteria
The acceptance criteria for this investigation are specified in the Clinical Investigation Plan (R-TF-015-004). Their status against the locked dataset is reported in the Results section ("Acceptance Criteria Verification").
Population
The population consists of healthcare professionals (HCPs) representative of the device's declared intended user groups per the IFU — dermatologists, primary care / family and community medicine physicians and nurses with clinical responsibility for skin or wound assessment — evaluating anonymised images representative of Fitzpatrick phototype V and VI presentations of multiple dermatological conditions. The Clinical Investigation Plan requires a minimum of 5 HCPs in the primary analysis cohort, with open recruitment up to approximately 20 HCPs planned to tighten the confidence intervals on the primary pooled-accuracy endpoint. Realised enrolment and the primary-analysis cohort composition are reported in §Study population characteristics.
Sample size
The realised sample size for the primary analysis cohort is reported in the Results section (see §Primary Analysis and §CIP Compliance and Deviations). The investigation dataset satisfies the statistical-power requirements pre-specified in the Clinical Investigation Plan for the pooled paired top-1 diagnostic-accuracy endpoint.
Design and methods
This is a prospective, observational, multi-reader, multi-case (MRMC) self-controlled investigation using a progressive information-disclosure design. For each of the 149 clinical cases, every reader completes a sequence of three assessment stages with progressively more device output:
- Stage 1 (Unassisted diagnosis): the reader views the clinical image and patient anamnesis and provides their primary diagnosis without any device output.
- Stage 2 (Assisted diagnosis): the device's differential diagnosis (top-5 ICD-11 probability distribution) is additionally displayed. The reader provides their revised top-1 diagnosis.
- Stage 3 (Referral assessment): the device's malignancy probability, referral recommendation and diagnostic entropy are additionally displayed. The reader decides whether to refer the case to a specialist.
The three stages are completed sequentially for each case before the reader proceeds to the next case. The order of case presentation is independently randomised for each reader. The self-controlled paired comparison between Stage 1 and Stage 2 for the same reader on the same case constitutes the primary observation.
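The presentation scheme described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the function name `reader_schedule`, the stage labels and the seeding scheme (a base seed combined with the reader code) are hypothetical choices made here so that each reader receives an independent, reproducible case order.

```python
import random

# Stage labels are illustrative names for the three assessment stages.
STAGES = ("stage1_unassisted", "stage2_assisted", "stage3_referral")


def reader_schedule(reader_code, n_cases=149, base_seed=0):
    """Return the ordered (case_id, stage) sequence for one reader.

    The case order is shuffled independently per reader; within each case
    the three stages always run in the fixed order Stage 1 -> 2 -> 3.
    """
    # Hypothetical seeding scheme: one generator per reader code.
    rng = random.Random(f"{base_seed}:{reader_code}")
    case_order = list(range(1, n_cases + 1))
    rng.shuffle(case_order)
    # All three stages of a case are completed before the next case starts.
    return [(case_id, stage) for case_id in case_order for stage in STAGES]


if __name__ == "__main__":
    schedule = reader_schedule("R-01")
    print(len(schedule), schedule[:3])
```

Because Stage 1 always precedes Stage 2 within a case, the unassisted answer is fixed before any device output is shown, which is what makes the Stage 1 versus Stage 2 comparison a paired one.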
The three-stage progressive-disclosure protocol reproduced the user-facing outputs that the Instructions For Use require the integrating system to display: the Top-5 prioritised differential view (Stage 2), the malignancy probability gauge, the referral recommendation and the diagnostic-entropy indicator (Stage 3). The clinical performance measured in this investigation is therefore the clinical performance the device guarantees when integrated in accordance with the IFU's integration-requirements section; it is not contingent on integrator design choices.
Results
Summary results are provided in the Results section. The numerical tables below are computed from the locked, de-identified investigation dataset in accordance with the statistical methods pre-specified in the Clinical Investigation Plan; the dataset snapshot corresponding to the signed version of this report is held by the manufacturer within the QMS and is available for audit on request.
Conclusions
On the curated image set, the device significantly increases the pooled top-1 diagnostic accuracy of healthcare professionals evaluating Fitzpatrick phototype V–VI skin presentations. The effect is preserved in the board-certified sensitivity subset and is consistent with the effect sizes reported in the source MRMC investigations (SAN_2024, BI_2024, PH_2024). At Stage 3, device-assisted referral decisions achieve high sensitivity for malignant presentations. Under MDCG 2020-6 Appendix III these findings constitute Rank 11 simulated-use evidence; per MDCG 2020-1 §4.4 they contribute Pillar 3 Clinical Performance supporting evidence for Fitzpatrick-phototype generalisability (the clinician, using the device's Top-5 prioritised differential, makes measurably better diagnostic decisions on V and VI presentations) at a lower evidence rank than the prospective real-patient studies. They are distinct from real-world clinical data on patients under MDR Article 2(48) and are positioned within the overall body of clinical evidence at Clinical Evaluation Report level (R-TF-015-003).
Introduction
Discrepancies between diagnoses made by primary-care practitioners and dermatologists are documented in the SotA literature (summarised in R-TF-015-011) with concordance rates of 57%–65.5%. Limited specialist availability in some geographies is an additional and separate real-world constraint on dermatological care. Together these motivate the development of AI-based diagnostic decision-support tools for use by HCPs at the point of care.
The source MRMC investigations of the device (SAN_2024, BI_2024, PH_2024) used public-atlas image sets that predominantly represented Fitzpatrick phototypes I–IV; coverage of Fitzpatrick phototypes V and VI was limited in those image sets. To provide supporting Fitzpatrick-phototype generalisability evidence, 149 anonymised images representative of Fitzpatrick phototype V and VI presentations of the same dermatological conditions evaluated in the source investigations were sourced from public dermatological atlases and presented to healthcare professionals — dermatologists, primary-care physicians and nurses with skin or wound-assessment responsibility — through a centralised web-based platform operated by the manufacturer.
This investigation does not set out to measure real-world clinical outcomes, triage outcomes, teledermatology outcomes, time-to-treatment outcomes or healthcare-economics outcomes. Claims of that kind require real-world clinical data on real patients and are addressed at Clinical Evaluation Report level.
The scope of this investigation is narrower and specific: to measure, under simulated-use conditions on the curated Fitzpatrick V–VI image set, whether the device's output changes HCPs' top-1 diagnostic accuracy when compared with the same HCPs' unaided reading of the same images. Under MDCG 2020-6 Appendix III the resulting evidence is Rank 11 (simulated-use reader-based measurement). Per MDCG 2020-1 §4.4 it contributes Pillar 3 Clinical Performance supporting evidence for Fitzpatrick-phototype generalisability — specifically, that the clinician, using the device's Top-5 prioritised differential view, makes measurably better top-1 diagnostic decisions on Fitzpatrick V and VI presentations — at a lower evidence rank than the Rank 2–4 prospective real-patient studies. It does not substitute for, and is not classified as, Pillar 2 analytical-performance evidence; the Pillar 2 API-level analytical claim across the 346 ICD-11 categories is evidenced independently of this investigation by the manufacturer's technical-validation record and the four published peer-reviewed severity-validation studies.
Material and methods
Product Description
This section contains a short summary of the device. A complete description of the intended purpose, including device description, can be found in the record Legit.Health Plus description and specifications.
Product description
The device is a computational software-only medical device leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures. Its principal function is to provide a wide range of clinical data from the analyzed images to assist healthcare practitioners in their clinical evaluations and allow healthcare provider organisations to gather data and improve their workflows.
The generated data is intended to aid healthcare practitioners and organizations in their clinical decision-making process, thus enhancing the efficiency and accuracy of care delivery.
The device should never be used to confirm a clinical diagnosis. Rather, its result is one element of the overall clinical assessment. The device is designed to be used when a healthcare practitioner chooses to obtain additional information to inform a decision.
Intended purpose
The device is a computational software-only medical device intended to support health care providers in the assessment of skin structures, enhancing efficiency and accuracy of care delivery, by providing:
- quantification of intensity, count, extent of visible clinical signs
- interpretative distribution representation of possible International Classification of Diseases (ICD) categories.
Intended previous uses
No specific intended use was designated in prior stages of development.
Product changes during clinical research
The device maintained consistent performance and features throughout the entire clinical research process. No alterations or modifications were made during this period.
Clinical Investigation Plan
Objectives
This investigation validates that the information provided by the device increases the top-1 diagnostic accuracy of healthcare professionals in the diagnosis of multiple dermatological conditions presented on Fitzpatrick phototype V–VI skin on the curated image set.
Design
This is a prospective, observational, multi-reader, multi-case (MRMC) self-controlled investigation. The investigation does not involve an active or control group: each reader acts as their own comparator, first providing a top-1 diagnosis without the device (Stage 1) and then revising the top-1 diagnosis with the device's output available (Stage 2). The sequential per-case presentation ensures that the unassisted diagnosis is recorded before the reader sees any device output, preventing information leakage between conditions.
Ethical considerations
This study adhered to international Good Clinical Practice (GCP) guidelines, the Declaration of Helsinki in its latest amendment, and applicable international and national regulations. As applicable, approval from the relevant Ethics Committee was obtained prior to the initiation of the study. When applicable, modifications to the protocol were reviewed and approved by the Principal Investigator (PI) and subsequently evaluated by the Ethics Committee before subjects were enrolled under a modified protocol.
This study was conducted in compliance with European Regulation 2016/679, of 27 April, concerning the protection of natural persons with regard to the processing of personal data and the free movement of such data (General Data Protection Regulation, GDPR), and Organic Law 3/2018, of 5 December, on the Protection of Personal Data and the guarantee of digital rights. In accordance with these regulations, no data enabling the personal identification of participants was collected, and all information was managed securely in an encrypted format.
Participants were informed both orally and in writing about all relevant aspects of the study, with the information being tailored to their level of understanding. They were provided with a copy of the informed consent form and the accompanying patient information sheet. Adequate time was given to patients to ask questions and fully comprehend the details of the study before providing their consent.
The PI was responsible for the preparation of the informed consent form, ensuring it included all elements required by the International Conference on Harmonisation (ICH), adhered to current regulatory guidelines, and complied with the ethical principles of GCP and the Declaration of Helsinki.
The original signed informed consent forms were securely stored in a restricted access area under the custody of the PI. These documents remained at the research site at all times. Participants were provided with a copy of their signed consent form for their records.
Data confidentiality
Current legislation will be complied with in terms of data confidentiality protection (European Regulation 2016/679, of 27 April, on the protection of natural persons with regard to the processing of personal data and the free movement of such data and Organic Law 3/2018, of 5 December, on Personal Data Protection and guarantee of digital rights). For this purpose, when applicable, each participant will receive an alphanumeric identification code in the study that will not include any data allowing personal identification (coded CRD). The Principal Investigator will have an independent list that will allow the connection of the identification codes of the patients participating in the study with their clinical and personal data. This document will be filed in a secure area with restricted access, under the custody of the Principal Investigator and will never leave the centre.
Once the paper CRDs are completed and closed by the Principal Investigator, the data will be transferred to a database.
As in the CRDs, the database will comply with current legislation on data confidentiality protection (European Regulation 2016/679, of 27 April, on the protection of natural persons with regard to the processing of personal data and the free movement of such data, and Organic Law 3/2018, of 5 December, on the Protection of Personal Data and guarantee of digital rights); no data allowing personal identification of patients will be included.
Data Quality Assurance
The Principal Investigator is responsible for reviewing and approving the protocol, signing the Principal Investigator commitment, guaranteeing that the persons involved in the centre will respect the confidentiality of patient information and protect personal data, and reviewing and approving the final study report together with the sponsor. All the clinical members of the research team assess the eligibility of the patients in the study, inform and request written informed consent, collect the source data of the study in the clinical record and transfer them to the Data Collection Notebook (DCN) or Data Collection Forms (CRF).
Subject Population
The investigation enrolled healthcare professionals representative of the device's declared intended user groups per the IFU — dermatologists, primary care / family and community medicine physicians and nurses with clinical responsibility for skin or wound assessment. Each enrolled reader completed an onboarding flow that captured a professional-profile form, CV and certification or training evidence, a conflict-of-interest declaration, a signed participation agreement and a data-protection acknowledgement before access to the investigation's image-annotation workflow was enabled. Onboarding metadata is retained by the manufacturer within the QMS and is available for audit on request; it is summarised in the Reader Demographics table below.
Sample size
The Clinical Investigation Plan requires a minimum of 5 healthcare professionals to ensure valid MRMC statistics (Obuchowski 2004; Hillis 2011), with open recruitment up to approximately 20 HCPs planned to tighten the primary-endpoint confidence interval. At the 5-reader floor, 149 cases × 5 readers = 745 paired observations, which exceeds the threshold of approximately 200 discordant pairs required for a two-sided McNemar test detecting a 10 percentage-point improvement at α = 0.05 with 80 % power (Lachin 1992). Adjusted for within-reader correlation (ICC ≈ 0.15) the effective sample size at the floor is approximately 460 independent observations, with the primary-endpoint conclusion robust to ICC sensitivity between 0.05 and 0.30. The realised number of readers in the primary analysis cohort, the realised paired-observation count, the realised per-reader completion range and the realised discordant-pair count (b + c) driving the McNemar variant applied are reported in §Primary Analysis.
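The arithmetic in this paragraph can be checked with a short sketch. The text does not state the formula behind the "approximately 460" figure. The sketch assumes the Kish design effect, 1 + (m - 1) × ICC, with cluster size m = 5 (the five readings of each case at the reader floor). Both the formula and the cluster size are assumptions of this illustration, and a different clustering choice would give a different effective sample size.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equally sized clusters (assumed formula)."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_observations, cluster_size, icc):
    """Number of independent observations equivalent to the clustered sample."""
    return n_observations / design_effect(cluster_size, icc)


N_CASES = 149
N_READERS_FLOOR = 5
n_pairs = N_CASES * N_READERS_FLOOR  # 745 paired observations at the floor

if __name__ == "__main__":
    # ICC values span the sensitivity range quoted in the text.
    for icc in (0.05, 0.15, 0.30):
        n_eff = effective_sample_size(n_pairs, N_READERS_FLOOR, icc)
        print(f"ICC={icc:.2f}: effective n = {n_eff:.0f}")
```

Under these assumptions the ICC = 0.15 case gives roughly 466 effective observations, in line with the stated "approximately 460", and the 0.05 to 0.30 range gives roughly 621 down to 339.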
Inclusion criteria
Reader (HCP) inclusion criteria:
- Healthcare professionals whose clinical scope of practice routinely includes the assessment of skin conditions, falling into any of the following categories: (a) board-certified dermatologists or dermatology residents; (b) board-certified primary care / family and community medicine physicians or primary-care residents; or (c) nurses with clinical responsibility for skin or wound assessment.
- Completed onboarding flow: professional profile, CV, certification or training evidence, conflict-of-interest declaration, signed participation agreement and data-protection acknowledgement.
Image inclusion criteria:
- Anonymised images representative of Fitzpatrick phototype V or VI presentations of one of the pre-specified dermatological conditions, which have passed the per-image quality-control review.
Exclusion criteria
Reader (HCP) exclusion criteria:
- Current specialty or scope of practice that does not routinely include the clinical assessment of skin conditions (e.g. anatomical pathology, clinical neurophysiology, radiology, laboratory medicine or purely non-clinical administrative roles). Readers meeting this criterion are screened out and their submissions are excluded from all analyses.
- Declared conflict of interest that cannot be managed under the conflict-of-interest handling procedure.
- Onboarding flow not completed.
Image exclusion criteria:
- Images of insufficient quality for proper analysis after the quality-control review.
- Images in which the skin-condition morphology has been altered beyond clinical recognition.
Statistical Analysis
Pre-specified analyses
The primary endpoint is the paired difference in top-1 diagnostic accuracy (per reader, per case, correct vs incorrect against the atlas-labelled ICD-11 reference standard) between Stage 1 (unassisted) and Stage 2 (assisted), pooled across all readers in the primary analysis cohort. This paired binary outcome is analysed using a two-sided McNemar test for paired proportions (exact mid-P test when b + c < 25; continuity-corrected chi-square otherwise), at alpha = 0.05. Wilson score 95% confidence intervals are reported for each stage-specific accuracy; differences in paired proportions are reported with their Newcombe hybrid-score 95% confidence intervals.
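A minimal sketch of the test-selection rule and the Wilson interval described above, using only the Python standard library. It is an illustration under stated assumptions, not the manufacturer's analytics code. The function names are hypothetical, and the convention that `b` counts pairs correct only at Stage 1 and `c` counts pairs correct only at Stage 2 is chosen here for the example. The Newcombe hybrid-score interval for the paired difference is not covered.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% standard-normal quantile


def discordant_counts(stage1_correct, stage2_correct):
    """Return (b, c): pairs correct only at Stage 1, and only at Stage 2."""
    pairs = list(zip(stage1_correct, stage2_correct))
    b = sum(1 for s1, s2 in pairs if s1 and not s2)
    c = sum(1 for s1, s2 in pairs if s2 and not s1)
    return b, c


def mcnemar_p(b, c, small_sample_cutoff=25):
    """Two-sided McNemar p-value: mid-P if b + c < cutoff, else corrected chi-square."""
    n = b + c
    if n == 0:
        return "no discordant pairs", 1.0
    if n < small_sample_cutoff:
        k = min(b, c)
        pmf = math.comb(n, k) * 0.5 ** n
        cdf = sum(math.comb(n, x) for x in range(k + 1)) * 0.5 ** n
        # Mid-P: twice the one-sided tail, minus the point probability.
        return "exact mid-P", min(1.0, 2.0 * cdf - pmf)
    # Continuity-corrected statistic, referred to chi-square with 1 df.
    stat = max(abs(b - c) - 1, 0) ** 2 / n
    return "continuity-corrected chi-square", math.erfc(math.sqrt(stat / 2.0))


def wilson_interval(successes, n, z=Z_95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half
```

The chi-square tail probability uses the identity P(χ²₁ > x) = erfc(√(x/2)), which avoids a dependency on a statistics package.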
The pre-specified secondary endpoints are reported as exploratory descriptives (not used to support confirmatory claims on this investigation alone):
- Stage 3 referral decisions split by malignant and benign atlas-labelled ground truth; referral rate is reported with Wilson score 95 % confidence intervals. Given that the image set contains only 10 malignant cases, the Stage 3 malignant-case referral rate is a descriptive safety-relevant observation, not a confirmatory statistical claim.
- Device-level malignancy-detection ROC AUC against the atlas-labelled ground truth, reported as a descriptive point estimate with its 95 % confidence interval.
Exploratory and hypothesis-generating analyses
Per-pathology analyses (stratified by condition) and per-specialty analyses (dermatology, primary care, nursing) are pre-specified as exploratory, hypothesis-generating analyses. Because the sample size is powered only for the primary pooled-accuracy comparison and not for per-stratum comparisons, and because no adjustment for multiple comparisons is applied at this stratification level, p-values from these contrasts are reported for descriptive purposes and are not used to support confirmatory claims. Where a per-pathology cell contains fewer than 15 observations, the cell is flagged as having limited interpretability. A board-certified subset sensitivity analysis is also reported as descriptive.
Handling of incomplete reader data
The pre-specified primary analysis population comprises all cases reviewed by any enrolled HCP meeting the §Inclusion criteria under the Stage 1 → Stage 2 paired protocol (analysis at the paired-observation level, not at the reader level). Readers who completed only a partial number of cases contribute the observations they did complete. A sensitivity analysis restricted to HCPs who completed at least 50% of the 149 cases, and a further sensitivity analysis restricted to HCPs who completed all 149 cases, are performed and reported; the primary result is considered robust if the direction and magnitude of the estimated effect are preserved across both sensitivity analyses.
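The three analysis populations can be sketched as simple filters over per-reader completion counts. The snippet below is illustrative only: it uses the Stage 1 and Stage 2 counts tabulated in §CIP Compliance and Deviations, and it assumes, as a simplification not stated in the CIP, that a reader's completed paired cases are bounded by the smaller of the two stage counts. The names completed and population are hypothetical.

```python
TOTAL_CASES = 149

# (Stage 1, Stage 2) completed-case counts per primary-cohort reader,
# transcribed from the per-reader completion table.
completed = {
    "R-01": (149, 149), "R-02": (149, 149), "R-04": (149, 148),
    "R-05": (149, 149), "R-06": (149, 149), "R-07": (149, 149),
    "R-08": (147, 147), "R-09": (146, 147), "R-10": (149, 149),
    "R-12": (149, 149), "R-13": (149, 149), "R-14": (149, 149),
    "R-16": (149, 149), "R-17": (149, 148), "R-18": (149, 149),
    "R-19": (149, 149),
}


def population(min_fraction: float) -> list[str]:
    """Readers whose completed paired cases reach min_fraction of the case set.

    Assumption: paired completion is approximated by min(Stage 1, Stage 2).
    """
    threshold = min_fraction * TOTAL_CASES
    return sorted(r for r, (s1, s2) in completed.items() if min(s1, s2) >= threshold)


if __name__ == "__main__":
    print("Primary (all readers):", len(population(0.0)))
    print("At least 50% of cases:", len(population(0.5)))
    print("All 149 cases:", len(population(1.0)))
```

Under this assumption, every primary-cohort reader falls in the 50% population, so the first sensitivity analysis coincides with the primary analysis; only the full-completion restriction removes readers.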
Software
Analyses are performed using a deterministic, version-controlled statistical analytics environment maintained by the manufacturer, applying the pre-specified statistical methods (McNemar, Wilson, Newcombe, ROC AUC) to the de-identified exported dataset. All analytical code is held under version control and is available for audit on request.
Results
Initiation and completion
Data collection began on 21 January 2026 and ended on 17 April 2026; database closure and data lock were performed on 17 April 2026. The figures reported in the tables below are derived from the locked de-identified dataset.
Study population characteristics
Reader demographics — all enrolled readers
Cohort: all enrolled readers (enrolment transparency view, including screen failures) (n = 19).
Of these, 3 readers were screened out as protocol deviations (specialty outside the device's intended user scope); their submissions are excluded from all analyses. See CIP Compliance and Deviations.
| Characteristic | Value | n | % |
|---|---|---|---|
| Specialty | Dermatology | 9 | 47.4% |
| | General / Primary care | 5 | 26.3% |
| | Nursing | 3 | 15.8% |
| | Other | 2 | 10.5% |
| Qualification (CV-derived) | Fully qualified (attending / licensed) | 7 | 36.8% |
| | Resident / trainee (target specialty) | 9 | 47.4% |
| | Not target (screen failure) | 3 | 15.8% |
| AI experience | None | 5 | 26.3% |
| | Low | 5 | 26.3% |
| | Medium | 7 | 36.8% |
| | High | 2 | 10.5% |
| Years of experience | 0–4 years | 11 | 57.9% |
| | 5–9 years | 1 | 5.3% |
| | 10–14 years | 4 | 21.1% |
| | 15+ years | 3 | 15.8% |
| Country | Spain | 19 | 100.0% |
Reader demographics — primary analysis cohort
The primary analysis cohort consists of readers meeting the CIP §Inclusion criteria — healthcare professionals whose clinical scope of practice includes skin assessment (dermatology, primary care, nursing) — with completed onboarding and no documented screen-failure deviation. Enrolment proceeded through a published-invitation pathway (direct invitation to HCPs in the Principal Investigator's professional network and to clinical contacts of the manufacturer); every onboarding record was complete at the data-lock date, including a signed conflict-of-interest declaration. The cohort is split across the three intended-user specialties (dermatology, primary care, nursing) per the demographics table below; per-specialty cell counts are made available in the table and support the exploratory per-specialty descriptives reported in §Primary Analysis — By specialty.
The enrolment funnel for this investigation is: 19 HCPs onboarded → 3 documented screen failures (R-03, R-11, R-15; see §CIP Compliance and Deviations) → 16 HCPs in the primary analysis cohort, 0 post-onboarding withdrawals and 0 post-lock exclusions.
Cohort: primary analysis cohort — healthcare professionals meeting the CIP §Inclusion criteria (dermatology, primary care, nursing), excluding documented screen failures (n = 16).
| Characteristic | Value | n | % |
|---|---|---|---|
| Specialty | Dermatology | 9 | 56.3% |
| | General / Primary care | 4 | 25.0% |
| | Nursing | 3 | 18.8% |
| Qualification (CV-derived) | Fully qualified (attending / licensed) | 7 | 43.8% |
| | Resident / trainee (target specialty) | 9 | 56.3% |
| AI experience | None | 4 | 25.0% |
| | Low | 3 | 18.8% |
| | Medium | 7 | 43.8% |
| | High | 2 | 12.5% |
| Years of experience | 0–4 years | 8 | 50.0% |
| | 5–9 years | 1 | 6.3% |
| | 10–14 years | 4 | 25.0% |
| | 15+ years | 3 | 18.8% |
| Country | Spain | 16 | 100.0% |
The table above reports, for the primary analysis cohort, the per-specialty reader counts (dermatology, primary care, nursing), the board-certified vs trainee split, the years of documented clinical experience per reader, and the geographic distribution. Pooled estimates in this report are reported with their confidence intervals; per-specialty descriptives are reported as exploratory only because the per-specialty stratum sizes are not sized for confirmatory statistical comparison.
Study case set
The 149 clinical cases represent Fitzpatrick phototype V and VI presentations of the dermatological conditions evaluated in the three source MRMC investigations (SAN_2024, BI_2024, PH_2024), sourced from public dermatological atlases.
Total cases: 149. Malignant: 10 (6.7%). Benign: 139 (93.3%). Fitzpatrick phototype V: 69; phototype VI: 80.
| ICD-11 code | Condition | Total | FP-V | FP-VI | Risk class |
|---|---|---|---|---|---|
| EA89 | Eczematous dermatitis | 13 | 6 | 7 | Benign |
| ED92.0 | Hidradenitis suppurativa | 11 | 5 | 6 | Benign |
| ED80.3 | Nodular acne | 10 | 3 | 7 | Benign |
| 1B72 | Impetigo | 9 | 3 | 6 | Benign |
| EA90.40 | Generalised pustular psoriasis | 8 | 2 | 6 | Benign |
| 1F28 | Tinea | 8 | 3 | 5 | Benign |
| ED80.Z | Acne | 7 | 3 | 4 | Benign |
| 2C30 | Cutaneous melanoma | 7 | 5 | 2 | Malignant |
| 2F20.Z | Melanocytic nevus | 7 | 5 | 2 | Benign |
| EA90.0 | Plaque psoriasis | 7 | 4 | 3 | Benign |
| EA81 | Seborrhoeic dermatitis | 6 | 3 | 3 | Benign |
| 2F21.0 | Seborrhoeic keratosis | 6 | 2 | 4 | Benign |
| EB05 | Urticaria | 6 | 2 | 4 | Benign |
| EA90.42 | Palmoplantar pustulosis | 5 | 1 | 4 | Benign |
| EB40 | Pemphigus | 5 | 0 | 5 | Benign |
| ED80.4 | Severe inflammatory acne | 5 | 3 | 2 | Benign |
| EB2Y | Subcorneal pustular dermatosis | 5 | 2 | 3 | Benign |
| EH67.0 | Acute generalized exanthematous pustulosis | 4 | 3 | 1 | Benign |
| 2C32 | Basal cell carcinoma | 3 | 1 | 2 | Malignant |
| EA90 | Psoriasis | 3 | 2 | 1 | Benign |
| EK90.0 · XH36H6 | Actinic keratosis | 2 | 2 | 0 | Benign |
| ED70 | Alopecia | 2 | 2 | 0 | Benign |
| 2F20.1 | Atypical melanocytic nevus | 2 | 2 | 0 | Benign |
| EE12.1 | Onychomycosis | 2 | 0 | 2 | Benign |
| EA90.4 | Pustular psoriasis | 2 | 2 | 0 | Benign |
| 1E91 | Zoster | 2 | 2 | 0 | Benign |
| EE80.0 | Granuloma annulare | 1 | 0 | 1 | Benign |
| EH90.Z | Pressure ulcer | 1 | 1 | 0 | Benign |
CIP Compliance and Deviations
The investigation was conducted in substantial compliance with the Clinical Investigation Plan. Per-reader completion is summarised below. Readers are analysed at the paired-observation level per the pre-specified primary analysis population (see §Statistical Analysis — Handling of incomplete reader data); sensitivity analyses restricted to readers completing at least 50% of the 149 cases and to readers completing all 149 cases are reported alongside the primary analysis.
Total planned observations per reader: 3 stages × 149 cases = 447. Primary cohort: 16 readers. Primary cohort readers with full completion of all three stages: 12 of 16. Three readers were flagged as screen failures; their submissions are retained in the dataset for audit but excluded from all analyses.
| Reader | Qualification (CV-derived) | Stage 1 (unassisted) | Stage 2 (assisted) | Stage 3 (referral) | Cohort |
|---|---|---|---|---|---|
| R-01 | Dermatology MIR resident | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-02 | Family & community medicine MIR R2 | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-04 | Dermatology MIR resident | 149 / 149 | 148 / 149 | 149 / 149 | Primary |
| R-05 | Board-certified dermatologist | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-06 | Licensed nurse (skin/wound assessment) | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-07 | Dermatology MIR R2 | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-10 | Board-certified dermatologist | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-12 | Dermatology MIR R3 | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-13 | Family & community medicine MIR R3 | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-14 | Board-certified family & community medicine attending | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-16 | Licensed nurse (skin/wound assessment) | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-17 | Licensed nurse (skin/wound assessment) | 149 / 149 | 148 / 149 | 149 / 149 | Primary |
| R-18 | Family & community medicine MIR R2 | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-19 | Dermatology MIR R3 | 149 / 149 | 149 / 149 | 149 / 149 | Primary |
| R-08 | Board-certified dermatologist | 147 / 149 | 147 / 149 | 147 / 149 | Primary |
| R-09 | Dermatology MIR R3 | 146 / 149 | 147 / 149 | 147 / 149 | Primary |
| R-03 | Clinical neurophysiology MIR R4 | 149 / 149 | 149 / 149 | 149 / 149 | Screen failure (excluded) |
| R-11 | Anatomical pathology MIR R1 | 149 / 149 | 149 / 149 | 149 / 149 | Screen failure (excluded) |
| R-15 | Medical graduate, no MIR (pre-residency) | 149 / 149 | 149 / 149 | 149 / 149 | Screen failure (excluded) |
Protocol deviations — screen failures
During the investigation, three readers were identified whose professional scope lies outside the device's declared intended user population per CIP §Exclusion criteria. Each was confirmed, through review of the CV uploads provided during onboarding, not to meet the §Inclusion criteria (dermatology, primary care or nursing with skin or wound-assessment scope). Their submissions are retained in the underlying dataset for audit traceability but are excluded from the primary analysis, from the board-certified sensitivity subset and from all per-specialty breakdowns.
- R-03: self-reported specialty "other"; actual specialty confirmed from CV as a resident in Clinical Neurophysiology. Clinical neurophysiology is an electrodiagnostic specialty and does not form part of the device's intended user population. Screen failure documented on 17 April 2026 as part of the data-lock pass.
- R-11: self-reported specialty "other"; actual specialty confirmed from CV as a resident in Anatomical Pathology / histopathology. Anatomical pathology is a histological specialty based on microscopic examination of tissue specimens and does not form part of the device's intended user population. Screen failure documented on 17 April 2026 as part of the data-lock pass.
- R-15: self-reported specialty "general"; CV confirms a completed medical degree without an ongoing residency programme in any target specialty. Does not meet any of the three §Inclusion categories (dermatology, primary care or nursing with skin or wound-assessment scope). Screen failure documented on 17 April 2026 as part of the data-lock pass.
Protocol deviations — partial completion
Every reader who satisfied the CIP §Inclusion criteria met or exceeded the pre-specified 95% completion threshold at the data-lock date across all three stages (diagnosis, assisted diagnosis, referral). Per-reader completion figures are tabulated above; the small number of individually skipped cases in the primary analysis cohort is well below the 5% tolerance and is handled by the paired-observation primary analysis population.
No other protocol deviations have been identified as of the report date.
Primary Analysis
Paired diagnostic accuracy (primary endpoint)
The primary endpoint is the pooled paired difference in top-1 diagnostic accuracy between the Stage 1 (unassisted) and Stage 2 (device-assisted) conditions, computed at the paired-observation level across the 16 readers in the primary analysis cohort × 149 cases. Results are reported for the primary cohort (healthcare professionals meeting the CIP §Inclusion criteria — dermatology, primary care, nursing — with documented screen failures excluded) and, as a sensitivity analysis, for the board-certified subset of that primary cohort.
The table below summarises, for each cohort (primary and board-certified subset): the number of readers, the number of paired observations contributed, the Stage 1 (unassisted) and Stage 2 (device-assisted) pooled top-1 accuracies with Wilson score 95 % confidence intervals, the paired difference (Stage 2 − Stage 1) with its Newcombe hybrid-score 95 % confidence interval, the discordant-pair count (b + c) driving the McNemar test, the McNemar test variant applied (exact mid-P where b + c < 25; continuity-corrected χ² otherwise) and the resulting p-value. The pre-specified primary-endpoint acceptance criterion is a positive paired difference of at least +10 percentage points, tested with McNemar's test at α = 0.05. The table is rendered as a static tabular figure in the exported PDF.
| Cohort | Readers | Paired obs. | Unassisted accuracy (95% CI) | Assisted accuracy (95% CI) | Absolute improvement (95% CI) | McNemar p |
|---|---|---|---|---|---|---|
| Primary cohort HCPs per CIP §Inclusion | 16 | 2376 | 41.8% [39.8%, 43.8%] | 65.1% [63.1%, 67.0%] | +23.3 pp [+21.5 pp, +25.1 pp] | < 0.0001 |
| Board-certified subset Sensitivity analysis | 7 | 1039 | 38.5% [35.6%, 41.5%] | 69.6% [66.7%, 72.3%] | +31.1 pp [+28.1 pp, +34.2 pp] | < 0.0001 |
The primary-cohort improvement is the primary regulatory endpoint of the investigation. The board-certified subset improvement is reported as a sensitivity analysis to demonstrate that the effect is preserved among the most rigorously qualified readers; it is not an acceptance-critical figure on its own. The per-reader completion range (minimum, median, maximum number of cases completed) is reported in the Completion Summary above.
By specialty (exploratory)
Per-specialty results are reported below as an exploratory, hypothesis-generating analysis (see §Statistical Analysis). Given the limited number of readers per specialty stratum, per-stratum estimates are subject to higher variability than the pooled primary estimate.
| Specialty | Readers | Paired obs. | Unassisted | Assisted | Δ | McNemar p |
|---|---|---|---|---|---|---|
| Dermatology | 9 | 1334 | 47.9% | 64.2% | +16.3 pp | < 0.0001 |
| Primary care | 4 | 596 | 36.6% | 64.9% | +28.4 pp | < 0.0001 |
| Nursing | 3 | 446 | 30.5% | 67.7% | +37.2 pp | < 0.0001 |
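As a consistency check on the table above, the per-specialty figures can be recombined into the pooled primary-cohort accuracies by weighting each stratum by its number of paired observations. The sketch below uses only the rounded values printed in this report, so agreement is expected only to within rounding (roughly ±0.1 pp); the variable names are illustrative.

```python
# (paired observations, unassisted accuracy %, assisted accuracy %) per specialty,
# transcribed from the per-specialty table.
strata = {
    "Dermatology": (1334, 47.9, 64.2),
    "Primary care": (596, 36.6, 64.9),
    "Nursing": (446, 30.5, 67.7),
}

total_obs = sum(n for n, _, _ in strata.values())

# Observation-weighted means reproduce the pooled accuracies.
pooled_unassisted = sum(n * u for n, u, _ in strata.values()) / total_obs
pooled_assisted = sum(n * a for n, _, a in strata.values()) / total_obs

if __name__ == "__main__":
    print(f"Total paired observations: {total_obs}")
    print(f"Pooled unassisted accuracy: {pooled_unassisted:.1f}%")
    print(f"Pooled assisted accuracy:   {pooled_assisted:.1f}%")
```

Because dermatology contributes more than half of the paired observations, the pooled improvement sits closer to the dermatology stratum than an unweighted mean of the three per-specialty differences would.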
By pathology (exploratory)
Per-pathology accuracy breakdowns for the primary cohort are reported below as an exploratory, hypothesis-generating analysis. Conditions with fewer than 15 paired observations are reported separately because the per-condition estimates at low counts are dominated by individual case outcomes.
Cohort: primary analysis cohort (all HCPs per CIP §Inclusion). Conditions with ≥ 15 paired observations shown first; lower-powered conditions follow and should be interpreted as exploratory.
| ICD-11 | Condition | Cases | Paired obs. | Unassisted | Assisted | Δ | Risk class |
|---|---|---|---|---|---|---|---|
| EA89 | Eczematous dermatitis | 13 | 207 | 62.3% | 84.5% | +22.2 pp | Benign |
| ED92.0 | Hidradenitis suppurativa | 11 | 176 | 84.7% | 94.9% | +10.2 pp | Benign |
| ED80.3 | Nodular acne | 10 | 158 | 20.3% | 53.2% | +32.9 pp | Benign |
| 1B72 | Impetigo | 9 | 142 | 31.7% | 72.5% | +40.8 pp | Benign |
| 1F28 | Tinea | 8 | 128 | 39.8% | 66.4% | +26.6 pp | Benign |
| EA90.40 | Generalised pustular psoriasis | 8 | 128 | 14.8% | 39.8% | +25.0 pp | Benign |
| 2F20.Z | Melanocytic nevus | 7 | 112 | 31.3% | 52.7% | +21.4 pp | Benign |
| 2C30 | Cutaneous melanoma | 7 | 112 | 63.4% | 79.5% | +16.1 pp | Malignant |
| ED80.Z | Acne | 7 | 112 | 34.8% | 46.4% | +11.6 pp | Benign |
| EA90.0 | Plaque psoriasis | 7 | 111 | 36.9% | 45.9% | +9.0 pp | Benign |
| EB05 | Urticaria | 6 | 96 | 41.7% | 82.3% | +40.6 pp | Benign |
| 2F21.0 | Seborrhoeic keratosis | 6 | 96 | 78.1% | 89.6% | +11.5 pp | Benign |
| EA81 | Seborrhoeic dermatitis | 6 | 96 | 45.8% | 78.1% | +32.3 pp | Benign |
| EB40 | Pemphigus | 5 | 80 | 28.7% | 62.5% | +33.8 pp | Benign |
| ED80.4 | Severe inflammatory acne | 5 | 80 | 15.0% | 56.3% | +41.3 pp | Benign |
| EA90.42 | Palmoplantar pustulosis | 5 | 80 | 62.5% | 86.3% | +23.8 pp | Benign |
| EB2Y | Subcorneal pustular dermatosis | 5 | 79 | 0.0% | 0.0% | +0.0 pp | Benign |
| EH67.0 | Acute generalized exanthematous pustulosis | 4 | 64 | 0.0% | 0.0% | +0.0 pp | Benign |
| 2C32 | Basal cell carcinoma | 3 | 48 | 41.7% | 77.1% | +35.4 pp | Malignant |
| EA90 | Psoriasis | 3 | 48 | 18.8% | 45.8% | +27.1 pp | Benign |
| ED70 | Alopecia | 2 | 32 | 84.4% | 96.9% | +12.5 pp | Benign |
| 2F20.1 | Atypical melanocytic nevus | 2 | 32 | 34.4% | 65.6% | +31.3 pp | Benign |
| EE12.1 | Onychomycosis | 2 | 32 | 68.8% | 87.5% | +18.8 pp | Benign |
| EA90.4 | Pustular psoriasis | 2 | 32 | 3.1% | 31.3% | +28.1 pp | Benign |
| 1E91 | Zoster | 2 | 32 | 62.5% | 93.8% | +31.3 pp | Benign |
| EK90.0 · XH36H6 | Actinic keratosis | 2 | 31 | 64.5% | 90.3% | +25.8 pp | Benign |
| EE80.0 | Granuloma annulare | 1 | 16 | 25.0% | 87.5% | +62.5 pp | Benign |
| EH90.Z | Pressure ulcer | 1 | 16 | 25.0% | 31.3% | +6.3 pp | Benign |
Acceptance Criteria Verification
The table below reports every pre-specified acceptance criterion defined in the Clinical Investigation Plan together with its status against the locked dataset. For each criterion it states: the metric, the analysis population, the pre-specified threshold (type and value), the observed point estimate on the primary cohort with its 95 % confidence interval, the pre-specified primary / secondary / exploratory designation, and the pass / fail / descriptive verdict. The single primary confirmatory criterion is the paired +10 percentage-point improvement in pooled top-1 diagnostic accuracy; its verdict is reported first in the table. All remaining criteria are reported as secondary or exploratory descriptives. The table is rendered as a static tabular figure in the exported PDF.
| ID | Acceptance criterion (per CIP) | Threshold | Observed | Status |
|---|---|---|---|---|
| AC-1 | Top-1 diagnostic accuracy of the primary cohort improves by at least the pre-specified effect-size threshold under device assistance, with a statistically significant paired difference. | Absolute improvement ≥ 10 pp AND McNemar p < 0.05 | 41.8% → 65.1% (Δ = +23.3 pp, 95% CI [+21.5 pp, +25.1 pp]); McNemar p < 0.0001 | Met |
| AC-2 | Device-assisted top-1 diagnostic accuracy on the curated image set meets or exceeds the state-of-the-art absolute-value threshold for aided-HCP performance. | Assisted accuracy ≥ 60.0% (SotA baseline from R-TF-015-011) | 65.1% (95% CI [63.1%, 67.0%]) | Met |
| AC-3 | Reader referral sensitivity for malignant cases when using the device output meets the safety-critical pre-specified threshold. | Malignant-case referral sensitivity ≥ 90.0% | 93.8% (150 / 160; 95% CI [88.9%, 96.6%]) | Met |
Interpretation. Each acceptance criterion above resolves as follows against the primary analysis cohort:
- AC-1 (Met): Device-assisted accuracy (65.1%) exceeds unassisted accuracy (41.8%) by +23.3 pp; the paired McNemar test rejects the null hypothesis of no difference at p < 0.0001. Both the effect-size threshold (pre-specified in performanceClaims.ts row M2N) and the statistical-significance condition are satisfied on the primary cohort.
- AC-2 (Met): Observed device-assisted top-1 accuracy of 65.1% on the primary cohort meets the SotA-derived absolute-value threshold of 60.0% pre-specified in performanceClaims.ts row M2A.
- AC-3 (Met): Readers referred 150 of 160 malignant presentations at Stage 3 (with device output visible), above the 90.0% safety-critical threshold pre-specified in performanceClaims.ts row M2R.
Referral assessment (Stage 3)
At Stage 3, each reader saw the device's malignancy gauge and referral recommendation in addition to the information from Stages 1 and 2, and then decided whether the case should be referred to a specialist. The table below reports, for the primary analysis cohort, the malignant-case and benign-case referral rates with Wilson score 95 % confidence intervals, together with descriptive counts for each stratum. Given that the image set contains only 10 malignant cases (7 melanoma, 3 basal cell carcinoma), these referral rates are reported as exploratory safety-relevant descriptives; no confirmatory malignancy-referral-sensitivity claim is made on the basis of this investigation.
Cohort: primary analysis cohort (all HCPs per CIP §Inclusion). "Sensitivity to malignancy" is the proportion of malignant cases the reader flagged for referral after seeing the device output.
| Subset | Observations | Referred (n, %) | 95% CI | Not referred (n, %) |
|---|---|---|---|---|
| All cases | 2380 | 1495 (62.8%) | 60.9%–64.7% | 885 (37.2%) |
| Malignant cases | 160 | 150 (93.8%) | 88.9%–96.6% | 10 (6.3%) |
| Benign cases | 2220 | 1345 (60.6%) | 58.5%–62.6% | 875 (39.4%) |
- Sensitivity to malignancy: 93.8% (150 / 160; 95% CI 88.9%–96.6%).
- Specificity at referral (benign-not-referred): 39.4% (875 / 2220; 95% CI 37.4%–41.5%).
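The Wilson score intervals quoted above can be reproduced from the counts alone. The following is a minimal sketch, assuming the pooled reader-case observations are treated as independent binomial trials (which is what the reported figures appear to reflect); the function name wilson_interval is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Counts taken from the Stage 3 referral table.
    for label, k, n in [("Sensitivity to malignancy", 150, 160),
                        ("Specificity at referral", 875, 2220)]:
        lo, hi = wilson_interval(k, n)
        print(f"{label}: {k / n:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

For 150 / 160 the lower bound evaluates to about 88.9%, just below the 90.0% threshold, which is consistent with the report treating the malignant-case referral rate as a descriptive observation.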
Device malignancy performance
The device produces a continuous malignancy probability for every case. The table below reports the device's case-level malignancy-detection performance on the curated Fitzpatrick V–VI image set: the ROC AUC point estimate with its 95 % confidence interval, together with sensitivity and specificity at a pre-specified operating point, reported as an exploratory descriptive for comparability with the case-level malignancy output observed on other investigation image sets. Given the 10-malignant-case size of the image set, no confirmatory malignancy-detection claim is made on the basis of this investigation.
This subset evaluates the device's malignancy classification at the case level. Reader-level malignancy sensitivity and specificity are derived from the referral decisions and reported separately.
| Metric | Value |
|---|---|
| Cases with device malignancy output | 149 |
| Malignant cases (atlas label) | 10 (6.7%) |
| Benign cases | 139 |
| Device malignancy ROC AUC | 0.878 |
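For readers less familiar with the metric, the ROC AUC equals the probability that a randomly chosen malignant case receives a higher malignancy score than a randomly chosen benign case, with ties counted as one half. The sketch below computes it directly from that definition; the scores shown are hypothetical and are not drawn from the investigation dataset.

```python
def roc_auc(scores_malignant: list[float], scores_benign: list[float]) -> float:
    """ROC AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Equivalent to the normalised Mann-Whitney U statistic; ties count 0.5.
    """
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


if __name__ == "__main__":
    # Hypothetical device malignancy probabilities, for illustration only.
    malignant = [0.91, 0.72, 0.35]
    benign = [0.10, 0.22, 0.40, 0.05]
    print(f"AUC = {roc_auc(malignant, benign):.3f}")
```

With 10 malignant and 139 benign cases, the estimate averages over 10 × 139 = 1390 case pairs, all of which involve one of only 10 malignant images; this is one reason the figure is reported as exploratory.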
Adverse events and adverse reactions to the product
Throughout the investigation, no adverse events or adverse reactions related to the investigational product have been observed. The investigation is non-interventional and no patients are involved; foreseeable adverse events are documented in R-TF-013-002 Risk Management Record.
Product deficiencies
No deficiencies in the product have been observed during the course of this investigation.
Subgroup analysis for special populations
No paediatric or geriatric subgroup analysis is performed: the images in this investigation represent Fitzpatrick V–VI presentations of dermatological conditions in adult patients, consistent with the source investigations. Paediatric subgroup analysis is reported in the source investigations' CIRs where adequately powered.
Discussion and Overall Conclusions
Clinical Performance, Efficacy and Safety
The primary analysis demonstrates a statistically significant improvement in pooled top-1 diagnostic accuracy on the curated image set when healthcare professionals meeting the CIP §Inclusion criteria are assisted by the device on Fitzpatrick phototype V–VI presentations. The effect is preserved in the board-certified subset sensitivity analysis, indicating that the benefit is not an artefact of including trainees or non-board-certified readers. The within-reader, within-case paired comparison controls for inter-reader variability and case-difficulty variability; the strict-ICD-11-match criterion is conservative by construction, and because it is applied uniformly to both arms of the paired comparison, any reference-standard or vocabulary effect applies equally to the unassisted and assisted reads and does not bias the paired difference. Per-pathology analyses show improvements across the majority of conditions, with the largest absolute gains for conditions that are challenging in the unassisted condition; per-pathology and per-specialty cells are exploratory and are reported descriptively.
Two conditions in the image set — Subcorneal pustular dermatosis and Acute generalised exanthematous pustulosis — are mapped to ICD-11 codes EB2Y and EH67.0 respectively, following the code assignments used in the BI_2024 source investigation. Top-1 strict-match accuracy for these two conditions remains conservative because the reader interface enforces selection from the ICD-11 vocabulary; the primary endpoint is robust to this because any reference-standard or vocabulary effect applies equally to the unassisted and assisted arms.
At Stage 3, the device-assisted referral decision for malignant presentations is reported as a descriptive safety-relevant observation, not as a confirmatory statistical claim: the image set contains only 10 malignant cases (7 melanoma, 3 basal cell carcinoma), and the resulting confidence interval is too wide to support a confirmatory malignancy-referral-sensitivity conclusion on this investigation alone. The device's case-level malignancy ROC AUC on the curated V–VI image set is descriptively consistent with the clinical utility of the Stage 3 referral pathway but is likewise exploratory. Confirmatory malignancy-detection and malignancy-referral evidence is addressed by the dedicated NMSC clinical investigation referenced in the Clinical Evaluation Report and by the Post-Market Clinical Follow-up Plan (R-TF-007-002).
Conclusions
On the curated image set, the device significantly improves the pooled top-1 diagnostic accuracy of healthcare professionals evaluating Fitzpatrick phototype V–VI skin presentations. The effect size observed in the primary cohort is consistent with the effect sizes reported in the source MRMC investigations (SAN_2024, BI_2024, PH_2024). Stage 3 device-assisted referral decisions and the device's case-level malignancy ROC AUC are reported as exploratory descriptives; no confirmatory malignancy-specific conclusion is drawn from the 10 malignant cases in this image set, and confirmatory malignancy-detection evidence is provided by the dedicated NMSC clinical investigation and PMCF activities referenced in the Clinical Evaluation Report.
These findings address the Fitzpatrick-phototype V and VI coverage gap in the pre-existing clinical evidence base and provide supporting clinical evidence for the device's generalisability across the full Fitzpatrick scale. Under MDCG 2020-6 Appendix III, this investigation constitutes Rank 11 simulated-use evidence; per MDCG 2020-1 §4.4 it contributes Pillar 3 Clinical Performance supporting evidence for Fitzpatrick-phototype generalisability — specifically, that the clinician, using the device's Top-5 prioritised differential view (with the malignancy gauge and referral recommendation disclosed at Stage 3), makes measurably better diagnostic decisions on V and VI presentations — and is distinct from clinical data generated on real patients within the meaning of MDR Article 2(48). Real-world Pillar 3 Clinical Performance evidence (at Ranks 2–4) on Fitzpatrick V and VI patients in routine care is a Post-Market Clinical Follow-up (PMCF) commitment addressed at Clinical Evaluation Report level.
Implications for Future Research
This investigation does not support real-world patient-outcome claims, workflow or time-to-treatment claims or healthcare-economics claims for Fitzpatrick V–VI populations; those questions require real-world clinical data on real patients and are explicitly outside the scope of this report. They are addressed at Clinical Evaluation Report level through Post-Market Clinical Follow-up activities.
The curated phototype V–VI image set provides a reusable corpus that may be used for continued post-market monitoring. Future PMCF activities may include:
- Routine re-evaluation of the image set against subsequent device versions to monitor for phototype-specific performance drift.
- Prospective collection of real-world Fitzpatrick V and VI patient images and reader-level observations in consulting populations.
Limitations of Clinical Research
- Reader cohort size: the primary cohort of 16 readers meets the MRMC minimum and the CIP sample-size calculation for the pooled primary endpoint, but larger cohorts would tighten the confidence intervals on per-pathology and per-specialty estimates.
- Atlas ground truth: the reference standard is the atlas-labelled diagnosis, which is not uniformly histopathologically confirmed. The 10 malignant cases (7 melanoma, 3 basal cell carcinoma) in particular are not histopathology-confirmed and are not externally adjudicated by a second blinded reviewer. The self-controlled paired design mitigates this limitation for the primary paired-accuracy endpoint because any reference-standard error affects both the unassisted and assisted arms equally; the device-level malignancy-ROC-AUC and Stage 3 malignant-referral descriptives do not benefit from the same mitigation, which is why they are reported as exploratory and not as confirmatory evidence.
- Per-pathology cell sizes: 28 dermatological conditions are represented in 149 cases, and several per-pathology cells have fewer than 15 observations. Per-pathology estimates at low cell counts are dominated by individual case outcomes and are reported descriptively only.
- Malignant-case sample size: only 10 malignant cases are present in the image set; descriptive Stage 3 malignant-referral and device-level ROC AUC observations are reported, but no confirmatory malignancy-specific diagnostic-accuracy or malignancy-referral-sensitivity claim is made on this investigation. Confirmatory malignancy evidence is provided by the dedicated NMSC clinical investigation referenced in the Clinical Evaluation Report and by the PMCF Plan (R-TF-007-002).
- Reader screening: three readers (R-03, R-11, R-15) were screened out at the data-lock pass on 17 April 2026 on the basis of documented CV evidence that their professional scope lay outside the device's declared intended user population. The onboarding record retained this evidence throughout the investigation; the deviation register and enrolment funnel are reproduced in §CIP Compliance and Deviations. A CAPA has been raised to add a specialty-scope gate at onboarding so that equivalent out-of-scope enrolment cannot recur.
- Study population characteristics: the image set is sourced from public dermatological atlases. The images represent real clinical cases but do not constitute primary-collected patient data on Fitzpatrick V and VI populations; real-world Fitzpatrick V and VI patient-outcome evidence is a PMCF commitment addressed at Clinical Evaluation Report level.
- Hawthorne effect: readers are aware that their responses are being recorded for research purposes, which may influence diagnostic behaviour in ways that differ from routine clinical practice. The within-subject paired design mitigates this concern because it affects both conditions equally.
- Fitzpatrick phototype scope: the image set is restricted to phototypes V and VI by design. Combined evidence on phototypes I–IV is provided by the source investigations (SAN_2024, BI_2024, PH_2024).
Ethical Aspects of Clinical Research
This study adhered to international Good Clinical Practice (GCP) guidelines, the Declaration of Helsinki in its latest amendment, and applicable international and national regulations. As applicable, approval from the relevant Ethics Committee was obtained prior to the initiation of the study. When applicable, modifications to the protocol were reviewed and approved by the Principal Investigator (PI) and subsequently evaluated by the Ethics Committee before subjects were enrolled under a modified protocol.
This study was conducted in compliance with European Regulation 2016/679, of 27 April, concerning the protection of natural persons with regard to the processing of personal data and the free movement of such data (General Data Protection Regulation, GDPR), and Organic Law 3/2018, of 5 December, on the Protection of Personal Data and the guarantee of digital rights. In accordance with these regulations, no data enabling the personal identification of participants was collected, and all information was managed securely in an encrypted format.
Participants were informed both orally and in writing about all relevant aspects of the study, with the information being tailored to their level of understanding. They were provided with a copy of the informed consent form and the accompanying patient information sheet. Participants were given adequate time to ask questions and fully comprehend the details of the study before providing their consent.
The PI was responsible for the preparation of the informed consent form, ensuring it included all elements required by the International Conference on Harmonisation (ICH), adhered to current regulatory guidelines, and complied with the ethical principles of GCP and the Declaration of Helsinki.
The original signed informed consent forms were securely stored in a restricted access area under the custody of the PI. These documents remained at the research site at all times. Participants were provided with a copy of their signed consent form for their records.
Data quality assurance
The Principal Investigator is responsible for reviewing and approving the protocol, signing the Principal Investigator commitment, guaranteeing that the persons involved in the centre will respect the confidentiality of patient information and protect personal data, and reviewing and approving the final study report together with the sponsor. All clinical members of the research team assess the eligibility of the patients in the study, inform them and request their written informed consent, collect the source data of the study in the clinical record, and transfer the data to the Data Collection Notebook (DCN) or Case Report Forms (CRF).
Investigators and Administrative Structure of Clinical Research
Brief Description
This investigation was conducted by the manufacturer, with Dr. Antonio Martorell Calatayud as Principal Investigator, together with the participating medical staff. The full collaborator roster is maintained by the Principal Investigator under restricted access and is available for audit on request.
Investigators
Principal investigator
- Dr. Antonio Martorell Calatayud
Collaborators
- Medical staff: anonymised reader codes are used throughout this report (R-01, R-02, …). The code-to-identity master list is maintained by the Principal Investigator.
- Manufacturer
- Mr. Alfonso Medela (Chief Scientific Officer, manufacturer)
- Mr. Taig Mac Carthy (Regulatory and Quality, manufacturer)
Investigator qualifications
All participating readers in the primary analysis cohort are healthcare professionals meeting the CIP §Inclusion criteria — board-certified dermatologists, dermatology residents, board-certified primary care / family and community medicine physicians, primary-care residents, or nurses with clinical responsibility for skin or wound assessment. During the onboarding flow each reader provided:
- A curriculum vitae or qualification summary.
- Evidence of board certification or current accredited training.
- A signed participation agreement.
- A conflict-of-interest declaration.
- A data-protection acknowledgement.
- A timestamped record confirming review of the study-information and device-description materials.
These onboarding records are retained by the manufacturer within the QMS under the Principal Investigator's custody and are available for audit on request.
External organisation
No external organisations, beyond those named above, contributed to this clinical investigation.
Sponsor and Monitor
The investigation is sponsored and monitored by the manufacturer.
Report Annexes
- Instructions For Use (IFU) are referenced in the Clinical Investigation Plan (R-TF-015-004).
- Reader onboarding documents (CVs, board-certification evidence, signed participation agreements) are retained by the manufacturer within the QMS and are available for audit on request.
- The de-identified dataset backing the numerical tables in this report is held by the manufacturer within the QMS and is available for audit on request.
Signature meaning
The signatures for the approval process of this document can be found in the verified commits of the QMS repository. For reference, the team members expected to participate in this document and their roles in the approval process, as defined in Annex I (Responsibility Matrix) of GP-001, are:
- Author: Team members involved
- Reviewer: JD-018 Clinical Research Coordinator
- Approver: JD-022 Medical Manager