R-TF-015-001 Clinical Evaluation Plan

Table of contents
  • Executive Summary
    • How to read this CEP
    • Document Overview
    • Key Evaluation Objectives
    • Clinical Development Status
    • Regulatory Pathway
  • Purpose
  • Scope of the clinical plan as part of the clinical evaluation
  • Clinical Evaluation Strategy
    • Overview of the Chosen Approach
    • Route A: Systematic Literature Review
      • Regulatory basis
    • Route B: Equivalence with the Legacy Device
      • Regulatory basis
      • Load-bearing nature of this route
      • Article 61(5)(b) access condition
      • MDCG 2020-5 three-characteristic demonstration
      • Summary of the difference set between the device and the legacy predecessor
      • Inline summary equivalence table
      • Consequence for the clinical evidence portfolio
    • Route C: Own Clinical Investigations
      • Regulatory basis
      • Portfolio status at the date of this CEP
    • Summary of the Combined Strategy
      • Rank-8 primary classification of R-TF-015-012, with supplementary Rank-4 case
  • References
  • Acronyms and definitions
    • Acronyms
    • Definitions
  • Responsibilities: competence of the clinical evaluation team
    • Individual-experience baseline under MEDDEV 2.7/1 Rev 4 §6.4
    • MEDDEV 2.7/1 Rev 4 §6.4 four-competence coverage
    • External methodological review
    • Subject Matter Expert coverage across the indication scope
  • Identification of relevant product requirements
    • Coverage of additional GSPRs requiring clinical data for an MDSW
  • Description
    • Device identification
    • Manufacturer identification
    • Contraindications and precautions required by the manufacturer
      • Contraindications
      • Precautions
    • Warnings
    • Undesirable effects
    • Intended clinical benefits
      • Clarification on "Multiple conditions"
      • Evaluation of the intended clinical benefits
      • Integrator integration requirements as risk controls
    • Device classification
    • Product category
    • Device variants and packaging
    • Previous version of the device
    • Components
    • Mode of action
    • Device lifecycle
    • Expected lifetime
    • Degree of Novelty
      • Clinical or surgical procedure novelty dimensions
      • Device-related novelty dimensions
      • Novelty conclusion
    • Clinical Performance Claims
      • Pooled performance-metric methodology
  • Risk management
    • Device-specific hazards applicable to the device
    • Risk mitigation measures
    • Risk Summary
      • Total identified risks
      • Risk mitigation effectiveness
    • Safety endpoints
      • Dual anchoring of safety acceptance criteria
    • Acceptability of the benefit-risk ratio
      • Benefit-risk determination methodology
  • State of the art
    • Scope
    • Literature search
      • Literature search protocol
    • Source of data and search description
      • Refresh cadence
      • Vigilance databases
      • Registries
    • Selection Methodology and Criteria
    • Literature appraisal data
      • Appraisal plan
      • Appraisal and weighting criteria for the State of the Art literature (CRIT1-7)
      • Inclusion threshold and rationale
  • Clinical Development Plan
    • Purpose
    • Phased progression of the clinical evaluation
    • Current State of the Evidence
      • Non-clinical test results: bench testing
      • Existing clinical data
      • Clinical evidence assessment strategy
    • Regulatory pathway for the device and Article 120 status of the legacy device
      • MDR Article 83 framing for the legacy predecessor's post-market observational study
    • Guidance framework applied
      • Evidence quality hierarchy (MDCG 2020-6 Appendix III)
      • Three-pillar evidence framework for MDSW (MDCG 2020-1)
      • Planned tiered evidence assessment
      • Planned paediatric subgroup analysis
      • Planned evidence classification per study
      • Summary distribution of planned evidence across Rank and Pillar
      • Published severity validation literature (MDCG 2020-1, Pillar 2)
      • Published clinical performance literature (MDCG 2020-1, Pillar 3)
      • Planned appraisal methodology for clinical investigations and published manuscripts
    • Confirmatory phase (Pivotal Investigations)
      • Study-level vs. device-level acceptance criteria: reconciliation
    • PMS aspects that need regular updating in the clinical evaluation report
    • Post-Market Clinical Follow-up (PMCF)
  • Clinical Evidence
    • Appendix F appraisal of the legacy predecessor vigilance denominator
  • Clinical Concerns
    • Identification and Evaluation Process
    • Current Status
    • Mechanism for Future Updates
  • CEP Completeness Verification
    • Conclusion
  • Annexes

Executive Summary

This Clinical Evaluation Plan (CEP) establishes the framework for evaluating the clinical safety, performance, and benefit-risk profile of the device under evaluation. The device is a Class IIb software-based medical device under MDR Rule 11 (Annex VIII). Applying MDCG 2019-11, the device's significance of information to a healthcare situation is "Drives clinical management" (IMDRF 5.1.2); its effect on patient health is "Critical" for melanoma (IMDRF 5.2.1) and "Serious" for the remaining 345 ICD-11 categories in the intended use (IMDRF 5.2.2); the governing IMDRF cell under Rule 11's highest-applicable-cell logic is Category III.i (Critical × Drives), which confirms the Class IIb classification. The device is intended to assist healthcare practitioners, as clinical decision support, in the assessment and severity characterisation of dermatological conditions; the healthcare practitioner retains responsibility for the clinical decision. The full MDCG 2019-11 classification analysis is documented in the R-TF Device Description and Specification record, subsection "Guideline MDCG 2019-11".
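The highest-applicable-cell reasoning above can be sketched in a few lines of Python. This is an illustration only: the lookup covers just the "drives clinical management" column of the MDCG 2019-11 Annex III matrix, the critical cell is taken from this CEP, and the other two cells are quoted from memory of the guidance and should be verified before reuse. The names `DRIVES_COLUMN`, `CLASS_ORDER` and `governing_cell` are hypothetical.

```python
# Illustrative subset of the MDCG 2019-11 Annex III matrix: only the
# "drives clinical management" column. The critical cell (III.i -> IIb) is
# stated in this CEP; the other two cells are assumptions to verify
# against the guidance itself.
DRIVES_COLUMN = {
    "critical": ("III.i", "IIb"),
    "serious": ("II.ii", "IIa"),
    "non_serious": ("I.iii", "IIa"),
}

# MDR classes from lowest to highest, used to pick the governing cell.
CLASS_ORDER = ["I", "IIa", "IIb", "III"]


def governing_cell(health_effects):
    """Return the (IMDRF category, MDR class) cell with the highest class."""
    cells = [DRIVES_COLUMN[effect] for effect in health_effects]
    return max(cells, key=lambda cell: CLASS_ORDER.index(cell[1]))


# Melanoma is "critical"; the remaining 345 ICD-11 categories are "serious".
print(governing_cell(["critical", "serious"]))  # ('III.i', 'IIb')
```

Because a single critical indication is enough to lift the governing cell, the result does not depend on how many serious categories the intended use contains.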

How to read this CEP

The document proceeds from claims to evidence to gap handling. The device makes three consolidated clinical-benefit claims — 7GH (improved diagnostic accuracy), 5RB (objective severity assessment) and 3KX (optimised referral and remote-care workflow) — each anchored to testable device-level acceptance criteria (see "Acceptance-criteria derivation"). Those acceptance criteria are satisfied through three evidence routes: Route A (systematic literature review, Pillar 1 VCA and state-of-the-art anchor), Route B (equivalence to the legacy predecessor device under MDCG 2020-5, including the legacy passive PMS corpus at Rank 7 and the protocolled post-market observational study R-TF-015-012 at Rank 8 primary with a supplementary Rank 4 case for quantitative endpoints), and Route C (own prospective and simulated-use clinical investigations, Pillar 3 Clinical Performance). Every data source in the evidence portfolio is mapped to the MDCG 2020-6 Appendix III rank hierarchy and to the MDCG 2020-1 three-pillar framework at the "Clinical Evidence" table. Pre-market evidence gaps — where they exist by indication or by feature — are addressed by the ten Post-Market Clinical Follow-up activities listed in "Post-Market Clinical Follow-up (PMCF)" and in R-TF-007-002. The CEP is refreshed annually in alignment with the PSUR cycle, with additional unscheduled updates triggered by Article 87 / 88 / 95 events, algorithmic-performance threshold breaches, and new state-of-the-art evidence (see "Mechanism for Future Updates").

The six-phase Clinical Development Plan diagram under "Phased progression of the clinical evaluation" is the single-view map of how Phases 0, 2, 3, 4, 5, and 6 combine to populate the evidence hierarchy.

Document Overview

  • Device: the device under evaluation (computational software-only medical device; see "Device identification" and "Intended Purpose" sections)
  • Classification: Class IIb per Rule 11, Annex VIII, MDR 2017/745
  • Regulatory Framework: EU Medical Device Regulation (MDR) 2017/745, MEDDEV 2.7/1 Rev.4
  • Scope: Stage 0 of the clinical evaluation process (scoping and planning)
  • Intended Purpose (MDR Article 2(12)): as declared in the Instructions for Use and rendered in full in the "Intended Purpose" subsection below; summarised here as a clinical decision support system to assist healthcare professionals in primary care and specialist dermatology in the assessment of dermatological conditions across the ICD-11 dermatological spectrum.
  • Intended users: healthcare professionals in primary care (PCPs) and specialist dermatology.
  • Intended patient populations: patients attending dermatological consultations across all age groups, skin phototypes (Fitzpatrick I–VI) and dermatological conditions within the 346 ICD-11 categorical set declared in the Intended Purpose.
  • Use environment: primary-care and specialist-dermatology clinical settings, including face-to-face and teledermatology workflows.
  • Intended clinical benefits: 7GH (improved diagnostic accuracy via the mandated Top-5 prioritised differential view, including malignancy surfacing), 5RB (objective, reproducible severity assessment) and 3KX (optimised referral, waiting-time and remote-care workflow) — each fully specified with testable acceptance criteria in the "Clinical Performance Claims" section.

Key Evaluation Objectives

  1. Safety Assurance: Demonstrate that residual risks are acceptable and managed through appropriate controls
  2. Performance Validation: Confirm the device achieves intended clinical performance metrics across all user tiers (PCPs and dermatologists)
  3. Benefit-Risk Assessment: Establish that clinical benefits outweigh identified residual risks
  4. Conformity with GSPR: Provide evidence of compliance with General Safety & Performance Requirements #1, #8 and #17 as the principal clinical-data-bearing requirements; cross-reference the device-level evidence for additional GSPRs (3, 4, 9, 18, 22, 23) that bear on a Class IIb MDSW
  5. Clinical Evidence Synthesis: Integrate pre-clinical testing, literature evidence, and clinical investigation data

Clinical Development Status

The pivotal investigation portfolio defined by this Plan comprises ten items plus one proof-of-concept pilot study: five prospective clinical investigations with a device algorithmically equivalent to the device under evaluation (frozen classifier architecture and weights; COVIDX_EVCDAO_2022, DAO_Derivación_O_2022, DAO_Derivación_PH_2022, IDEI_2023, MC_EVCDAO_2019), four multi-reader multi-case (MRMC) simulated-use reader studies (BI_2024, PH_2024, SAN_2024 — general dermatology; MAN_2025 — Fitzpatrick V-VI representativeness), and one retrospective manufacturer-authored peer-reviewed clinical-performance publication in an external specialist-clinic deployment context (NMSC_2025 — Medela et al., EAORL 2025; 135 patients; BCC / cSCC detection in a specialist head-and-neck clinic setting). The retrospective severity-assessment study AIHS4_2025 is carried as a proof-of-concept / pilot study rather than a pivotal contribution, given its very limited sample size (2 patients, 16 assessments); it contributes early-stage feasibility evidence for Benefit 5RB and is not counted in the ten pivotal items above. The legacy-device post-market observational study R-TF-015-012 enters this evaluation under Route B (equivalence) at Rank 8 per MDCG 2020-6 Appendix III (proactive PMS data); a supplementary case for Rank 4 classification under the Appendix III "high quality surveys may also fall into this category" note is presented as a secondary position in "Summary of the Combined Strategy". At the date of this CEP revision, all pre-market pivotal investigations listed in the Confirmatory Phase table have been executed under protocols consistent with this Plan; their resulting Clinical Investigation Reports contribute to the evidence base. 
Any PMCF-phase clinical investigations listed with status "Planned" that fall within the scope of MDR Article 2(45) are governed by this Plan for execution under ISO 14155:2020 as specified per investigation; Rank 11 simulated-use MRMC reader studies on retrospective anonymised images are outside the scope of MDR Article 2(45) and are governed by the CEP and by their respective CIPs without a dedicated ISO 14155:2020 regime. The cumulative portfolio addresses HS severity assessment, GPP diagnosis, malignancy detection (melanoma, BCC, cSCC), referral optimisation, remote care, rare disease recognition, darker-phototype coverage, and diagnostic-accuracy improvement with device assistance.
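The portfolio arithmetic in this subsection (five prospective investigations, four MRMC reader studies and one retrospective publication, with the pilot kept apart) can be checked with a small tally. The study identifiers are copied from the text; the dictionary layout and the function name `pivotal_count` are hypothetical and only serve the illustration.

```python
# Pivotal items as listed in "Clinical Development Status", grouped by design.
PIVOTAL_ITEMS = {
    "prospective_investigation": [
        "COVIDX_EVCDAO_2022",
        "DAO_Derivación_O_2022",
        "DAO_Derivación_PH_2022",
        "IDEI_2023",
        "MC_EVCDAO_2019",
    ],
    "mrmc_simulated_use": ["BI_2024", "PH_2024", "SAN_2024", "MAN_2025"],
    "retrospective_publication": ["NMSC_2025"],
}

# Proof-of-concept pilot, deliberately not counted among the pivotal items.
PILOT_ITEMS = ["AIHS4_2025"]


def pivotal_count(items):
    """Count pivotal items across all study designs."""
    return sum(len(study_ids) for study_ids in items.values())


print(pivotal_count(PIVOTAL_ITEMS))  # 10
```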

Regulatory Pathway

This CEP supports the first CE-marking submission of the device under the MDR, as a new device succeeding the legacy predecessor device (MDD Class I, CE-marked since 2020) which remains on the market under MDR Article 120(3) transition provisions. The primary clinical-evaluation guidance applied is MDCG 2020-1 (MDSW three-pillar framework); the equivalence demonstration to the legacy device follows MDCG 2020-5 and MDR Article 61(5)-(6). See section "Regulatory pathway for the device and Article 120 status of the legacy device" for the full pathway framing.

Purpose

Article 61(3) of the MDR 2017/745 states that a clinical evaluation must “follow a defined and methodologically sound procedure”, meaning that a Clinical Evaluation Plan needs to be established in advance and should define how the evaluation shall be conducted. MDR Annex XIV Part A provides further details on the requirements for the CEP.

This Clinical Evaluation Plan (CEP) is dedicated to Stage 0 of the clinical evaluation process and follows the requirements of the Medical Device Regulation (MDR) 2017/745 and the guideline on medical devices and clinical evaluation (MEDDEV 2.7/1 Rev 4).

The clinical data, studies and relevant observations to be considered when demonstrating conformity with General Safety & Performance Requirements #1, #8 and #17 of Regulation (EU) 2017/745 (the requirements that need support from clinical data) have been identified and evaluated according to:

  1. Requirements from Article 61 and Annex XIV from Regulation (EU) 2017/745.
  2. Recommendations from the MDCG guidelines MDCG 2020-13, MDCG 2020-1 and MDCG 2020-5.
  3. Recommendations from the MEDDEV 2.7/1 Rev.4 guideline.

The clinical evaluation that we develop and discuss through this Clinical Evaluation Plan (CEP) consolidates the evaluation of the collected clinical data related to the device under evaluation (hereinafter, "the device").

This Clinical Evaluation Plan has the following purposes:

  • Evaluating the conformity of the product towards the General Safety and Performance Requirements (GSPR) requiring clinical evidence from Regulation (EU) 2017/745.
  • Evaluating the clinical data compiled in order to demonstrate the valid clinical association, the technical performance, and the clinical performance of the MDSW.
  • Determining the acceptability of the benefit-risk ratio under the conditions of intended use and indications stated in the technical documentation.
  • Compiling all clinical data collected through clinical investigations, systematic bibliographic research of clinical literature, and information related to product risk management, with special emphasis on the identification and management of unknown risks and/or secondary effects.

This plan applies to the device. The device is classified as a Class IIb medical device. The legacy predecessor device has been commercialized since 2020 under the Medical Devices Directive (MDD) 93/42/EEC. The device is manufactured under a Conformity Assessment based on a Quality Management System in accordance with Chapter I of Annex IX of Regulation (EU) 2017/745 on medical devices.

This Clinical Evaluation Plan will be checked and, if necessary, updated at each milestone and/or review of the device's development and post-market lifecycle.

Scope of the clinical plan as part of the clinical evaluation

The clinical evaluation is based on a comprehensive analysis of available pre- and post-market clinical data that is relevant to the intended purpose of the device, including clinical performance data and clinical safety data.

There are discrete stages in performing a clinical evaluation:

  • Stage 0: Define the scope, plan the clinical evaluation (also referred to as scoping and the clinical evaluation plan).
  • Stage 1: Identify pertinent data.
  • Stage 2: Appraise each individual data set, in terms of its scientific validity, relevance and weighting.
  • Stage 3: Analyze the data, whereby conclusions are reached about
    • compliance with general safety and performance requirements, including its benefit/risk profile,
    • the contents of information materials supplied by the manufacturer (including the label, IFU of the device, available promotional materials, including accompanying documents possibly foreseen by the manufacturer),
    • residual risks and uncertainties or unanswered questions (including rare complications, long-term performance, safety under widespread use), whether these are acceptable for CE-marking, and whether they are required to be addressed during PMS.
  • Stage 4: Finalize the Clinical Evaluation Report.

The arrow from Stage 4 back to Stage 0 represents the continuous-evaluation feedback loop required by MEDDEV 2.7/1 Rev 4 §7 and MDR Article 61(11): each completed clinical evaluation cycle (Stage 4) feeds new clinical data — primarily from PMS / PMCF activities and from updates to the State of the Art — back into the scope of the next cycle (Stage 0). The loop drives both scheduled annual updates of the CER (aligned with the PSUR cycle per MDR Article 86) and trigger-based unscheduled updates (any serious incident, any algorithmic-performance threshold breach per R-TF-007-002, or any new SotA evidence with the potential to change the current evaluation). The PMS / PMCF feedback loop is documented in section "PMS aspects that need regular updating in the clinical evaluation report" of this CEP and in the Clinical Evaluation Report.
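The combination of scheduled and trigger-based updates described above can be sketched as a simple decision function. This is a minimal illustration, not the QMS procedure: the 365-day cadence is an assumed stand-in for "annual", and the names `CycleState` and `update_reasons` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class CycleState:
    """Inputs to the feedback loop for one review date (hypothetical model)."""
    last_cer_update: date
    serious_incident: bool = False
    performance_threshold_breach: bool = False
    new_sota_evidence: bool = False


def update_reasons(state, today, cadence_days=365):
    """List every reason why a CER update is due; empty list means none."""
    reasons = []
    # Scheduled update: assumed here to mean at least `cadence_days` elapsed.
    if today - state.last_cer_update >= timedelta(days=cadence_days):
        reasons.append("scheduled annual update")
    # Unscheduled triggers named in the text.
    if state.serious_incident:
        reasons.append("serious incident")
    if state.performance_threshold_breach:
        reasons.append("algorithmic-performance threshold breach")
    if state.new_sota_evidence:
        reasons.append("new SotA evidence")
    return reasons


state = CycleState(last_cer_update=date(2025, 1, 15), serious_incident=True)
print(update_reasons(state, today=date(2025, 3, 1)))  # ['serious incident']
```

Returning the list of reasons rather than a single boolean keeps the scheduled and unscheduled paths visible in the update record.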

Before a clinical evaluation is undertaken, the manufacturer should define its scope based on the General Safety and Performance Requirements that need to be addressed from a clinical perspective and the nature and history of the device.

The scope serves as a basis for further steps, including identifying pertinent data. The manufacturer sets up a description of the device under evaluation and a clinical evaluation plan.

Depending on the stage in the lifecycle of the product, considerations for setting up the Clinical Evaluation Plan should include the following different aspects:

  • The device description.
  • Whether any design features of the device, or any indications or target populations, require specific attention.
  • Information needed for evaluation of equivalence if equivalence may be claimed.
  • Information from similar devices on the market.
  • The risk management documents of the device.
  • The current knowledge / state of the art in the corresponding medical field.
  • Data source(s) and type(s) of data to be used in the clinical evaluation.
  • Whether the manufacturer has introduced / intends to introduce any relevant changes.
  • Whether any specific clinical concerns have newly emerged and need to be addressed.
  • PMS aspects: recalls of similar devices.
  • Needs for planning PMS activities: PMCF.

The scope of the CEP is limited to the device.

Clinical Evaluation Strategy

Overview of the Chosen Approach

In accordance with MDR Article 61(1), which establishes that the clinical evaluation shall be based on a sufficient amount of clinical data whose analysis allows an overall benefit-risk assessment and provides evidence of clinical safety and performance, we have adopted a combined clinical evaluation strategy. This approach draws on three complementary evidence routes, each grounded in specific provisions of MDR 2017/745 and its associated guidance.

The choice of a combined strategy is justified by the nature of the device, a Class IIb AI-based medical device software covering the full ICD-11 dermatological spectrum, and by the availability of relevant evidence from multiple sources. No single evidence route alone provides sufficient evidence to meet the clinical evaluation requirements for a Class IIb device of this breadth and complexity.

Route A: Systematic Literature Review

Regulatory basis

MDR Article 61(3)(a); Annex XIV Part A, Section 1(a).

A systematic review of the published scientific literature was conducted to serve four purposes:

  1. Establish the Valid Clinical Association (VCA) between the device's outputs (ICD-11 classifications, severity scores) and the corresponding clinical conditions, as required by MDCG 2020-1. VCA is established separately for each claimed output. Because the device covers 346 ICD-11 categories and seven epidemiological groups, VCA is structured as a grouped VCA matrix (rather than 346 individual VCAs): the matrix groups ICD-11 categories by the seven major epidemiological categories of dermatological disease (infectious 57%, other 19%, inflammatory 15%, malignant 5%, autoimmune 3%, genodermatoses 1%, vascular 1%); within each group the literature evidence supporting the visual-recognition validity of AI image analysis is appraised against the same CRIT1-7 framework. Severity-score outputs (PASI, UAS, IHS4, SCORAD, Ludwig and additional scales) carry an independent VCA per scale, supported by the peer-reviewed SotA literature that establishes each scale's clinical meaningfulness (interobserver reliability, responsiveness and validated link to treatment decisions). The Pillar 2 algorithm-validation publications (APASI_2025, AUAS_2023, AIHS4_2023, ASCORAD_2022) are a separate Pillar 2 contribution — peer-reviewed evidence that the device's algorithm reproduces those scales against expert consensus — and do not serve as the Pillar 1 VCA anchor. The grouped VCA matrix and the per-scale VCA appraisal are documented in R-TF-015-011 State of the Art, section "Valid Clinical Association by epidemiological group and severity scale".
  2. Define the state of the art (SotA) for dermatological diagnosis and assessment, providing the quantitative reference benchmarks against which the device's clinical performance is evaluated.
  3. Identify supporting evidence on similar technologies and assess whether published data on comparable AI-based tools contributes additional clinical backing.
  4. Establish the surrogate-to-patient-outcome link for the device's claimed clinical benefits — specifically: (i) literature anchoring improved HCP diagnostic accuracy to downstream patient-outcome improvement, (ii) literature anchoring reduced referral time to reduced disease-progression risk, and (iii) literature anchoring reduced unnecessary referrals to reduced healthcare-system harm. This sub-stream discharges the Class IIb indirect-benefit defence for a clinical-decision-support MDSW whose Pillar 3 evidence is surrogate-endpoint-based. The surrogate-endpoint literature review is documented in R-TF-015-011 as a distinct section titled "Surrogate endpoint validity — by benefit domain" alongside the per-condition grouped VCA matrix and the per-scale severity VCA. It is governed by a dedicated PICO (Patient: patients assessed with AI-based dermatology decision support; Intervention: improved HCP diagnostic accuracy / reduced referral delay / reduced unnecessary referrals; Comparator: the respective unaided HCP baselines reported in the same literature; Outcome: patient-level clinical outcomes including delayed-diagnosis rate, disease-progression rate, avoidable-biopsy rate, specialist-care-access time). Inclusion-exclusion and CRIT1-7 appraisal are calibrated to surrogate-to-outcome study designs (cohort, registry, modelled economic-outcome, health-services research) as well as to diagnostic-accuracy designs.

Scope of this sub-stream: the outcomes listed in the PICO above (delayed-diagnosis rate, avoidable-biopsy rate, disease-progression rate, specialist-care-access time) are the literature anchor outcomes for the surrogate-to-patient-benefit link; they are NOT Route C device-level endpoints. The device's Pillar 3 evidence in Route C measures surrogates — diagnostic accuracy at the Top-5 prioritised differential view, referral adequacy and waiting-time reduction — and the Pillar 1 literature sub-stream closes the surrogate-to-outcome gap by reference to the peer-reviewed corpus. Real-world patient-outcome claims (delayed-diagnosis rate, avoidable-biopsy rate in the device's deployment context) are confirmed post-market under the PMCF programme in R-TF-007-002, not measured as pre-market device-level endpoints.

The literature review concluded that existing published evidence supports the clinical background and validates the device's claimed outputs as clinically meaningful, but is insufficient on its own to demonstrate the clinical performance and safety of this specific device in its intended real-world clinical context. The literature therefore contributes primarily at evidence Ranks 6 and 7 of the MDCG 2020-6 Appendix III hierarchy. Full methodology and results are documented in R-TF-015-011 State of the Art.
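The grouped VCA structure described under the first purpose of Route A can be pictured as a small data model: one VCA unit per epidemiological group plus one per severity scale, instead of 346 individual units. The percentages and scale names are copied from the text; the variable and function names are hypothetical, and the scale list covers only the scales named explicitly (the text also mentions additional scales).

```python
# Percentages per epidemiological group, as quoted in the text.
GROUP_SHARE_PCT = {
    "infectious": 57,
    "other": 19,
    "inflammatory": 15,
    "malignant": 5,
    "autoimmune": 3,
    "genodermatoses": 1,
    "vascular": 1,
}

# Severity scales named explicitly in the text (not an exhaustive list).
SEVERITY_SCALES = ["PASI", "UAS", "IHS4", "SCORAD", "Ludwig"]


def vca_units():
    """One VCA appraisal unit per group and one per named severity scale."""
    groups = [("group", name) for name in GROUP_SHARE_PCT]
    scales = [("scale", name) for name in SEVERITY_SCALES]
    return groups + scales


print(len(vca_units()))               # 12
print(sum(GROUP_SHARE_PCT.values()))  # 101
```

The quoted shares add up to 101 rather than 100, which is presumably a rounding effect in the source figures.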

Route B: Equivalence with the Legacy Device

Regulatory basis

MDR Article 61(5)-(6); Annex XIV Part A, Section 3; MDCG 2020-5.

Load-bearing nature of this route

Under the regulatory pathway described above, legacy clinical data enters the device's clinical evaluation only through the MDCG 2020-5 equivalence framework and MDR Article 61(5)-(6). The equivalence demonstration is therefore not a formality — it is the structural carrier of the legacy pivotal investigations, the Rank 7 passive PMS corpus and the R-TF-015-012 post-market observational study into the device's evidence portfolio.

Article 61(5)(b) access condition

We manufacture both the device and the legacy predecessor and therefore have full access to the legacy predecessor's technical documentation, clinical investigation records, and post-market surveillance data. MDR Article 61(5)(b) is satisfied.

MDCG 2020-5 three-characteristic demonstration

A formal equivalence assessment has been conducted against the three characteristic families of MDCG 2020-5 §A2.1:

  1. Technical characteristics: same design, conditions of use, specifications and properties (including AI model architecture and weights, training data, output space, deployment method, operating environment, integration interfaces), same principles of operation and same critical performance requirements. Differences between the device and the legacy predecessor are enumerated into two buckets (see summary table below) with per-difference clinical-relevance assessment.
  2. Biological characteristics: not applicable — the device is a software-only medical device with no physical contact with the human body.
  3. Clinical characteristics: same clinical condition (incl. severity and stage), same site of use, same intended population, same kind of user, and similar critical clinical performance for the intended purpose; any residual differences assessed for clinical significance.

Summary of the difference set between the device and the legacy predecessor

All known differences between the device and the legacy predecessor are categorised into two buckets:

  • Bucket A (architectural and deployment changes with no clinical pathway): transition to a microservices architecture, HL7 FHIR interoperability, database encryption at rest, and logging/monitoring improvements. Covered by equivalence; supported by software-architecture, interoperability and cybersecurity V&V records (no change to clinical output).
  • Bucket B (algorithmically equivalent features): items 5 (ICD-11 classifier) and 8 (DIQA thresholds) are verified by bit-level engineering output-parity against the legacy predecessor's reference outputs; items 6 (malignancy-surfacing safety indicators) and 7 (clinical sign measurement models) are specification refinements that introduce no new capability type, covered by equivalence under MDCG 2020-5 §A2.1 on the basis of no clinical-performance divergence on the shared indication set (documented in V&V records and severity-validation publications).
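The output-parity idea behind Bucket B items 5 and 8 can be sketched as a digest comparison between stored reference outputs and the outputs of the device under evaluation. This is a minimal illustration under stated assumptions: it compares canonical JSON serialisations, which is a stand-in for whatever bit-level comparison the V&V records actually apply, and the function names and sample payloads are hypothetical.

```python
import hashlib
import json


def output_digest(output):
    """SHA-256 of a canonical JSON serialisation of one model output."""
    canonical = json.dumps(output, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def parity_failures(reference_outputs, candidate_outputs):
    """Return the sorted case IDs whose outputs are missing or differ."""
    failures = []
    for case_id, reference in reference_outputs.items():
        candidate = candidate_outputs.get(case_id)
        if candidate is None or output_digest(candidate) != output_digest(reference):
            failures.append(case_id)
    return sorted(failures)


# Hypothetical payloads: the field names and values are made up.
reference = {"case-001": {"top5": ["A", "B", "C", "D", "E"], "score": 0.125}}
candidate = {"case-001": {"score": 0.125, "top5": ["A", "B", "C", "D", "E"]}}
print(parity_failures(reference, candidate))  # []
```

Sorting the keys before hashing makes the check insensitive to field order while still flagging any change in a value.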

Inline summary equivalence table

The per-row detailed analysis is maintained in R-TF-015-003 under the section "Demonstration of equivalence — Per-change clinical-relevance assessment"; the summary table below provides an overview of the methodology applied.

| Characteristic family | Characteristic | Legacy predecessor value | Device value | Bucket | Clinical relevance | Evidence source |
| --- | --- | --- | --- | --- | --- | --- |
| Technical | AI model: ICD-11 classifier | Frozen architecture and weights | Same architecture and weights (frozen) | B | Not relevant (output-parity verified) | V&V records (software and AI) |
| Technical | Software architecture | Monolithic | Microservices | A | Not relevant (no clinical output change) | Software Architecture Description |
| Technical | Data-exchange format | Proprietary JSON | HL7 FHIR | A | Not relevant (format only; same payload) | Interoperability V&V |
| Technical | Encryption at rest | - | AES-256 | A | Not relevant (security, not clinical) | Cybersecurity records |
| Technical | Malignancy-surfacing safety indicators | Binary malignancy-surfacing safety indicators (same capability set as the device) | Six binary safety indicators (refinement and clarification of legacy indicators; no new capability type introduced) | B | Equivalent; refinement of capability present in legacy predecessor; covered by equivalence (MDCG 2020-5 §A2.1) | V&V records; risk controls (R-TF-013-002) |
| Technical | Clinical sign measurement models | Clinical sign quantification models (erythema, induration, scaling, and related clinical signs) as implemented in the legacy predecessor | Same clinical sign quantification models (refined); outputs used by clinicians to derive severity scores such as PASI, SCORAD, IHS4 and SALT | B | Equivalent; same measurement capability as legacy predecessor; covered by equivalence (MDCG 2020-5 §A2.1) | Per-model V&V records |
| Technical | DIQA thresholds | Thresholds as applied in the legacy predecessor | Thresholds re-applied unchanged | B | Equivalent; same thresholds as legacy predecessor; covered by equivalence (MDCG 2020-5 §A2.1) | DIQA V&V |
| Biological | All | N/A (software only) | N/A | - | - | Software-only justification statement |
| Clinical | Intended clinical condition | Visible dermatological conditions | Visible dermatological conditions | Same | Equivalent | Intended-Purpose section of R-TF Device Description and Specification |
| Clinical | Intended patient population | 346 ICD-11 dermatological categories (identical to the device), including malignant, rare, autoimmune, and genodermatoses subgroups, across Fitzpatrick I–VI phototypes | 346 ICD-11 dermatological categories, including malignant, rare, autoimmune, and genodermatoses subgroups, across Fitzpatrick I–VI phototypes | Same | Equivalent; same 346-category indication scope by design; no new indications introduced (MDCG 2020-5 §A2.1) | Intended-Purpose section of R-TF Device Description and Specification |
| Clinical | Intended user | HCPs (PCPs + dermatologists) | HCPs (PCPs + dermatologists) | Same | Equivalent | - |
| Clinical | Site of use | Primary care + dermatology | Primary care + dermatology | Same | Equivalent | - |
| Clinical | Critical clinical performance | Legacy per-metric clinical-performance figures as reported in the legacy umbrella PMS Report (R-TF-007-003) and the legacy post-market observational study (R-TF-015-012) | Device per-metric acceptance criteria declared in the "Clinical Performance Claims" section (7GH / 5RB / 3KX) with per-metric deltas against the legacy figures documented in R-TF-015-003 §"Per-change clinical-relevance assessment" | Assess | Non-inferiority / equivalence per metric | R-TF-015-003 §"Per-change clinical-relevance assessment" |

Consequence for the clinical evidence portfolio

The established equivalence (Buckets A and B) allows the legacy predecessor's PMS data — including over 250,000 clinical reports, 21 contracts, and four or more years of continuous commercial deployment — to be incorporated into this clinical evaluation per MDCG 2020-5 and Article 61(5)-(6). Route B therefore contributes: (a) the passive legacy-device PMS dataset (Rank 7 per MDCG 2020-6 Appendix III) — complaints, incidents, vigilance reports, and trend analyses consolidated in the legacy umbrella PMS Report (R-TF-007-003), which is the Report paired with the legacy umbrella PMS Plan (R-TF-007-005) — providing real-world safety confirmation; (b) the protocolled post-market observational study of the legacy predecessor (R-TF-015-012, a study-specific Protocol nested inside the legacy umbrella PMS Plan) under MDR Article 83 / Article 85 applicable via MDR Article 120(3), contributing quantitative clinical-performance outcomes; and (c) professional-opinion Likert data from the same study, contributing supporting evidence. The evidence-rank classification of R-TF-015-012 is discussed in the evidence-hierarchy section below.

Route C: Own Clinical Investigations

Regulatory basis

MDR Article 61(4); Articles 62-82; Annex XV.

The literature review identified a gap in direct clinical evidence for this specific device in a real-world clinical environment with its intended users. The legacy predecessor's MDD Class I evidence, while applicable through the established equivalence, does not on its own satisfy the higher evidence requirements for a Class IIb MDR device. Accordingly, this Clinical Evaluation Plan specifies a pivotal investigation portfolio designed to generate direct clinical evidence for the device under evaluation.

Portfolio status at the date of this CEP

A subset of the pivotal investigations planned in Route C was executed at earlier dates under protocols consistent with this Plan; the resulting Clinical Investigation Reports are incorporated into this evaluation as already-completed pivotal investigations contributing to the evidence base. The remaining investigations are defined below with their planned design, endpoints and acceptance criteria, and will be executed under the current Plan governance prior to or within the PMCF phase, as applicable. Each earlier Clinical Investigation Plan has been back-reviewed against the current revision of this CEP; protocol consistency is recorded per investigation in the CER (R-TF-015-003, section "Per-study evidence appraisal"), including any scope or endpoint delta relative to this Plan and its impact on the pivotal-set contribution.

Investigation status at the date of this CEP | Contribution to the evidence base
Already-completed pivotal investigations | Clinical Investigation Reports filed under R-TF-015-006; CIPs and protocols archived under R-TF-015-004; appraisal per this Plan. Applied to the device under MDR either directly (frozen version) or through MDCG 2020-5 equivalence to the legacy predecessor (MDR Article 61(5)-(6)).
Planned pivotal investigations (pre-market) | CIPs drafted or in preparation under R-TF-015-004; investigations executed prior to MDR certification where required by this Plan to meet Pillar 3 Clinical Performance sufficiency for specific indications or populations (see tiered evidence assessment and the acceptable-gap analyses).
PMCF-phase investigations | Captured under R-TF-007-002 Post-Market Clinical Follow-up (PMCF) Plan with pre-specified enrolment targets, thresholds and trigger conditions; used to confirm (not fill) the pre-market sufficiency determination per MDCG 2020-6 §6.3 and §6.4.

These investigations collectively provide:

  • Pillar 3 Clinical Performance evidence (Ranks 2 and 4, primary stream): prospective studies in real clinical settings with real patients, covering the primary care to dermatology referral pathway, remote monitoring, malignancy detection, and severity assessment across over 700 patients at 6 hospital sites in Spain.
  • Pillar 3 Clinical Performance evidence (Rank 11, supporting stream): MRMC simulated-use studies demonstrating that intended users (dermatologists, primary care physicians, nurses) achieve clinically relevant outputs when using the device on images representative of the intended patient population, across diverse HCP tiers and dermatological conditions. Classified under Pillar 3 per MDCG 2020-1 §4.4 at a lower evidence rank than the prospective real-patient studies.

Route C is a Pillar 3 Clinical Performance route. The Pillar 2 Technical Performance anchor for the three-pillar framework is discharged separately by two complementary sources outside Route C: (i) AI model verification-and-validation against the manufacturer's curated labelled image database covering the 346-ICD-11 stand-alone analytical output space (Phase 0, documented in R-TF-028-005), and (ii) the peer-reviewed severity-validation publications APASI_2025, AUAS_2023, AIHS4_2023 and ASCORAD_2022 (Phase 6), which validate the device's severity-scoring outputs against independent expert dermatologist consensus. The 346-ICD-11 stand-alone analytical claim is anchored by (i); the four severity-scale analytical claims are anchored by (i) and (ii) together.

All pivotal investigations — already completed or planned — are designed and conducted in accordance with ISO 14155:2020, registered in public databases (ClinicalTrials.gov and EMA RWD Catalogue), and approved by the relevant ethics committees (CEIm). Full details are provided in the Confirmatory Phase section and in the individual Clinical Investigation Plans (R-TF-015-004). Each investigation listed in the Confirmatory Phase table carries an explicit "Already completed" or "Planned" status marker at the date of this CEP.

Summary of the Combined Strategy

Evidence Route | Regulatory Basis | Contribution to Clinical Evaluation
A. Literature review | Art. 61(3)(a); Annex XIV Part A §1(a) | VCA establishment; SotA benchmarks; supporting evidence on similar technologies (Ranks 6-7)
B. Equivalence with legacy device | Art. 61(5)-(6); Annex XIV Part A §3; MDCG 2020-5; MDCG 2020-6 §6.2.2 | Legacy passive-PMS data (Rank 7); protocolled post-market observational study R-TF-015-012 classified at Rank 8 (proactive PMS data per MDCG 2020-6 Appendix III, applied to both the quantitative endpoints and the Likert professional-opinion items); supplementary case for Appendix III Rank 4 classification of the quantitative endpoints (see subsection below); technical and clinical continuity confirmation (Rank 5)
C. Own clinical investigations | Art. 61(4); Arts. 62-82; Annex XV | Primary Pillar 3 Clinical Performance (Ranks 2 and 4, prospective real-patient pivotal investigations); supporting Pillar 3 Clinical Performance (Rank 11, MRMC simulated-use under MDCG 2020-1 §4.4)

The combined strategy ensures that the clinical evidence is sufficient, in quality and quantity, to demonstrate conformity with GSPR 1, 8, and 17, as required by MDR Article 61(1) and MDCG 2020-6 §6.4. Sufficiency spans pre-market streams (Routes A and C) and post-market streams (Route B, including both passive surveillance at Rank 7 and the protocolled post-market observational study at Rank 8 per MDCG 2020-6 §6.2.2 and Appendix III). The execution of this strategy and its conclusions are presented in the Clinical Evaluation Report (R-TF-015-003).

Rank-8 primary classification of R-TF-015-012, with supplementary Rank-4 case

The primary classification of R-TF-015-012 (the protocolled legacy-device post-market observational study) is Rank 8 per MDCG 2020-6 Appendix III, applied to both the quantitative endpoints and the Likert professional-opinion items. Rank 8 ("proactive PMS data, such as that derived from surveys") is the conservative reading: under a strict reading of Appendix III, rank is a design-strength axis separate from methodological-quality appraisal, so the study's cross-sectional, physician-recall design is the decisive classification factor. The study is executed under the manufacturer's standing MDR Article 83 post-market surveillance obligation (applicable to the legacy predecessor via MDR Article 120(3)), and its physician-reported outcome design places it within the Appendix III Rank 8 category by default. We adopt Rank 8 as the primary classification to avoid contested classification at audit. The supplementary Rank 4 case set out below rests on methodological-quality features that Appendix III recognises as optionally elevating high-quality surveys, and the Pillar 3 sufficiency determination is unchanged under either classification (see the paragraph immediately below).

Pillar 3 sufficiency under the Rank 8 primary classification: the Pillar 3 Clinical Performance sufficiency determination of this Plan is closed by convergence of the Route C Rank 2 and Rank 4 prospective real-patient evidence with the Rank 11 MRMC supporting evidence (Pillar 3, MDCG 2020-1 §4.4). R-TF-015-012 at Rank 8 adds real-world post-market confirmation consistent with MDCG 2020-6 §6.2.2 but does not carry the primary Pillar 3 load. Route B's contribution to this Plan, under the Rank 8 primary classification, is therefore the Rank 7 passive PMS corpus plus the Rank 8 post-market observational study, both of which remain valid under MDCG 2020-6 §6.2.2 and MDR Article 120(3).

Case for a higher rank (Rank 4) for the R-TF-015-012 quantitative endpoints: as a supplementary position (not the primary classification), the quantitative endpoints of R-TF-015-012 may be considered for Rank 4 under the MDCG 2020-6 Appendix III note that "high quality surveys may also fall into this category." The study meets that high-quality-survey bar on the following methodological features, each of which is documented in the study protocol:

  • Prospective, protocol-driven design with pre-specified endpoints, MCIDs and SotA comparators.
  • Pre-specified statistical analysis plan, including Holm-Bonferroni multiple-comparison correction and a sensitivity analysis on the pre-specified thresholds.
  • Protocol-declared safety items (Section F, F1-F4) evaluated independently of the Likert professional-opinion items.
  • Independent physician-reported outcomes with protocol-defined inclusion criteria, participant bounds and data-quality exclusions.
  • Prior authorisation of the study by a scoping review conducted under the legacy predecessor's 2025-2026 surveillance cycle, as part of that cycle under MDR Article 83 / Article 120(3).
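
The Holm-Bonferroni correction named in the statistical analysis plan can be illustrated with a minimal Python sketch. It shows the generic step-down procedure only; the function name, the alpha of 0.05 and the example p-values are illustrative assumptions and are not taken from the R-TF-015-012 protocol.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Generic Holm-Bonferroni step-down procedure.

    Returns a list of booleans aligned with p_values:
    True where the corresponding null hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared
        # with alpha / (m - k + 1), which equals alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            # Step-down rule: stop at the first non-rejection.
            break
    return rejected


# Hypothetical p-values for four endpoints (illustration only).
example_p_values = [0.001, 0.04, 0.03, 0.20]
print(holm_bonferroni(example_p_values))
```

With these made-up inputs only the smallest p-value survives the correction, because the second-smallest (0.03) exceeds 0.05 / 3.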

References

Reference document code | Reference document description
MDR 2017/745 | Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices
MEDDEV 2.7/1 revision 4 | European Commission Guidelines on Medical Devices Clinical Evaluation
IMDRF/AE WG/N43FINAL:2020 | IMDRF terminologies for categorized Adverse Event Reporting (AER): terms, terminology structure and codes
MDCG 2023-3 | Questions and Answers on vigilance terms and concepts as outlined in the Regulation (EU) 2017/745 on medical devices (referenced from the PMS / PMCF activities of R-TF-007-001 and R-TF-007-002 for the alignment of vigilance terminology used in this CEP)
2023/C 163/06 | Commission Guidance on the content and structure of the summary of the clinical investigation report
MDCG 2020-10/1 Rev. 1 / MDCG 2020-10/2 Rev. 1 | Guidance on safety reporting in clinical investigations / Appendix: Clinical investigation summary safety report form
MDCG 2020-1 | Guidance on clinical evaluation (MDR) / Performance evaluation (IVDR) of medical device software
MDCG 2022-21 | Guidance on Periodic Safety Update Report (PSUR) according to Regulation (EU) 2017/745 (MDR)
MDCG 2020-6 | Regulation (EU) 2017/745: Clinical evidence needed for medical devices previously CE marked under Directives 93/42/EEC or 90/385/EEC
MDCG 2020-7 | Guidance on PMCF plan template
MDCG 2020-8 | Guidance on PMCF evaluation report template
IMDRF MDCE WG/N65FINAL:2021 | Post-Market Clinical Follow-Up Studies
MDCG 2020-13 | Clinical evaluation assessment report template
IMDRF MDCE WG/N56FINAL:2019 | Clinical evaluation
IMDRF MDCE WG/N55FINAL:2019 | Clinical evidence
ISO 13485:2016, Adm 11 | Quality Management Systems - Regulatory Requirements for Medical Devices
ISO 14971:2019 | Medical devices - Application of Risk Management to Medical Devices
ISO 14155:2020 | Clinical investigation of medical devices for human subjects - Good clinical practice
IEC 82304-1:2017 | Health software - Part 1: General requirements for product safety
UNE-EN 62304:2007/A1:2016 (EN 62304:2006/A1:2015) | Medical device software - Software life-cycle processes

Acronyms and definitions

Acronyms

Acronym | Definition
AI | Artificial Intelligence
AUC | Area Under the Curve (ROC)
BCC | Basal Cell Carcinoma
CEP | Clinical Evaluation Plan
CER | Clinical Evaluation Report
CIP | Clinical Investigation Plan
CIR | Clinical Investigation Report
cSCC | Cutaneous Squamous Cell Carcinoma
EU/EC | European Union / European Community
EMDN | European Medical Devices Nomenclature
FSCA | Field Safety Corrective Action
GSPR | General Safety and Performance Requirement
HCP | Healthcare Professional
ICC | Intraclass Correlation Coefficient
ICD | International Classification of Diseases
IFU | Instructions for Use
Kappa | Cohen's Kappa coefficient (agreement statistic)
MCID | Minimal Clinically Important Difference
MDD | Medical Devices Directive
MDR | Medical Device Regulation
MDSW | Medical Device Software
MRMC | Multi-Reader Multi-Case
NMSC | Non-Melanoma Skin Cancer
NPV | Negative Predictive Value
PCP | Primary Care Physician
PICO | Population / People / Patient / Problem, Interventions, Comparison and Outcome
PMS | Post-Market Surveillance
PMCF | Post-Market Clinical Follow Up
PPV | Positive Predictive Value
QMS | Quality Management System
RMAE | Relative Mean Absolute Error
ROC | Receiver Operating Characteristic (curve)
SaMD | Software as a Medical Device
SotA | State of the Art
SRN | Single Registration Number
UDI/DI | Unique Device Identification / Device Identifier
VCA | Valid Clinical Association
ViT | Vision Transformer

Definitions

This clinical evaluation uses conventional terms as defined in the reference texts. Terms that are particularly important in the context of clinical evaluation are defined below.

  • Benefit / Risk Determination: The analysis of all assessments of benefit and risk of possible relevance for the use of the device for the intended purpose, when used in accordance with the intended purpose given by the manufacturer (MDR, Article 2(24)).
  • Clinical Data: Information concerning safety or performance that is generated from the use of a device and is sourced from the following: clinical investigation(s) of the device concerned, clinical investigation(s) or other studies reported in scientific literature of a device for which equivalence to the device in question can be demonstrated, reports published in the peer-reviewed scientific literature on other clinical experience of either the device in question or a device for which equivalence to the device in question can be demonstrated, clinically relevant information coming from post-market surveillance, in particular, the post-market clinical follow-up (MDR).
  • Clinical Development Plan: A plan that indicates progression from exploratory investigations, such as first-in-man studies, feasibility and pilot studies, to confirmatory investigations, such as pivotal clinical investigations, and a PMCF for a device or product under evaluation (MDR Annex XIV PART A 1(a)).
  • Clinical Evaluation: A systematic and planned process to continuously generate, collect, analyze and assess the clinical data pertaining to a device in order to verify the safety and performance, including clinical benefits, of the device when used as intended by the manufacturer (MDR). A methodologically sound ongoing and continuous procedure to collect, appraise, and analyze clinical data pertaining to a medical device and to analyze whether there is sufficient clinical evidence to confirm compliance with relevant Essential Requirements or Principles for clinical safety and performance of that device when used in accordance with the manufacturer's instructions for use (MEDDEV 2.7/1).
  • Clinical Evidence: Clinical data and clinical evaluation results pertaining to a device of sufficient amount and quality to allow a qualified assessment of whether the device is safe and achieves the intended clinical benefit(s) when used as intended by the manufacturer (MDR).
  • Clinical Performance: Article 2 (52) MDR defines clinical performance as the ability of a device, resulting from any direct or indirect medical effects which stem from its technical or functional characteristics, including diagnostic characteristics, to achieve its intended purpose as claimed by the manufacturer, thereby leading to a clinical benefit for patients, when used as intended by the manufacturer (MDCG 2020-1).
  • Clinical Safety: Freedom from unacceptable clinical risks, when using the device per the manufacturer's Instructions for Use (MEDDEV 2.7/1). According to MDR, Article 62 (1), the clinical investigation shall be designed with the purpose of verifying the clinical safety of the device and to determine any undesirable side-effects, under normal conditions of use of the device, and assess whether they constitute acceptable risks when weighed against the benefits to be achieved by the device (MDR).
  • Intended Purpose: The use for which a device is intended according to the data supplied by the manufacturer on the label, in the instructions for use or in promotional or sales materials or statements and as specified by the manufacturer in the clinical evaluation (MDR).
  • Post-Market Clinical Follow-Up (PMCF) Plan: A PMCF plan shall specify the methods and procedures established by the manufacturer to proactively collect and evaluate clinical data from the use of a CE-marked medical device in or on humans, placed on the market or put into service within its intended purpose, as referred to in the relevant conformity assessment procedure. The PMCF plan aims to: confirm the safety (including the acceptability of identified risks, particularly residual risks) and performance, including the clinical benefit if applicable, of the device throughout its expected lifetime; identify previously unknown side effects and monitor the identified side effects and contraindications; identify and analyze emergent risks on the basis of factual evidence; ensure the continued acceptability of the benefit-risk ratio referred to in Sections 1 and 9 of Annex I of the MDR; and identify possible systematic misuse or off-label use of the device, with a view to verifying that the intended purpose is correct (MDR Annex XIV Part B).
  • PMCF Study: A study carried out following marketing authorization intended to answer specific questions (uncertainties) relating to safety, clinical performance and/or effectiveness of a device when used in accordance with its labelling (IMDRF MDCE WG/N65FINAL:2021).
  • Post-Market Surveillance (PMS): All activities carried out by manufacturers in cooperation with other economic operators to institute and keep up to date a systematic procedure to proactively collect and review experience gained from devices they place on the market, make available on the market or put into service for the purpose of identifying any need to immediately apply any necessary corrective or preventive actions (MDR).
  • Risk: Combination of the probability of occurrence of harm and the severity of that harm (MDR).
  • Risk Management: Systematic application of management policies, procedures and practices to the tasks of analyzing, evaluating, controlling and monitoring risk. Risk assessments should document intended as well as reasonably foreseeable misuse (ISO 14971).
  • State of the Art: Developed stage of current technical capability and/or accepted clinical practice in regard to products, processes and patient management, based on the relevant consolidated findings of science, technology and experience. Note: the STATE OF THE ART embodies what is currently and generally accepted as good practice in technology and medicine. The state of the art does not necessarily imply the most technologically advanced solution. The STATE OF THE ART described here is sometimes referred to as the "generally acknowledged STATE OF THE ART".
  • Safety confirmation: The evidentiary contribution, from any source in the clinical evaluation, that demonstrates the absence of unacceptable residual clinical risk and the acceptability of observed adverse-event and device-failure rates during intended use. It is the safety half of the benefit-risk determination required by MDR Article 61(1) and Annex I §§1, 3, 4 and 8; the safety appraisal dimension described in MDCG 2020-6 §§6.1 and 6.3; the safety endpoint appraisal described in MEDDEV 2.7/1 Rev 4 §A7.2 (clinical risks and undesirable side-effects) and §A7.4 (acceptability of undesirable side-effects); and the residual-risk evaluation and production / post-production information loop of ISO 14971:2019 §§7, 8 and 10. Post-market streams contributing to this dimension are additionally governed by MDR Articles 83, 86, 87 and 88. Safety confirmation is orthogonal to the three MDCG 2020-1 MDSW evidence pillars (Valid Clinical Association, Technical / Analytical Performance, Clinical Performance): those pillars address whether the device produces clinically meaningful outputs; safety confirmation addresses whether the device, in doing so, does not introduce unacceptable harm. A source contributes safety-confirmation evidence if and only if it (a) pre-specifies safety-relevant outcome collection — adverse events, device-related harm, usability-related incidents, residual-risk observations — and (b) reports those outcomes with denominators. Pure performance studies without pre-specified safety-data collection do not contribute safety-confirmation evidence; legacy passive post-market surveillance corpora and clinical-investigation sections that pre-specify safety endpoints do.
  • Technical Performance: Capability of an MDSW to accurately and reliably generate the intended technical/analytical output from the input data (MDCG 2020-1).
  • Valid Clinical Association: Means the association of an MDSW output with a clinical condition or physiological state (MDCG 2020-1).
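
The two-part condition in the "Safety confirmation" definition above can be read as a simple conjunction. The Python sketch below is illustrative only: the `EvidenceSource` type, its field names and the two example sources are hypothetical and do not describe any specific record in this evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSource:
    """Hypothetical description of one evidence source."""
    name: str
    prespecifies_safety_outcomes: bool  # condition (a) of the definition
    reports_with_denominators: bool     # condition (b) of the definition


def contributes_safety_confirmation(source: EvidenceSource) -> bool:
    # A source contributes if and only if both conditions hold.
    return (source.prespecifies_safety_outcomes
            and source.reports_with_denominators)


# Illustrative sources only; the flags are assumptions for the example.
pure_performance_study = EvidenceSource(
    "performance-only study", False, False)
investigation_with_safety_endpoints = EvidenceSource(
    "investigation with pre-specified safety endpoints", True, True)

for src in (pure_performance_study, investigation_with_safety_endpoints):
    print(src.name, "->", contributes_safety_confirmation(src))
```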

Responsibilities: competence of the clinical evaluation team

The clinical evaluation should be conducted by a suitably qualified individual or a team.

In accordance with Regulation (EU) 2017/745, Annex XIV, Part A, section 1(d), the following section provides information about the team responsible for the preparation of the Clinical Evaluation Report (CER), along with justification of their qualifications and suitability to conduct the clinical evaluation.

Name | Role in the Evaluation | Academic Degree | Relevant Experience | Justification of Suitability
Jordi Barrachina | Clinical Affairs Manager | PhD | 5 years | Experience in clinical research and the design of clinical validations for medical devices; familiar with literature review and medical writing under MDR regulatory requirements.
Ana Vidal | Independent Reviewer | MSc | 6 years | Her extensive background provides the necessary expertise, encompassing a deep understanding of the MDR regulatory framework, proficiency in clinical evaluation methodology (including systematic literature reviews and benefit-risk analysis), and the technical competence to assess Software as a Medical Device (SaMD).
Saray Ugidos | Quality/Regulatory Reviewer | MSc | 8 years | Over 8 years of experience in the medical device industry, with a strong understanding of regulatory requirements and clinical best practices, comprehensive knowledge of the product and its clinical context, relevant academic and regulatory training, and proven experience in drafting clinical evaluation documents in compliance with MDR.
Antonio Martorell | Subject Matter Expert (SME) | PhD, MD | 17 years | The SME is a board-certified dermatologist with 17 years of clinical experience, possessing extensive knowledge of the device's intended use and therapeutic context, along with strong expertise in evaluating clinical evidence, and a proven track record in contributing to clinical evaluations in accordance with MDR requirements.

All team members have been selected based on their academic background, clinical expertise, and regulatory experience relevant to the medical device under evaluation. Supporting evidence is provided in Annex I, including:

  • CVs of each team member.
  • Declarations of Potential Conflict of Interest.

Individual-experience baseline under MEDDEV 2.7/1 Rev 4 §6.4

MEDDEV 2.7/1 Rev 4 §6.4 states that clinical evaluators should, as a general rule, have five years of documented professional experience (or ten years if the evaluator does not hold a higher-education degree in the respective field). Every member of the clinical evaluation team meets this baseline: the Clinical Affairs Manager (PhD, 5 years), the Independent Reviewer (MSc, 6 years), the Quality / Regulatory Reviewer (MSc, 8 years) and the Subject Matter Expert (PhD MD, 17 years). The collective team substantially exceeds §6.4 expectations in both individual experience and cross-domain coverage.
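
As a minimal illustration of the §6.4 general rule, the Python sketch below checks each evaluator against the five-year (or ten-year) baseline. The roles and years come from the team table above; the `Evaluator` type, the `meets_baseline` helper, and the assumption that every listed degree counts as a higher-education degree in the respective field are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluator:
    role: str
    has_relevant_degree: bool  # assumption: listed degree is in the field
    years_experience: int


def meets_baseline(evaluator: Evaluator) -> bool:
    # General rule: 5 years with a relevant higher-education degree,
    # otherwise 10 years of documented professional experience.
    required_years = 5 if evaluator.has_relevant_degree else 10
    return evaluator.years_experience >= required_years


TEAM = [
    Evaluator("Clinical Affairs Manager", True, 5),
    Evaluator("Independent Reviewer", True, 6),
    Evaluator("Quality / Regulatory Reviewer", True, 8),
    Evaluator("Subject Matter Expert", True, 17),
]

baseline_results = {e.role: meets_baseline(e) for e in TEAM}
for role, ok in baseline_results.items():
    print(f"{role}: {'meets' if ok else 'does not meet'} the baseline")
```

The check is inclusive, so an evaluator with exactly five years and a relevant degree meets the baseline.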

MEDDEV 2.7/1 Rev 4 §6.4 four-competence coverage

The team collectively covers the four competence domains required by MEDDEV 2.7/1 Rev 4 §6.4 for clinical evaluators:

  1. Research methodology (clinical investigation design and biostatistics): covered by the Clinical Affairs Manager (PhD, 5 years), the Independent Reviewer (MSc, 6 years) and the Subject Matter Expert (PhD MD, 17 years).
  2. Information management (scientific background or librarianship qualification; database experience): covered by all four team members.
  3. Regulatory requirements: covered by the Quality / Regulatory Reviewer (MSc, 8 years), the Clinical Affairs Manager and the Independent Reviewer.
  4. Medical writing and clinical-data appraisal: covered by all four team members; the Subject Matter Expert provides the medical-knowledge anchor.

External methodological review

Independently of the named evaluation team, we engaged Horiana (Health Data Consulting) to perform an external methodological review of the Clinical Evaluation Plan, the Clinical Evaluation Report and the nine completed clinical investigations that were available at the 17 April 2026 review cut-off. The review was conducted by Céline Fabre (Head of Biostatistics, Horiana) together with Antoine Giraud, Coralie Cantarel and Fabienne Diaz (biostatisticians, Horiana), and delivered on 17 April 2026 as two formal Horiana deliverables — a methodological-expertise report and a complementary recommendations document, both at version V1.0 dated 2026-04-17 — which are retained in the Quality Management System. A follow-up alignment call was held on 20 April 2026 to walk through the deliverables, discuss the findings, and record the methodological positions the evaluation team elected to adopt. Two evidence sources finalised after the external review cut-off — the Fitzpatrick V–VI multi-reader multi-case reader study (MAN_2025, data lock 17 April 2026) and the legacy-device post-market observational study R-TF-015-012 — were not part of the external review sample; their appraisal and pillar mapping are performed by the named evaluation team.

The scope of the external review addressed the three-pillar framework of MDCG 2020-1 for Medical Device Software clinical evidence (Valid Clinical Association, Technical / Analytical Performance, Clinical Performance), the hierarchical ranking of clinical evidence under MDCG 2020-6 Appendix III, the surrogate-endpoint anchoring required for an indirect clinical-benefit demonstration in a Class IIb decision-support device, and the clinical-evaluation methodology and reporting expectations of MEDDEV 2.7/1 Rev 4. The external methodological review is a complementary input that informs the evaluation team's pillar mapping, evidence ranking and indirect-benefit causal-chain reasoning; it does not substitute for the named evaluation team's appraisal and conclusions, and it does not constitute case-level clinical adjudication. The competence footprint of the Horiana team is biostatistical and methodological; clinical-content appraisal (including Valid Clinical Association for dermatological conditions and the overall benefit-risk assessment) remains exclusively with the named evaluation team. The CVs and signed Declarations of Potential Conflict of Interest of the Horiana review team (Céline Fabre, Antoine Giraud, Coralie Cantarel and Fabienne Diaz) are provided in Annex I.

Subject Matter Expert coverage across the indication scope

The intended use spans 346 ICD-11 dermatological categories. Where the evaluation requires dermatology-subspecialty expertise outside general dermatology, the Subject Matter Expert (board-certified dermatologist, 17 years of clinical experience) anchors the benefit-risk assessment with structured indirect evidence appraised in this evaluation rather than delegating case-level adjudication to an unbounded external network. The subspecialty-coverage strategy and the records in which it is documented are summarised below:

Subspecialty coverage | Evidence relied upon | Documented in
Dermatopathology / malignancy | Biopsy-confirmed histopathology used as the reference standard in the pivotal malignancy investigation (MC_EVCDAO_2019) | Clinical Investigation Report R-TF-015-006 under MC_EVCDAO_2019
Severity-scale interpretation | Peer-reviewed severity-validation literature (APASI, AUAS, ASCORAD) and the in-house AIHS4 severity-scoring validation investigation | R-TF-015-011 State-of-the-Art severity-scale literature appraisal; Clinical Investigation Report R-TF-015-006 under AIHS4_CSP_2025
Autoimmune and genodermatoses | Structured literature review of image-based clinical recognition (22 load-bearing anchors CRIT1–7 ≥ 15/21) supporting the Pillar 1 Valid Clinical Association, and per-epidemiological-group ICD V&V measured on the device's stand-alone analytical output without a clinician in the loop (autoimmune AUC 0.948 with 95 % CI 0.941 – 0.954; genodermatoses AUC 0.905 with 95 % CI 0.886 – 0.924; both above the ≥ 0.80 acceptance criterion) supporting Pillar 2 Technical Performance | R-TF-015-011 State of the Art §Autoimmune and genodermatoses; R-TF-028-006 AI Release Report §Per-Epidemiological-Group Performance
Fitzpatrick V–VI phototype coverage | Multi-reader multi-case reader study dedicated to darker phototypes (MAN_2025) | Clinical Investigation Report R-TF-015-006 and Annex E R-TF-015-010 under MAN_2025
Legacy-device indication breadth (post-market observational) | Legacy-device cross-sectional real-world study (Rank 8 primary per MDCG 2020-6 Appendix III, with a supplementary Rank 4 case) | R-TF-015-012
Legacy-device indication breadth (passive PMS aggregate) | Multi-year passive post-market surveillance corpus of the legacy predecessor device | Legacy-device umbrella PMS Report R-TF-007-003
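
The comparison of the reported AUC values with the ≥ 0.80 acceptance criterion in the table above can be sketched in a few lines of Python. The AUC and confidence-interval figures are those quoted in the table; the `AucResult` type and the `assess` helper are hypothetical. The source does not state whether the criterion applies to the point estimate or to the lower confidence bound, so the sketch reports both.

```python
from typing import NamedTuple


class AucResult(NamedTuple):
    group: str
    auc: float
    ci_low: float   # lower bound of the 95 % CI
    ci_high: float  # upper bound of the 95 % CI


ACCEPTANCE_THRESHOLD = 0.80


def assess(result: AucResult, threshold: float = ACCEPTANCE_THRESHOLD) -> dict:
    # Report both readings of the criterion, since the applicable
    # one is not specified in the text (assumption flagged above).
    return {
        "point_estimate_meets": result.auc >= threshold,
        "ci_lower_bound_meets": result.ci_low >= threshold,
    }


REPORTED = [
    AucResult("autoimmune", 0.948, 0.941, 0.954),
    AucResult("genodermatoses", 0.905, 0.886, 0.924),
]

for r in REPORTED:
    print(r.group, assess(r))
```

Both groups satisfy the criterion under either reading, since even the lower confidence bounds (0.941 and 0.886) exceed 0.80.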

Where case-level subspecialty adjudication is required during a clinical investigation, external dermatologists are engaged on a study-specific basis under the third-party CRO governing that study, and their adjudication is recorded in the investigation records of the relevant study (see the corresponding Clinical Investigation Plan R-TF-015-004 and Clinical Investigation Report R-TF-015-006 per investigation). PMCF activities documented in R-TF-007-002 capture additional subspecialty-stratified data and inform subsequent CER updates.

Identification of relevant product requirements

To demonstrate the scientific validity of the data presented, a clear and methodical plan is needed for the identification, retrieval, appraisal and weighting of clinical data. The endpoints must be consistent with currently accepted scientific standards and must be justified according to the State of the Art (SotA).

Taking into account the regulatory status of the product under the scope of the clinical evaluation, the organization demonstrates the product's compliance with the applicable General Safety and Performance Requirements (GSPR) using clinical data that provide sufficient clinical evidence. The level of clinical evidence will be appropriate to the risk class, intended use and characteristics of the product under the scope.

According to the MDCG 2020-1 guideline, there are three key components to be taken into account when compiling clinical evidence:

  • Valid clinical association is understood as the extent to which the MDSW's output (e.g. concept, conclusion, calculations) based on the inputs and algorithms selected, is associated with the targeted physiological state or clinical condition. This association should be well-founded or clinically accepted. The valid clinical association of an MDSW should demonstrate that it corresponds to the clinical situation, condition, indication or parameter defined in the intended purpose of the MDSW.
  • Validation of the technical performance is the demonstration of the MDSW's ability to accurately, reliably and precisely generate the intended output, from the input data.
  • Validation of the clinical performance is the demonstration of an MDSW's ability to yield clinically relevant output in accordance with the intended purpose. Clinical relevance is a positive impact in any of the following forms:
    • On the health of an individual expressed in terms of measurable, patient-relevant clinical outcome(s), including outcome(s) related to diagnosis, prediction of risk, prediction of treatment response(s).
    • Related to its function, such as that of screening, monitoring, diagnosis or aid to diagnosis of patients.
    • On patient management or public health.

As a minimum, clinical data will be used to show that the device is safe, has an acceptable benefit/risk profile, performs as intended (GSPR 1), has an acceptable side-effect profile (GSPR 8), and meets the software-specific repeatability, reliability and performance requirements (GSPR 17), as outlined in Annex I of MDR 2017/745 and in the following table.

Coverage of additional GSPRs requiring clinical data for an MDSW

GSPRs 1, 8 and 17 are the principal MDR Annex I requirements that need clinical evidence for this device. Additional GSPRs that bear on a Class IIb decision-support MDSW are addressed as follows:

  • GSPR 3 (risk-management system): managed across the product lifecycle under R-TF-013-001 (Risk Management Plan), R-TF-013-002 (Risk Management File) and R-TF-013-003 (Risk Management Report); clinical data from this evaluation feeds the post-clinical-evaluation review of the risk-benefit profile per ISO 14971:2019 §10.
  • GSPR 4 (risk-control measures): verified through the safety-endpoint analyses defined in section "Safety endpoints" above and reported in the CER under both the RMF threshold and the SotA baseline.
  • GSPR 9 (devices with measuring function): addressed by the algorithm-validation evidence for the device's quantitative outputs (severity scores: APASI, AUAS, AIHS4, ASCORAD), appraised in the CER under MDCG 2020-1 Pillar 2 against expert dermatologist consensus on the corresponding clinical scales.
  • MDR Annex I §22 (devices intended by the manufacturer to be used by lay persons): not applicable — the device is intended for use by healthcare professionals only; lay-person use is outside the intended use and is excluded by the Intended Purpose and the IFU.
  • GSPR 18 (active devices and devices connected to them): addressed by the system-integration and interoperability V&V (HL7 FHIR conformance and the device's documented application-programming interface), with clinical relevance assessed through the studies that exercised the integrated workflow (DAO_Derivación_O_2022, DAO_Derivación_PH_2022).
  • GSPR 22 (protection against unauthorised access and cybersecurity): addressed under the Cybersecurity programme (GP-030, IEC 81001-5-1:2021) and verified by the cybersecurity V&V records; clinical relevance is implicit in the safety-endpoint analyses (no incident affecting clinical care over four or more years of continuous legacy-predecessor commercial deployment).
  • GSPR 23 (information supplied by the manufacturer, labelling and IFU): addressed by the IFU and labelling V&V records and by the summative usability evidence (R-TF-025-007).

| ID | Annex I Requirement | Justification |
| --- | --- | --- |
| 1 | Devices shall achieve the performance intended by their manufacturer and shall be designed and manufactured in such a way that, during normal conditions of use, they are suitable for their intended purpose. They shall be safe and effective and shall not compromise the clinical condition or the safety of patients, or the safety and health of users or, where applicable, other persons, provided that any risks which may be associated with their use constitute acceptable risks when weighed against the benefits to the patient and are compatible with a high level of protection of health and safety, taking into account the generally acknowledged state of the art. | According to Regulation (EU) 2017/745, the evaluation of the clinical performance and safety as well as the clinical benefit must be based on ‘clinical data’ and is required for all medical device classes; consequently, clinical investigations are essential to prove this requirement. |
| 8 | All known and foreseeable risks, and any undesirable side-effects, shall be minimized and be acceptable when weighed against the evaluated benefits to the patient and/or user arising from the achieved performance of the device during normal conditions of use. | According to Regulation (EU) 2017/745, the evaluation of the undesirable side-effects and of the acceptability of the benefit-risk ratio shall be based on clinical data providing sufficient clinical evidence. |
| 17 | (17.1) Devices that incorporate electronic programmable systems, including software, or software that are devices in themselves, shall be designed to ensure repeatability, reliability and performance in line with their intended use. In the event of a single fault condition, appropriate means shall be adopted to eliminate or reduce as far as possible consequent risks or impairment of performance. (17.2) For devices that incorporate software or for software that are devices in themselves, the software shall be developed and manufactured in accordance with the state of the art taking into account the principles of development life cycle, risk management, including information security, verification and validation. (17.3) Software referred to in this Section that is intended to be used in combination with mobile computing platforms shall be designed and manufactured taking into account the specific features of the mobile platform (e.g. size and contrast ratio of the screen) and the external factors related to their use (varying environment as regards level of light or noise). | 17.1: Clinical reliability and repeatability can only be demonstrated with clinical data. "Performance in line with their intended use" is, by definition, the validated clinical performance proven in a study. 17.2: According to Regulation (EU) 2017/745, the validation phase of the software development lifecycle must include clinical validation. The planned clinical investigation is the primary means of fulfilling this requirement. 17.3: This sub-point requires accounting for the specifics of mobile platforms (e.g., screen size, ambient light). A clinical investigation is the ultimate test to prove that the device's performance is maintained in these variable, real-world usage scenarios. |

Description

The description written in this section is detailed enough to allow for a valid evaluation of the state of compliance with GSPR, the retrieval of meaningful literature data, and the assessment of equivalence to other devices described in the scientific literature.

The device is a Class IIb medical device with Rule 11 applied in accordance with Annex VIII Chapter III of Regulation (EU) 2017/745. It is undergoing first CE-marking under the MDR as a new device succeeding the legacy predecessor device (CE-marked under MDD since 2020), which remains on the market under Article 120(3) transition provisions. The legacy predecessor has undergone no significant changes since MDD CE-marking and its continuous evaluation through post-market activities provides equivalent-device clinical data per MDR Article 61(5)-(6), leveraged in this evaluation under the MDCG 2020-5 equivalence framework.

Device identification

Information
Device name: Legit.Health Plus (hereinafter, the device)
Model and type: NA
Version: 1.1.0.0
Basic UDI-DI: 8437025550LegitCADx6X
Certificate number (if available): MDR 000000 (Pending)
EMDN code(s): Z12040192 (General medicine diagnosis and monitoring instruments - Medical device software)
GMDN code: 65975
EU MDR 2017/745: Class IIb
EU MDR Classification rule: Rule 11
Novel product (True/False): TRUE
Novel related clinical procedure (True/False): TRUE
SRN: ES-MF-000025345

Manufacturer identification

Manufacturer data
Legal manufacturer name: AI Labs Group S.L.
Address: Street Gran Vía 1, BAT Tower, 48001, Bilbao, Bizkaia (Spain)
SRN: ES-MF-000025345
Person responsible for regulatory compliance: Alfonso Medela, Saray Ugidos
E-mail: office@legit.health
Phone: +34 638127476
Trademark: Legit.Health
Authorized Representative: Not applicable (manufacturer is based in EU)

Intended use

The device is a computational software-only medical device leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures, enhancing efficiency and accuracy of care delivery, by providing:

  • an interpretative distribution representation of possible International Classification of Diseases (ICD) categories that might be represented in the pixels content of the image
  • quantifiable data on the intensity, count and extent of clinical signs such as erythema, desquamation, and induration, among others

Quantification of intensity, count and extent of visible clinical signs

The device provides quantifiable data on the intensity, count and extent of clinical signs such as erythema, desquamation, and induration, among others; including, but not limited to:

  • erythema,
  • desquamation,
  • induration,
  • crusting,
  • xerosis (dryness),
  • swelling (oedema),
  • oozing,
  • excoriation,
  • lichenification,
  • exudation,
  • wound depth,
  • wound border,
  • undermining,
  • hair loss,
  • necrotic tissue,
  • granulation tissue,
  • epithelialization,
  • nodule,
  • papule,
  • pustule,
  • cyst,
  • comedone,
  • abscess,
  • hive,
  • draining tunnel,
  • non-draining tunnel,
  • inflammatory lesion,
  • exposed wound, bone and/or adjacent tissues,
  • slough or biofilm,
  • maceration,
  • external material over the lesion,
  • hypopigmentation or depigmentation,
  • hyperpigmentation,
  • scar,
  • scab,
  • spot,
  • blister

Image-based recognition of visible ICD categories

The device is intended to provide an interpretative distribution representation of possible International Classification of Diseases (ICD) categories that might be represented in the pixels content of the image.

Device description

The device is a computational software-only medical device leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures. Its principal function is to provide a wide range of clinical data from the analyzed images to assist healthcare practitioners in their clinical evaluations and allow healthcare provider organisations to gather data and improve their workflows.

The generated data is intended to aid healthcare practitioners and organizations in their clinical decision-making process, thus enhancing the efficiency and accuracy of care delivery.

The device should never be used to confirm a clinical diagnosis. On the contrary, its result is one element of the overall clinical assessment. Indeed, the device is designed to be used when a healthcare practitioner chooses to obtain additional information to consider a decision.

Intended medical indication

The device is indicated for use on images of visible skin structure abnormalities to support the assessment of all diseases of the skin incorporating conditions affecting the epidermis, its appendages (hair, hair follicle, sebaceous glands, apocrine sweat gland apparatus, eccrine sweat gland apparatus and nails) and associated mucous membranes (conjunctival, oral and genital), the dermis, the cutaneous vasculature and the subcutaneous tissue (subcutis).

Intended patient population

The device is intended for use in adult and paediatric patients presenting with skin findings across Fitzpatrick phototypes I-VI, in primary care, general dermatology, and specialist referral settings.

Intended user

The medical device is intended for use by healthcare providers to aid in the assessment of skin structures.

User qualifications and competencies

This section outlines the qualifications and competencies required for users of the device to ensure its safe and effective use. It is assumed that all users already possess the baseline qualifications and competencies associated with their respective professional roles.

Healthcare professionals

No additional official qualifications are required for healthcare professionals (HCPs) to use the device. However, it is recommended that HCPs possess the following competencies to optimize device utilization:

  • Proficiency in capturing high-quality clinical images using smartphones or equivalent digital devices.
  • Basic understanding of the clinical context in which the device is applied.
  • Familiarity with interpreting digital health data as part of the clinical decision-making process.

The device may be used by any healthcare professional who, by virtue of their academic degree, professional license, or recognized qualification, is authorized to provide healthcare services. This includes, but is not limited to:

  • Medical Doctors (MD, MBBS, DO, Dr. med., or equivalent)
  • Registered Nurses (RN, BScN, MScN, Dipl. Pflegefachfrau/-mann, or equivalent)
  • Nurse Practitioners (NP, Advanced Nurse Practitioner, or equivalent)
  • Physician Assistants (PA, or equivalent roles such as Physician Associate in the UK/EU)
  • Dermatologists (board-certified, Facharzt für Dermatologie, or equivalent)
  • Other licensed or registered healthcare professionals as recognized by local, national, or European regulatory authorities

Each HCP must hold the academic title, degree, or professional registration that confers their status as a healthcare professional in their jurisdiction, whether in the United States, Europe, or other regions where the device is provided.

IT professionals

IT professionals are responsible for the technical integration, configuration, and maintenance of the medical device within the healthcare organization's information systems.

No specific official qualifications are mandated. Nevertheless, it is advisable that IT professionals involved in the deployment and support of the device have the following competencies:

  • Foundational knowledge of the HL7 FHIR (Fast Healthcare Interoperability Resources) standard and its application in healthcare data exchange.
  • Ability to interpret and manage the device's data outputs, including integration with electronic health record (EHR) systems.
  • Understanding of healthcare data privacy and security requirements relevant to medical device integration, including GDPR (Europe), HIPAA (US), and other applicable local regulations.
  • Experience with troubleshooting and supporting clinical software in a healthcare environment.
  • Familiarity with IT standards and best practices for healthcare, such as ISO/IEC 27001 (Information Security Management) and ISO 27799 (Health Informatics—Information Security Management in Health).

IT professionals may include, but are not limited to:

  • Health Informatics Specialists (MSc Health Informatics, or equivalent)
  • Clinical IT System Administrators
  • Healthcare Integration Engineers
  • IT Managers and Project Managers in healthcare settings
  • Software Engineers and Developers specializing in healthcare IT
  • Other IT professionals with relevant experience in healthcare environments, as recognized by local, national, or European authorities

Each IT professional should possess the relevant academic degree, professional certification, or demonstrable experience that qualifies them for their role in the healthcare organization, in accordance with the requirements of the United States, Europe, or other regions where the device is provided.

Use environment

The device is intended to be used in the setting of healthcare organisations and their IT departments, which commonly are situated inside hospitals or other clinical facilities.

The device is intended to be integrated into the healthcare organisation's system by IT professionals.

Operating principle

The device is a computational medical tool leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures.

Body structures

The device is intended for use on the epidermis, its appendages (hair, hair follicle, sebaceous glands, apocrine sweat gland apparatus, eccrine sweat gland apparatus and nails) and associated mucous membranes (conjunctival, oral and genital), the dermis, the cutaneous vasculature and the subcutaneous tissue (subcutis).

The device is intended for use on visible skin structures only. As such, it can only quantify clinical signs that are visible, and distribute the probabilities across ICD categories that are visible.

Explainability

For visual signs that can be quantified in terms of count and extent, the underlying models not only calculate a final value, such as the number of lesions, but also determine their locations within the image. Consequently, the output for these visual signs is accompanied by additional data, which varies depending on whether the quantification involves count or extent.

  • Count. When a visual sign is quantified by counting, the device generates bounding boxes for each detected entity. These bounding boxes are defined by their x and y coordinates, as well as their height and width in pixels.
  • Extent. When a visual sign is quantified by its extent, the device outputs a mask. This mask, which is the same size as the image, consists of 0's for pixels where the visual sign is absent and 1's for pixels where it is present.
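
As an illustration of how an extent mask of this kind might be consumed, the following minimal sketch computes the share of image pixels in which a sign is present. It assumes the mask has already been decoded into a list of rows of 0/1 integers; the function name extent_fraction and the 4x4 mask are hypothetical and are not part of the device's interface.

```python
from typing import List


def extent_fraction(mask: List[List[int]]) -> float:
    """Return the fraction of pixels flagged 1 in a binary mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    present = sum(sum(row) for row in mask)
    return present / total


# Hypothetical 4x4 mask, for illustration only (not device output).
example_mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

print(f"{extent_fraction(example_mask):.3f}")
```

Because the mask has the same size as the image, this fraction is relative to the whole frame rather than to the body area shown in it.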

The explainability output is returned under the explainabilityMedia key. Here is an example:

{
  "explainabilityMedia": {
    "explainabilityMedia": {
      "content": "base 64 image",
      "detections": [
        {
          "confidence": 98,
          "label": "nodule",
          "p1": {
            "x": 202,
            "y": 101
          },
          "p2": {
            "x": 252,
            "y": 154
          }
        },
        {
          "confidence": 92,
          "label": "pustule",
          "p1": {
            "x": 130,
            "y": 194
          },
          "p2": {
            "x": 179,
            "y": 245
          }
        }
      ]
    }
  }
}
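
The example payload gives each detection as two corner points rather than an explicit width and height. The sketch below shows one way to recover box sizes and per-label counts from it. It assumes that p1 is the top-left corner and p2 the bottom-right corner in pixel coordinates, which the example suggests but does not state; the function name box_sizes is illustrative.

```python
import json
from collections import Counter

# Payload copied from the example above.
payload = json.loads("""
{
  "explainabilityMedia": {
    "explainabilityMedia": {
      "content": "base 64 image",
      "detections": [
        {"confidence": 98, "label": "nodule",
         "p1": {"x": 202, "y": 101}, "p2": {"x": 252, "y": 154}},
        {"confidence": 92, "label": "pustule",
         "p1": {"x": 130, "y": 194}, "p2": {"x": 179, "y": 245}}
      ]
    }
  }
}
""")


def box_sizes(data):
    """Return (label, width, height) per detection.

    Assumption: p1 is the top-left and p2 the bottom-right corner.
    """
    media = data["explainabilityMedia"]["explainabilityMedia"]
    sizes = []
    for detection in media["detections"]:
        width = detection["p2"]["x"] - detection["p1"]["x"]
        height = detection["p2"]["y"] - detection["p1"]["y"]
        sizes.append((detection["label"], width, height))
    return sizes


sizes = box_sizes(payload)
counts = Counter(label for label, _, _ in sizes)
print(sizes)
print(dict(counts))
```

The per-label count is what a count-based quantification would report; the boxes are the accompanying localisation data.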

Contraindications and precautions required by the manufacturer

Contraindications

We advise against using the device on:

  • Skin structures located at a distance greater than 1 cm from the eye, beyond the optimal range for examination.
  • Skin areas that are obscured from view, situated within skin folds or concealed in other manners, making them inaccessible for camera examination.
  • Regions of the skin showcasing scars or fibrosis, indicative of past injuries or trauma.
  • Skin structures exhibiting extensive damage, characterized by severe ulcerations or active bleeding.
  • Skin structures contaminated with foreign substances, including but not limited to tattoos and creams.
  • Skin structures situated at anatomically special sites, such as underneath the nails, requiring special attention.
  • Portions of skin that are densely covered with hair, potentially obstructing the view and hindering examination.

Precautions

To use the device safely, please consider the following precautions:

  • The device must always be used by an HCP, who should confirm or validate the output of the device considering the medical history of the patient, and other possible symptoms they could be suffering, especially those that are not visible or have not been supplied to the device.
  • The device must be used according to its intended use.
  • Before using the device, please read the Instructions for Use.

Warnings

In the event of observed incorrect operation of the device, users must notify the manufacturer as soon as possible through the support channel indicated in the IFU. Any serious incident must be reported to the manufacturer and to the national competent authority of the country where the incident occurred, in accordance with MDR Article 87.

Undesirable effects

Any undesirable side-effect should constitute an acceptable risk when weighed against the performances intended.

No undesirable side-effects specifically related to the use of the software are known or foreseen.

Intended clinical benefits

The device provides clinical benefits by enhancing the precision and efficiency of dermatological assessments through advanced image analysis of visible skin structures. By quantifying the intensity, count, and extent of clinical signs, it offers detailed and consistent data, which aids healthcare providers in evaluating a wide range of skin conditions across the epidermis, appendages, associated mucous membranes, dermis, cutaneous vasculature, and subcutis.

Additionally, it interprets and maps possible ICD classifications, streamlining clinical workflows and supporting standardised assessments for more accurate, evidence-based patient care. Taken together, these capabilities lower waiting times for specialist consultation and reduce time to diagnosis by facilitating access to structured clinical information; each of these outcomes is defined as an operational acceptance criterion in the benefits table below (see benefit 3KX) and evaluated against pre-specified SotA-anchored thresholds.

Clarification on "Multiple conditions"

Throughout the clinical evaluation plan, the indication label "Multiple conditions" is utilised in several performance claims. The label does not refer to an unspecified group of diseases. It refers to the subset of ICD-11 categories represented in the Pillar 3 pivotal investigations, evaluated on the device's mandated Top-5 prioritised differential view. This is distinct from the 346-ICD-11-category scope of the Pillar 2 stand-alone analytical claim, which is measured on curated input-output pairs without a clinician in the loop and is evidenced by the AI model verification-and-validation in R-TF-028-005.

The per-study ICD-11 category composition for each investigation that reports under "Multiple conditions" is documented in the CER (R-TF-015-003), section "Per-study case-mix and ICD-11 representation", which gives for each study the list of ICD-11 categories represented, the number of cases per category, and the corresponding Top-5 / Top-3 / Top-1 performance at the clinician-decision point. The term "Multiple conditions" therefore denotes a well-defined, study-specific subset of the 346-ICD-11 analytical scope.
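
To make the Top-5 / Top-3 / Top-1 terminology concrete, the following minimal sketch shows how Top-k accuracy can be computed from a probability distribution over categories and a reference diagnosis. The category labels, probabilities and function names are hypothetical placeholders, not device outputs or study data, and the per-study statistical methods remain those documented in the CER.

```python
from typing import Dict, List, Tuple


def top_k(distribution: Dict[str, float], k: int) -> List[str]:
    """Return the k categories with the highest probability."""
    ranked = sorted(distribution.items(), key=lambda item: item[1], reverse=True)
    return [category for category, _ in ranked[:k]]


def top_k_accuracy(cases: List[Tuple[Dict[str, float], str]], k: int) -> float:
    """Share of cases whose reference category is among the top k."""
    if not cases:
        raise ValueError("no cases supplied")
    hits = sum(1 for dist, reference in cases if reference in top_k(dist, k))
    return hits / len(cases)


# Hypothetical cases: (distribution over placeholder categories, reference).
CASES = [
    ({"A": 0.50, "B": 0.20, "C": 0.15, "D": 0.10, "E": 0.03, "F": 0.02}, "A"),
    ({"A": 0.12, "B": 0.40, "C": 0.30, "D": 0.08, "E": 0.06, "F": 0.04}, "C"),
    ({"A": 0.30, "B": 0.25, "C": 0.20, "D": 0.15, "E": 0.07, "F": 0.03}, "F"),
    ({"A": 0.35, "B": 0.25, "C": 0.20, "D": 0.10, "E": 0.06, "F": 0.04}, "E"),
]

for k in (1, 3, 5):
    print(k, top_k_accuracy(CASES, k))
```

Top-k accuracy can only stay equal or rise as k grows, which is why a Top-5 figure is not comparable with a Top-1 figure from another study.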

The following subsections define, for each claimed clinical benefit, the intended clinical outcome, the measurable acceptance criteria, and the clinical investigation evidence planned or available to demonstrate achievement. Each acceptance criterion has been systematically derived from the state-of-the-art (SotA) literature as documented in the "Acceptance Criteria Derivation from State of the Art" section of R-TF-015-003 (Clinical Evaluation Report) and in "Methodology for Establishing Acceptance Criteria" in R-TF-015-011 (State of the Art). Detailed analytical links between each criterion and the specific SotA articles, baselines, and statistical methodology are provided in those documents.

Worked example: derivation of the malignancy detection acceptance criterion

To make the derivation transparent at the CEP level, the following worked example walks through one acceptance criterion in full. The remaining criteria follow the same three-stage workflow (extraction → synthesis → margin) and are documented per criterion in the CER's per-domain margin table.

  • Domain: Benefit 7GH, sub-criterion (c) malignant lesions — multiple malignant conditions, pooled AUC.
  • Stage 1 (Extraction): From the SotA corpus appraised in R-TF-015-011, a set of primary multi-malignancy AI-assisted detection studies is extracted (Maron 2019, Han 2020, Ahadi 2021, Tepedino 2024, Tschandl 2019, plus supplementary non-specialist benchmarks Jones 2022, Jaklitsch 2023, and Ferris 2025). Each study contributes Sensitivity, Specificity and AUC at the operating point closest to the device's intended use (decision-support; primary-care-and-dermatology setting).
  • Stage 2 (Synthesis): A meta-analytic weighted average is computed across the extracted studies, weighted by inverse variance: SotA AUC baseline = 0.778 (95% confidence interval 0.74-0.80); SotA Sensitivity = 0.76 (95% CI 0.70-0.82); SotA Specificity = 0.79 (95% CI 0.71-0.85). The full meta-analytic computation is documented in R-TF-015-011.
  • Stage 3 (Margin): A clinical-significance margin is added to the synthesised baseline. For multi-malignancy detection in the device's primary-care intended-use setting, the margin is +12.2 percentage points on AUC (target AUC ≥ 0.90). This margin is calibrated against the upper end of supplementary non-specialist BCC/SCC benchmarks (mean AUC 87.5-92.3% per Jones 2022 and Ferris 2025) — the device is required to perform at the top of the published primary-care AI-assisted range. The margin reflects the safety-critical nature of multi-malignancy detection in the primary-care setting (false negatives carry the worst clinical consequences). Additional per-metric margins (Sensitivity ≥ 0.79, Specificity ≥ 0.87) are derived analogously and are documented in the CER's per-domain margin table.
  • Result: The final acceptance criterion for Benefit 7GH (c) malignant lesions, multi-malignancy pooled, is AUC ≥ 0.90, Sensitivity ≥ 0.79, Specificity ≥ 0.87 — superior to the SotA-meta-analytic baseline by the documented margins.
  • Pre-specification: The acceptance criteria above were established in advance of analysis, in alignment with the literature corpus, and were not adjusted post-hoc against observed device performance. The pre-specification is documented in this CEP and in R-TF-015-011.
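
The arithmetic of Stages 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation documented in R-TF-015-011: it uses a fixed-effect inverse-variance weighted mean, the per-study estimates and standard errors are hypothetical placeholders, and only the 0.778 baseline and the +12.2 percentage-point margin are taken from the worked example.

```python
from typing import List, Tuple


def inverse_variance_mean(estimates: List[float],
                          standard_errors: List[float]) -> Tuple[float, float]:
    """Fixed-effect pooled estimate and its standard error."""
    if len(estimates) != len(standard_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se


def acceptance_threshold(baseline: float, margin_pp: float) -> float:
    """Add a margin in percentage points to a baseline on the 0-1 scale."""
    return baseline + margin_pp / 100.0


# Stage 2 with hypothetical inputs (not the studies cited above).
pooled, pooled_se = inverse_variance_mean([0.70, 0.80], [0.02, 0.02])
print(round(pooled, 3), round(pooled_se, 4))

# Stage 3 with the values stated in the worked example.
target_auc = acceptance_threshold(0.778, 12.2)
print(round(target_auc, 3))
```

Studies with smaller standard errors receive larger weights, so the pooled baseline is pulled towards the more precise estimates.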

Acceptance-criteria derivation: per-benefit SotA anchors and margins

Every acceptance criterion listed in the Clinical Benefits table below has been derived using the same three-stage workflow (Extraction → Synthesis → Margin) illustrated in the worked example above. The table below states, for each criterion, (i) the SotA baseline derived from the primary literature corpus appraised in R-TF-015-011, (ii) the clinical-significance margin added on top of the baseline, (iii) the resulting device-level acceptance criterion, and (iv) the pillar the criterion addresses under the MDCG 2020-1 three-pillar framework. A single numeric threshold never simultaneously addresses both Pillar 2 (the device's stand-alone analytical output target, measured without a clinician in the loop) and Pillar 3 (the clinician + device target measured on the Top-5 prioritised differential view); the Pillar column states which of these two target types each row addresses.

| Benefit | Sub-criterion | SotA baseline (literature anchor) | Margin added | Device-level acceptance criterion | Pillar the criterion addresses |
| --- | --- | --- | --- | --- | --- |
| 7GH(a) | General conditions: all HCP tiers | Unaided HCP diagnostic-accuracy baseline ≈ 56% (Tschandl 2019; Brinker 2019; Haenssle 2018) across mixed ICD-11 dermatological presentations | +15 pp (Ferri et al. 2020 reference improvement) | Accuracy ≥71% (improvement ≥15 pp); sensitivity ≥74% (improvement ≥18 pp); specificity ≥78% (improvement ≥19 pp) | Pillar 3 (clinician + device on Top-5 prioritised differential; measured on real-patient prospective investigations DAO_Derivación_PH_2022, BI_2024, PH_2024, SAN_2024; supported by MAN_2025 Rank 11) |
| 7GH(a) | General conditions: primary care | Unaided PCP diagnostic-accuracy baseline ≈ 53% (Warshaw 2011; Kroemer 2011) | +18 pp | Accuracy ≥71% (improvement ≥18 pp); sensitivity ≥74% (improvement ≥19 pp); specificity ≥80% (improvement ≥20 pp) | Pillar 3 (clinician + device on Top-5; PCP stratum) |
| 7GH(a) | General conditions: dermatologists | Unaided dermatologist diagnostic-accuracy baseline ≈ 64% (Haenssle 2018; Tschandl 2019 ISIC) | +9 pp | Accuracy ≥73% (improvement ≥9 pp); sensitivity ≥76% (improvement ≥10 pp); specificity ≥82% (improvement ≥10 pp) | Pillar 3 (clinician + device on Top-5; dermatologist stratum) |
| 7GH(b) | Rare diseases: all HCP tiers | Unaided HCP diagnostic accuracy for rare dermatoses ≈ 28% (Marks 2018; Fuertes 2023 GPP primary-care series); rare-disease misdiagnosis literature | +26 pp | Accuracy ≥54% (improvement ≥26 pp); sensitivity ≥41% (improvement ≥21 pp); specificity ≥60% (improvement ≥22 pp) | Pillar 3 (clinician + device on Top-5 for rare-disease subgroup; BI_2024 primary, PH_2024 supporting) |
| 7GH(c) | Malignant lesions: stand-alone analytical AUC | Meta-analytic AUC 0.778 (95% CI 0.74–0.80) over Maron 2019, Han 2020, Ahadi 2021, Tepedino 2024, Tschandl 2019, plus Jones 2022, Jaklitsch 2023, Ferris 2025 | +12.2 pp AUC | AUC ≥0.90 against histopathology ground truth on curated input-output pairs measured on the device's stand-alone output without a clinician in the loop | Pillar 2 (device's stand-alone analytical claim across the 346 ICD-11 analytical output space, evidenced by the AI model verification-and-validation in R-TF-028-005 measured on curated input-output pairs without a clinician in the loop; NMSC_2025 is not cited in this Pillar 2 row; its contribution is to the separate Pillar 3 clinician+device decision-point row below) |
| 7GH(c) | Malignant lesions: clinician+device decision point | SotA comparators at the clinician+device decision point on the Top-5 prioritised differential view and malignancy-prioritisation gauge | +12.2 pp AUC (device-level pooled); per-metric margins at the decision point | Sensitivity ≥0.79; specificity ≥0.87; PPV/NPV as specified for the malignancy decision at the mandated user-interface surface | Pillar 3 (clinician + device on Top-5 prioritised differential view / malignancy-prioritisation gauge; measured on MC_EVCDAO_2019, IDEI_2023 malignant arm, NMSC_2025 Pillar 3 component) |
| 7GH(c) | Melanoma detection: study-specific | Study-level pre-specified AUC ≥0.80 (MC_EVCDAO_2019 primary endpoint; derived from Esteva 2017 and Haenssle 2018) | Separate margin for device-level pooled target (+1 pp) | Melanoma: AUC ≥0.81 (device-level pooled); study pre-specified MC_EVCDAO_2019 AUC ≥0.80; sensitivity ≥93%; specificity ≥80% | Pillar 3 (real-patient prospective; histopathology reference standard) |
| 5RB | Severity: HS ICC | Published HS interobserver ICC 0.41–0.68 (Zouboulis 2017; Hessam 2018 IHS4 validation) | Calibrated to top of published human interobserver band | ICC ≥0.7270 (gold-standard IHS4 agreement) | Pillar 2 (device's stand-alone analytical output vs. expert consensus; measured without a clinician in the loop; AIHS4_2023 and the pilot-feasibility AIHS4_2025). Pre-market Pillar 3 Clinical Performance confirmation of HS severity in the clinician-in-the-loop workflow is declared an acceptable evidence gap under MDCG 2020-6 §6.5(e), with pre-specified discharge via PMCF Activity B.1 in R-TF-007-002 (pre-specified enrolment triggers, ICC acceptance threshold and re-opening condition; §6.5(e) declaration restated in the CER). |
| 5RB | Severity: PASI/UAS/SCORAD RMAE | Published intra-observer and inter-rater variability 10–18% RMAE across PASI, UAS, SCORAD | Calibrated to lower bound of published variability | RMAE ≤15% for PASI, UAS, SCORAD | Pillar 2 (device's stand-alone analytical output vs. expert consensus; APASI_2025, AUAS_2023, ASCORAD_2022). Pre-market Pillar 3 Clinical Performance confirmation of PASI/UAS/SCORAD severity in the clinician-in-the-loop workflow is declared an acceptable evidence gap under MDCG 2020-6 §6.5(e), with pre-specified discharge via COVIDX_EVCDAO_2022 (longitudinal monitoring contributing Pillar 3 supporting evidence) and PMCF Activities B.2–B.4 in R-TF-007-002 (pre-specified enrolment triggers, RMAE acceptance threshold and re-opening condition; §6.5(e) declaration restated in the CER). |
| 5RB | Severity: androgenetic alopecia kappa | Published androgenetic-alopecia scoring interrater kappa 0.35–0.55 (Ludwig 1977 scale; Olsen 2005) | +0.05–0.10 over upper bound of published interrater | Unweighted kappa ≥0.60; correlation ≥0.65 | Pillar 3 (clinician-in-the-loop agreement at the severity-output user-interface surface; IDEI_2023 androgenetic-alopecia prospective endpoint). Pillar 2 algorithm-vs-expert agreement for the androgenetic-alopecia algorithm is evidenced separately in R-TF-028-005 and is not pooled with this Pillar 3 row. PMCF Activity B.5 in R-TF-007-002 confirms and strengthens the pre-market Pillar 3 determination in real-world deployment. |
| 3KX(a) | Waiting times: cumulative reduction | Published dermatology waiting-time medians 45–120 days (Eedy 2007; NHS data) | ≥50% reduction in cumulative waiting time | ≥50% reduction in cumulative waiting time for specialist dermatological care | Pillar 3 (clinician-in-the-loop workflow; DAO_Derivación_O_2022, DAO_Derivación_PH_2022) |
| 3KX(b) | Referral adequacy | Unaided PCP referral appropriateness ≈ 55% (Warshaw 2011 teledermatology; Kroemer 2011) | +30% reduction in unnecessary referrals | Reduction of unnecessary referrals ≥30%; sensitivity ≥70%; specificity ≥65% (identifying necessary referrals) | Pillar 3 (clinician-in-the-loop workflow; DAO_Derivación_O_2022) |
3KX(c)Remote carePublished teledermatology handled-remotely fraction 40–55% (Yim 2015; Lee 2018)Calibrated to top of published bandAt least 55% of patients handled remotely; ≥30% sensitivity improvement identifying necessary referrals remotely; ≥65% specificity identifying unnecessary referrals remotelyPillar 3 (clinician-in-the-loop workflow; teleconsultation arm of DAO_Derivación_O_2022)

The detailed per-article meta-analytic derivations, inverse-variance weighting, and margin-selection rationale for every row above are documented in R-TF-015-003 (Clinical Evaluation Report) section "Acceptance-criteria derivation from the state of the art" and in R-TF-015-011 section "Methodology for establishing acceptance criteria".
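
The arithmetic behind a row of the table above can be sketched as follows. This is a minimal illustration only: it assumes a fixed-effect inverse-variance model applied directly on the AUC scale, which may differ from the derivation actually documented in R-TF-015-003 and R-TF-015-011. The per-study values and the function names are hypothetical; only the 0.778 anchor and the +12.2 pp margin come from the 7GH(c) stand-alone row.

```python
import math

def inverse_variance_pool(estimates, standard_errors):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * x for w, x in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

def acceptance_criterion(sota_anchor, margin):
    """Acceptance criterion = state-of-the-art anchor plus the selected margin."""
    return sota_anchor + margin

# Hypothetical per-study AUCs and standard errors (NOT taken from the cited articles).
example_aucs = [0.76, 0.80, 0.78]
example_ses = [0.020, 0.030, 0.025]
pooled_auc, pooled_se = inverse_variance_pool(example_aucs, example_ses)
ci_95 = (pooled_auc - 1.96 * pooled_se, pooled_auc + 1.96 * pooled_se)

# Values stated in the 7GH(c) stand-alone row: anchor 0.778, margin +12.2 pp.
criterion_7gh_c = acceptance_criterion(0.778, 0.122)  # approximately 0.90
```

Studies with smaller standard errors receive larger weights, so the pooled anchor is pulled towards the most precise studies before the margin is added.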

ID | Intended Clinical Benefits | Outcome Measures | Magnitude of benefit claimed
7GH | The device improves the accuracy of healthcare professionals in the diagnosis of dermatological conditions across a broad spectrum of clinical presentations, including rare diseases and lesions suspicious for skin cancer. This has a positive impact on patient management and health outcomes related to diagnosis, enabling more appropriate clinical decision-making, earlier identification of rare conditions, and, in cases of suspected malignancy, reducing the risk of delayed diagnosis and the need for unnecessary invasive procedures. | (a) General conditions: - Top-1 diagnostic accuracy, sensitivity, and specificity: all HCP tiers (unaided vs. aided) - Top-1 diagnostic accuracy, sensitivity, and specificity: primary care (unaided vs. aided) - Top-1 diagnostic accuracy, sensitivity, and specificity: dermatologists (unaided vs. aided) (b) Rare diseases: - Top-1 diagnostic accuracy, sensitivity, and specificity: all HCP tiers (unaided vs. aided) - Same metrics stratified for primary care and dermatologists (c) Malignant lesions: - Area Under the ROC Curve (AUC) detecting malignancy - Sensitivity and specificity detecting malignancy - Positive/Negative Predictive Value (PPV/NPV) in primary care and dermatology - AUC, accuracy, sensitivity, and specificity detecting melanoma | (a) General conditions (Pillar 3): - All HCP tiers: accuracy ≥71% (improvement ≥15 pp); sensitivity ≥74% (improvement ≥18 pp); specificity ≥78% (improvement ≥19 pp) - Primary care: accuracy ≥71% (improvement ≥18 pp); sensitivity ≥74% (improvement ≥19 pp); specificity ≥80% (improvement ≥20 pp) - Dermatologists: accuracy ≥73% (improvement ≥9 pp); sensitivity ≥76% (improvement ≥10 pp); specificity ≥82% (improvement ≥10 pp) (b) Rare diseases (Pillar 3; Top-5 surfacing claim; see "Tier 2 rare-disease claim scoping" below): - All HCP tiers: accuracy ≥54% (improvement ≥26 pp); sensitivity ≥41% (improvement ≥21 pp); specificity ≥60% (improvement ≥22 pp) - Same thresholds stratified for primary care and dermatologists (c) Malignant lesions: - Device-level pooled AUC detecting malignancy ≥0.90 on curated input-output pairs without a clinician in the loop (Pillar 2; see "Acceptance-criteria derivation", stand-alone analytical row) - Clinician+device sensitivity detecting malignancy ≥79%; specificity ≥87% at the Top-5 prioritised differential view and malignancy-prioritisation gauge (Pillar 3; see "Acceptance-criteria derivation", clinician+device decision-point row) - PPV in primary care ≥42%; NPV in primary care ≥96%; PPV in dermatology ≥89%; NPV in dermatology ≥82.5% (Pillar 3) - Melanoma detection: device-level pooled AUC ≥0.81 (see "Study-level vs. device-level acceptance criteria: reconciliation"; individual study pass/fail applied per each study's pre-specified CIP); sensitivity ≥93%; specificity ≥80%; accuracy ≥81%
5RB | The device measures the degree of involvement of disease objectively, quantitatively, and reproducibly. This increases the precision of healthcare providers during the monitoring of patients. This has a positive impact on patient management and outcomes related to the monitoring of patients and treatment. | Quantitative device metrics (per-row Pillar assignment per the "Acceptance-criteria derivation" table above: HS ICC and PASI/UAS/SCORAD RMAE rows are Pillar 2 primary with §6.5(e) Pillar 3 discharge via PMCF; androgenetic-alopecia kappa row is Pillar 3 primary via IDEI_2023): - Inter-observer Intraclass correlation coefficient - Intra-class intra-observer correlation variability - Correlation assessing androgenetic alopecia severity between the device and HCPs - Unweighted Kappa assessing androgenetic alopecia severity - Relative Mean Absolute Error (RMAE) between device severity scores and expert dermatologist consensus for PASI (psoriasis), UAS (urticaria), and SCORAD (atopic dermatitis) Clinical-utility perception items (Pillar 3 supporting; Rank 8 proactive PMS / qualitative): - Experts consider the device a positive tool for increasing objectivity in patient monitoring through the Clinical Utility Questionnaire - Healthcare professionals express a preference for a software-based tool to identify the severity of cases | Quantitative device metrics (Pillar 2): - ICC ≥0.7270 (gold-standard IHS4 agreement) - Intra-class intra-observer variability < 10% between consecutive visits - Correlation ≥0.65 assessing androgenetic alopecia; unweighted kappa ≥0.60 - RMAE ≤15% between the device's severity scores and expert dermatologist consensus for PASI (psoriasis), UAS (urticaria), and SCORAD (atopic dermatitis) Clinical-utility perception items (supporting): - At least 80% of experts express a preference for a software-based tool to identify the severity of cases
3KX | The device improves the precision of healthcare professionals in managing dermatological care pathways, encompassing referral decisions, resource allocation, and clinical assessment in remote care settings. This has a positive impact on patient management and outcomes related to the diagnosis and monitoring of patients, resulting in reduced waiting times for specialist consultation, improved adequacy of referrals, and expanded access to dermatological assessment across in-person and remote care settings. | (a) Waiting times: - Reduction in cumulative waiting time in percentage to see the specialist - Reduction in cumulative waiting time to see the specialist compared to the region - Experts consider that the use of the device enables specialists to complete consultations in less time (b) Referral adequacy: - Reduction in unnecessary referrals without increasing false negatives - Sensitivity to identify necessary referrals - Specificity to identify unnecessary referrals (c) Remote care: - Number of patients that can be handled remotely - Sensitivity to identify necessary referrals in teledermatology - Specificity to identify unnecessary referrals in teledermatology - Experts considered the device a useful tool to gather more patient information - Experts perceive the use of the device for diagnostic support as very useful | (a) Waiting times: - A reduction of cumulative waiting time for patients to access specialist dermatological care of at least 50%. - A reduction of cumulative waiting time in days to access specialist dermatological care. - More than 70% of experts state that the use of the device allows them to handle consultations in 5 to 10 minutes and reduce the time needed. (b) Referral adequacy: - A reduction of at least 30% of unnecessary referrals to dermatology. - A sensitivity of at least 70% identifying patients who require dermatological referral. - A specificity of at least 65% identifying patients who do not require dermatological referral. (c) Remote care: - At least 55% of patients can be handled remotely with the assistance of the device. - An improvement of at least 30% in sensitivity in identifying necessary referrals remotely. - A specificity of at least 65% in identifying unnecessary referrals remotely. - At least 70% of experts consider the device a useful tool to collect patient data during teleconsultations. - At least 70% of experts perceive the device as a useful tool to handle most patients remotely.

Surrogate-to-patient-outcome chain for benefit 3KX

Benefit 3KX is measured through workflow surrogates (waiting times, referral adequacy, remote-care adequacy). Each surrogate is linked to a patient-relevant outcome through the following causal chain, anchored by the Pillar 1 surrogate-endpoint validity evidence documented in R-TF-015-011 State of the Art §"Surrogate endpoint validity — by benefit domain":

  • 3KX(a) waiting times: reduction in cumulative waiting time → earlier specialist assessment for patients genuinely requiring specialist care → reduced diagnostic delay → reduced disease-progression risk for time-sensitive conditions (notably malignancy, inflammatory disease flares, paediatric presentations). Literature anchor: Moreno-Ramirez et al. 2007 (EU multicentre teledermatology evaluation, 2,009 teleconsultations, waiting-interval reduction 89 → 12 days with preserved melanoma detection) and Conic et al. 2018 (NCDB time-to-surgery → overall-survival gradient across 153,218 stage I–III melanoma patients).
  • 3KX(b) referral adequacy: reduction in unnecessary referrals without increasing false negatives → reduced specialist-waiting-list burden → improved access-to-care equity → reduced healthcare-system harm at unchanged or improved per-patient outcome. Literature anchor: Whited et al. 2013 (VA store-and-forward teledermatology RCT, 9-month clinical-course equivalence with shorter time-to-intervention); and the corresponding "Referral optimisation and care-pathway" subsection within R-TF-015-011 §"Surrogate endpoint validity".
  • 3KX(c) remote care: increase in fraction of patients handled remotely at maintained or improved diagnostic adequacy → reduced patient travel burden, reduced geographic access inequity, reduced opportunity cost of specialist face-to-face capacity → reduced healthcare-system cost at unchanged or improved per-patient outcome. Literature anchor: Armstrong et al. 2018 (12-month psoriasis online-vs-in-person equivalency RCT; PASI and BSA differences within pre-specified ±3 equivalence bound) and Snoswell et al. 2016 (systematic review of 14 economic evaluations of store-and-forward teledermatology, cost-effective or cost-saving in the majority of included studies).

Evaluation of the intended clinical benefits

  • 7GH: This benefit will be assessed across three sub-criteria. (a) General conditions: Diagnostic accuracy of HCPs will be assessed aided and unaided by the device and compared to the state of the art. Top-1 diagnostic accuracy for benefit 7GH is measured as the clinician's diagnostic decision when presented with the device's mandated Top-5 prioritised differential view — not against the classifier's stand-alone top-ranked output over 346 ICD-11 categories. The Top-5 panel is the Pillar 3 user-interface surface validated in the pivotal investigations; the device's stand-alone 346-category analytical accuracy is a separate Pillar 2 claim. (b) Rare diseases: Diagnostic accuracy of HCPs for rare dermatological diseases will be assessed with and without the use of the device. (c) Malignant lesions: The capacity of the device to support detection of malignant conditions will be assessed in clinical validations and compared with the current state of the art using AUC as the primary metric.
  • 5RB: This benefit will be assessed in the clinical validations of the device, which will measure whether the device's severity assessments of dermatological conditions agree with those of an expert dermatologist.
  • 3KX: This benefit will be assessed across three sub-criteria. (a) Waiting times: Reduction of cumulative waiting time with the use of the device by the HCPs. (b) Referral adequacy: The suitability and number of referrals deemed necessary with and without the assistance of the device by primary care practitioners. (c) Remote care: The capacity of the device to support remote patient management and the perception of experts of the utility of the device in teleconsultations.
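
As an illustration of how the outcome measures named above are computed, the following minimal sketch derives sensitivity, specificity, PPV and NPV from a 2×2 confusion matrix and computes an unweighted Cohen's kappa between two raters. The function names and the example counts are hypothetical; the thresholds in the final comparison (sensitivity ≥0.79, specificity ≥0.87) are the 7GH(c) clinician+device criteria stated in this plan.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels.

    Undefined (division by zero) when expected chance agreement equals 1.
    """
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical counts, for illustration only (not study data).
example = diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)
meets_7gh_c = example["sensitivity"] >= 0.79 and example["specificity"] >= 0.87
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is reported alongside a simple correlation for the androgenetic-alopecia severity endpoint.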

Integrator integration requirements as risk controls

The device mandates a defined set of integration requirements that every integrating system must implement for the device to deliver the Pillar 3 clinical benefit validated in the pivotal investigations. These requirements — (i) presentation of the Top-5 prioritised differential view, (ii) display of the malignancy-prioritisation gauge, (iii) display of the referral recommendation, and (iv) display of the six binary malignancy-surfacing safety indicators — are set out in the User Interface section of the Installation Manual in the Instructions for Use, where each requirement is stated with mandatory language and visibility/ordering/distinguishability constraints, and are traced to the corresponding risk-control entry or entries in R-TF-013-002 (R-BDR and R-A96 for the Top-5 ranked view; R-HBD, R-BDR, R-DAG, R-75H for the malignancy-prioritisation gauge; R-BDR and R-75H for the referral recommendation; R-BDR, R-HBD, R-SKK for the six binary safety indicators). The four mandated UI elements, each with its per-element constraint type, are:

  • Top-5 prioritised differential view: the ranked differential must be visible to the clinician at the decision point (visibility); the integrating system must preserve the device's rank order and must not re-order the differential (ordering); entries flagged as malignancy-suspect must be visually distinguishable from non-malignant entries within the ranked view (distinguishability).
  • Malignancy-prioritisation gauge: the gauge must be visible at the same decision point as the differential (visibility); the gauge must be visually separated from the differential view so that a clinician reads it as an independent malignancy-risk signal (distinguishability).
  • Referral recommendation: the recommendation must be visible at the decision point (visibility); the recommendation must be co-located with the differential view so that the clinician's referral-versus-manage decision is made against both signals in the same visual frame (co-location).
  • Six binary malignancy-surfacing safety indicators: each indicator must be visible at the mandated decision point whenever the device's output asserts the indicator (visibility); each of the six indicators must be unambiguously identifiable per-indicator so that a clinician reads each independently (distinguishability).

Full per-requirement mandatory language, per-element acceptance tests, and per-element traceability to R-TF-013-002 risk-control entries are set out in the IFU's User Interface section; the integrator-MUST envelope above is a CE-marking precondition and is enforced as part of the clinical-benefit claim.
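
The per-element constraints listed above lend themselves to an automated conformance check on the integrator side. The sketch below is illustrative only and is not the acceptance test defined in the IFU: the `DeviceOutput` and `Screen` structures, their field names, and the violation messages are all hypothetical, and the real per-element acceptance tests remain those in the IFU's User Interface section.

```python
from dataclasses import dataclass

@dataclass
class DeviceOutput:
    top5_codes: list          # ICD-11 codes in the device's rank order
    malignancy_suspect: set   # subset of top5_codes flagged as malignancy-suspect
    asserted_indicators: set  # safety indicators asserted by the device

@dataclass
class Screen:
    """Hypothetical description of what the integrating system displays."""
    differential_codes: list  # codes in displayed order
    highlighted_codes: set    # codes rendered as visually distinct
    gauge_visible: bool
    gauge_separated: bool     # gauge rendered apart from the differential
    referral_visible: bool
    referral_colocated: bool  # referral shown in the same frame as the differential
    indicators_shown: set

def presentation_violations(out: DeviceOutput, screen: Screen) -> list:
    """Return a list of violated constraints (empty when the screen conforms)."""
    violations = []
    if screen.differential_codes != out.top5_codes:
        violations.append("top5: differential missing or re-ordered")
    if screen.highlighted_codes != out.malignancy_suspect:
        violations.append("top5: malignancy-suspect entries not distinguishable")
    if not (screen.gauge_visible and screen.gauge_separated):
        violations.append("gauge: not visible or not separated from differential")
    if not (screen.referral_visible and screen.referral_colocated):
        violations.append("referral: not visible or not co-located")
    if not out.asserted_indicators <= screen.indicators_shown:
        violations.append("indicators: asserted indicator not displayed")
    return violations
```

Returning every violation instead of stopping at the first one lets an integrator see all presentation defects in a single run.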

We do not delegate user-interface presentation responsibility to the integrator: the integrator is a co-controlled risk-control agent whose obligation to implement these requirements is a precondition of the CE-marking clinical-benefit claim. Pillar 3 clinical-performance evidence in this CEP and its corresponding CER is measured on the device when integrated per these mandated user-interface-presentation requirements; the device is validated only under those conditions.

Device classification

The device is classified as Class IIb under MDR Rule 11 (Annex VIII, Chapter III of Regulation (EU) 2017/745).

Applying MDCG 2019-11 Qualification and Classification of Software in MDR 2017/745, the device is an MDSW that significantly contributes to clinical decision-making. Under the IMDRF SaMD framework referenced by MDCG 2019-11, the device maps to significance-of-information category "Drives clinical management" (IMDRF 5.1.2) — its output, in particular the Top-5 prioritised differential view and the malignancy-prioritisation gauge, materially informs the clinician's diagnostic and referral decisions. Across the indication scope, the highest applicable IMDRF state-of-healthcare-situation cell is "Critical" (IMDRF 5.2.1), driven by the inclusion of melanoma (ICD-11 2C30) in the indication, with "Serious" (IMDRF 5.2.2) applying to the remaining dermatological conditions. The resulting SaMD category is III.i (Critical × Drives), which confirms Class IIb under MDR Rule 11. The full classification analysis — including the per-cell rationale, the rule-11 sub-rule selection, and the ICD-11-category-by-category mapping — is documented in the R-TF Device Description and Specification record (overview section of the technical file), subsection "Guideline MDCG 2019-11".
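
The classification logic described above can be expressed as a small lookup. This is a sketch under stated assumptions, not the classification analysis itself: the grid encodes the IMDRF SaMD category numerals (without the sub-indices used in the text), and the class mapping follows the indicative Rule 11 correspondence as commonly tabulated from MDCG 2019-11 Annex III, which should be verified against the guidance. Function and key names are hypothetical.

```python
# IMDRF SaMD category by (state of healthcare situation, significance of information).
IMDRF_CATEGORY = {
    ("critical", "treat_or_diagnose"): "IV",
    ("critical", "drive"): "III",
    ("critical", "inform"): "II",
    ("serious", "treat_or_diagnose"): "III",
    ("serious", "drive"): "II",
    ("serious", "inform"): "I",
    ("non_serious", "treat_or_diagnose"): "II",
    ("non_serious", "drive"): "I",
    ("non_serious", "inform"): "I",
}

CLASS_ORDER = ["IIa", "IIb", "III"]

def indicative_mdr_class(situation, significance):
    """Indicative MDR Rule 11 class for one grid cell (assumed mapping)."""
    cell = (situation, significance)
    if cell == ("critical", "treat_or_diagnose"):
        return "III"
    if cell in {("critical", "drive"), ("serious", "treat_or_diagnose")}:
        return "IIb"
    return "IIa"

def highest_class(cells):
    """The device takes the highest class across all cells in its indication scope."""
    return max((indicative_mdr_class(*c) for c in cells), key=CLASS_ORDER.index)

# Cells described in this section: melanoma (Critical) and remaining conditions (Serious),
# both with significance "Drives clinical management".
scope = [("critical", "drive"), ("serious", "drive")]
device_class = highest_class(scope)  # "IIb"
```

Taking the maximum over the scope mirrors the reasoning in the text: a single Critical cell in the indication determines the class for the whole device.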

This classification drives the clinical-evidence bar applied throughout this Clinical Evaluation Plan: the evidence portfolio is designed to demonstrate conformity with the GSPRs applicable to a Class IIb MDSW under MDR.

Product category

Software-only medical device.

Device variants and packaging

No variants are available for the device.

Previous version of the device

The predecessor of the current device is a legacy device (hereinafter, "the legacy device") CE-marked under the Medical Devices Directive (MDD) 93/42/EEC and on the market since 2020. The device under evaluation succeeds the legacy device under MDR. Differences between the legacy device and the device under evaluation are catalogued in the equivalence summary and categorised into two buckets (see section "Equivalence with the legacy device"): (a) architectural and deployment changes with no clinical pathway; and (b) algorithmically equivalent features deployed differently with verified output parity. All changes between the legacy device and the device under evaluation are non-significant per MDCG 2020-3 Rev.1. The core classifier architecture and clinical outputs carried forward unchanged are covered by equivalence per MDR Article 61(5)-(6) and MDCG 2020-5. Architectural and deployment changes are supported by software-architecture, interoperability, and cybersecurity V&V records. This approach aligns with the necessary transition from the MDD to the MDR, specifically addressing the requirements for:

  • Updated Technical Documentation: As mandated by Article 10(4) and Annexes II and III of the MDR, demonstrating conformity with the new General Safety and Performance Requirements (GSPR).
  • Post-Market Clinical Follow-up (PMCF) Data: The collected clinical data serves to strengthen the Clinical Evaluation Report (CER), in line with MDR Article 61 and Annex XIV, and relevant guidance from the Medical Device Coordination Group (MDCG) (e.g., MDCG 2020-13 on clinical evaluation and PMCF).
  • Demonstration of Equivalence: Since the core clinical technology remains identical, the updated documentation reinforces the equivalence to the legacy device, which is a key consideration when leveraging existing clinical data under the MDR. The demonstration of equivalence is documented in the Clinical Evaluation Report (CER).

The legacy device has been commercialized since 2020 (after obtaining the manufacturing license in Spain) and was certified under the Medical Devices Directive (MDD).

Components

The device is a computational software-only medical device leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures.

Mode of action

One core feature of the device is a deep learning-based image recognition technology for the recognition of ICD categories. In other words, when the device is fed an image or a set of images, it outputs an interpretative distribution over the International Classification of Diseases (ICD) categories that might be represented in the pixel content of the image.

The device makes its prediction entirely based on the visual content of the images, with no additional parameters.
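
To make the notion of a distribution over ICD categories concrete, the sketch below converts raw classifier scores into a normalised distribution and extracts a ranked Top-5 list. It is an assumption-laden illustration: this document does not state that the device uses a softmax, and the category labels, scores, and function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax: maps raw scores to values that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(categories, logits, k=5):
    """Return the k (category, probability) pairs with the highest probability."""
    probabilities = softmax(logits)
    ranked = sorted(zip(categories, probabilities), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Hypothetical category labels and scores, for illustration only.
labels = ["cat_a", "cat_b", "cat_c", "cat_d", "cat_e", "cat_f"]
scores = [0.2, 2.5, 1.1, -0.7, 3.0, 0.0]
top5 = top_k(labels, scores, k=5)
```

Subtracting the maximum score before exponentiation leaves the result unchanged but avoids overflow for large scores.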

The device has been developed following an architecture called Vision Transformer (ViT). This architecture is inspired by the Transformer architecture, which is extensively used in other areas such as natural language processing (NLP) and has brought significant advancements in terms of performance.

Another core feature of the device is to provide quantifiable data on the intensity, count, and extent of clinical signs such as erythema, desquamation, and induration, among others.

To achieve that, the device uses a range of deep learning technologies, combined and developed for that specific use:

  • Object detection: used to count clinical signs such as hives, papules or nodules.
  • Semantic segmentation: used to determine the extent of clinical signs such as hair loss or erythema.
  • Image recognition: used to quantify the intensity of visual clinical signs like erythema, excoriation, dryness, lichenification, oozing, and edema.
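
The first two techniques in the list above reduce to simple post-processing once a model has produced its raw output. The following sketch shows one plausible form of that step, assuming detections arrive as dictionaries with a confidence score and segmentation arrives as a binary mask; the data layout, the 0.5 threshold, and the function names are hypothetical and are not taken from the device's implementation.

```python
def count_signs(detections, min_confidence=0.5):
    """Count detected clinical signs (e.g. hives) at or above a confidence threshold."""
    return sum(1 for d in detections if d["confidence"] >= min_confidence)

def extent_fraction(mask):
    """Fraction of pixels marked as affected in a binary segmentation mask."""
    total = sum(len(row) for row in mask)
    affected = sum(1 for row in mask for pixel in row if pixel)
    return affected / total if total else 0.0

# Hypothetical model outputs, for illustration only.
detections = [{"confidence": 0.91}, {"confidence": 0.62}, {"confidence": 0.18}]
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
lesion_count = count_signs(detections)  # 2 detections pass the threshold
affected_area = extent_fraction(mask)   # 4 of 12 pixels
```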

Device lifecycle

The device is undergoing first CE-marking under MDR and has not yet been commercialised. The legacy predecessor device has been on the market since 2020, CE-marked under the Medical Devices Directive (MDD) as a Class I device, and remains on the market under the MDR Article 120(3) transition provisions.

Expected lifetime

The expected operational lifetime of the device is set at 5 years, subject to regular software updates and to the lifecycle of the integrated components and platforms. The lifetime will be extended in equivalent spans as design and development continue and as maintenance and re-design activities are carried out.

This timeline accounts for the expected evolution of the underlying operating systems and tools, the progression of medical device technology, and the necessary update cycles to maintain security and operability.

Degree of Novelty

Clinical or surgical procedure novelty dimensions

Dimension | Is there novelty? | Description
Mode of use or Treatment option | Yes | The novelty in the Mode of Use is that the assessment of skin lesions is performed using a photograph analyzed by an AI-powered device, rather than solely by visual inspection. Additionally, it enables the primary care practitioner to assess skin conditions with higher accuracy. This changes how the diagnostic assessment is conducted. This new mode of use enables a novelty in the Treatment/Management Option. By providing a reliable, non-invasive analysis for benign pathologies, the device introduces a clinical pathway to enable the option to avoid a biopsy, replacing an invasive procedure with a non-invasive one.
Device-Patient Interface | No | Not novel: no direct interface with the patient. The interface is through digital image capture, as in common practice. Additionally, the patient is not the intended user of the device.
Interaction and Control | Yes | The novelty in Interaction and Control lies in shifting the diagnostic process from a purely human assessment to an interaction between the practitioner and an artificial intelligence analysis. This complete workflow creates a new mode of use for dermatological diagnosis, which is particularly useful in remote or primary care settings.
Deployment Methods | No | The software is deployed on standard platforms (e.g., mobile devices, web) and integrated into typical clinical workflows.
Clinical Workflow | Yes | The novelty in the Clinical workflow is that the device assists the practitioners in the decision-making process. By processing skin images, the device provides the physician with additional clinical information that allows them to make a diagnosis more quickly and decide whether to refer the patient to specialized care or, alternatively, monitor the patient in primary care, thereby reducing workload and waiting lists.

Device-related novelty dimensions

Dimension | Is there novelty? | Description
Medical Purpose | Yes | The primary medical purpose of the device is to assist practitioners, as clinical decision support, in the assessment and severity characterisation of a range of dermatological conditions. However, a novel medical purpose is established by its specific application to addressing previously unmet medical needs. The device is designed and validated to aid in the assessment of rare dermatological conditions (such as Generalised Pustular Psoriasis, Pemphigus Vulgaris or Palmoplantar pustulosis). In this context, where reliable and objective decision-support tools are scarce, applying this technology constitutes a novel medical purpose.
Design | Yes | The novelty in the Design of the device lies in its algorithms, which have been trained to allow physicians to obtain additional information about the suspected diagnosis, the severity of the disease, and whether or not to prioritize a patient for referral.
Mechanism of Action | No | Not novel: the device uses AI for image analysis and quantification, but these are established methods in dermatological software.
Materials | No | Not applicable: software only.
Site of Application | No | Not novel: the software analyzes dermatological images; no direct contact or application to the patient.
Components | Yes | The novelty in the Components of the device is its proprietary artificial intelligence (AI) algorithm. This software component is integral and necessary for the device's function, performing the analysis of clinical images to quantify the severity of the skin condition. The novelty of this component lies in its unique architecture and the fact that it has been custom-trained on a curated dataset of dermatological images to achieve the intended clinical performance for its specific medical purpose. While the device operates on non-novel hardware (a standard smartphone), the algorithm itself constitutes the core innovative component.
Manufacturing Process | No | Not novel: developed using standard software development and validation processes, including lifecycle and risk management.

Novelty conclusion

From a clinical perspective, the device introduces moderate novelty with moderate clinical impact on dermatological practice:

  • It offers a new methodology for assessing skin conditions. Rather than relying solely on visual inspection, HCPs obtain additional clinical information, which improves diagnostic accuracy and decision-making and, in turn, the clinical workflow.
  • It is built on novel algorithms that address needs of physicians previously unmet in clinical practice, and it offers a new tool for obtaining information.

On the other hand, the device does not introduce new treatment or diagnostic approaches and does not create a new category of medical intervention. It provides practitioners with a new tool for assessing skin conditions, in addition to those currently available in clinical practice (visual examination, use of a dermatoscope, or invasive procedures such as biopsies), improving decision-making and therefore the clinical workflow. The device's innovation is nonetheless moderate, as it does not modify any standard clinical practice procedure in dermatology or offer new treatments. Furthermore, although its algorithms meet the needs of professionals, their architecture and the way dermatological images are captured for processing are already used in medical devices in dermatology.

Clinical Performance Claims

To assess compliance with the performance requirements (GSPR 1), the Clinical Evaluation Report must assess whether the device under evaluation achieves the performance intended by its manufacturer.

The following table provides a summary of the intended clinical performances, the outcome measures, and the claimed acceptance criteria. For the full description of all performance claims, refer to the Clinical Performance Claims section of the Clinical Evaluation Report (R-TF-015-003).

Pooled performance-metric methodology

Aggregate performance metrics (the overall clinical benefit goals across different subsets) are calculated using the following weighted average formula:

$$\frac{\sum(\text{achievedValue} \times \text{sampleSize})}{\sum(\text{sampleSize})}$$

Pooling is pillar-restricted. Pillar 2 metrics are pooled only with Pillar 2 metrics — the device's stand-alone analytical outputs measured against curated input-output pairs, without a clinician in the loop. Pillar 3 metrics are pooled only with Pillar 3 metrics — clinician+device outputs measured at the mandated user-interface decision point (Top-5 prioritised differential view, malignancy-prioritisation gauge, referral recommendation) on real patients (or, for supporting Rank 11 evidence, on MRMC simulated-use image sets). Cross-pillar pooling is prohibited. Where a single study generates both Pillar 2 and Pillar 3 endpoints (for example, IDEI_2023 or NMSC_2025), each endpoint is attributed to its respective pillar pool and never to both.
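
The sample-size-weighted formula and the pillar restriction can be sketched together as follows. This is a minimal illustration, not the computation used in the CER: the record layout, the function names, and all example values and study labels are hypothetical. Endpoints are grouped by pillar and metric before pooling, so that a Pillar 2 value is never averaged with a Pillar 3 value.

```python
from collections import defaultdict

def pooled_metric(results):
    """Sample-size-weighted average of (achieved_value, sample_size) pairs."""
    total_n = sum(n for _, n in results)
    return sum(value * n for value, n in results) / total_n

def pool_by_pillar(endpoints):
    """Pool endpoints separately per (pillar, metric); no cross-pillar pooling."""
    groups = defaultdict(list)
    for e in endpoints:
        groups[(e["pillar"], e["metric"])].append((e["value"], e["n"]))
    return {key: pooled_metric(values) for key, values in groups.items()}

# Hypothetical endpoints, for illustration only (not study results).
endpoints = [
    {"study": "STUDY_A", "pillar": 3, "metric": "sensitivity", "value": 0.80, "n": 100},
    {"study": "STUDY_B", "pillar": 3, "metric": "sensitivity", "value": 0.90, "n": 300},
    {"study": "STUDY_B", "pillar": 2, "metric": "auc", "value": 0.92, "n": 500},
]
pooled = pool_by_pillar(endpoints)
# pooled[(3, "sensitivity")] is (0.80*100 + 0.90*300) / 400 = 0.875
```

A study that contributes endpoints to both pillars, as STUDY_B does here, appears in both pools, but each individual endpoint is counted in exactly one.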

The populations of the pooled studies were homogeneous, representing the target population in both primary care and dermatology clinical consultations, ensuring results are applicable and generalizable to the intended real-world clinical practice.

Sample-size-weighted pooling is applied as a conservative aggregation. A bivariate random-effects sensitivity analysis (Reitsma) is reported in the CER for the 7GH(c) multi-malignancy pool to confirm that the device-level acceptance criterion is met under the more rigorous diagnostic-test-accuracy meta-analytic model recommended by the Cochrane Handbook for Diagnostic Test Accuracy Reviews.

Risk management

Risk management was performed as part of the development process. The risk assessment was conducted in accordance with Regulation (EU) 2017/745 (MDR) and the international standard ISO 14971:2019, Medical devices - Application of risk management to medical devices.

Risk management and clinical evaluation are interlinked at several levels. The clinical evaluation shall consider data from risk management activities related to the device as input data for defining relevant safety parameters. Any unacceptable residual risk from the risk analysis must be specifically assessed in the CER to provide supporting evidence that the clinical benefits outweigh the residual risk. Finally, the clinical evaluation and its periodic reviews and updates are a relevant source of input data for maintaining, reviewing, and updating the product's risk analysis.

| Standard | Title |
|---|---|
| EN ISO 13485:2016/A11:2021 | Medical devices. Quality management systems. Requirements for regulatory purposes |
| EN ISO 15223-1:2021 | Medical devices. Symbols to be used with medical device labels, labeling and information to be supplied. Part 1: General requirements |
| EN ISO 14971:2019 | Medical devices. Application of risk management to medical devices |
| EN ISO 20417:2021 | Medical devices - Information to be supplied by the manufacturer |
| UNE-EN 62304:2007/A1:2016 (EN 62304:2006/A1:2015) | Medical device software - Software life-cycle processes |
| UNE-EN 62366-1:2015/A1:2020 (EN 62366-1:2015/A1:2020) | Medical devices - Part 1: Application of usability engineering to medical devices |
| IEC 82304-1:2016 | Health software - Part 1: General requirements for product safety |

Device-specific hazards applicable to the device

The device is a software-only MDSW; the device-specific hazards identified through the risk-management process are enumerated in the Safety-endpoints table of this CEP (see "Safety endpoints") and in the Risk Management File (R-TF-013-002). The clinical evaluation addresses each device-specific hazard through the pre-specified safety outcome measures and acceptance criteria in that table.

Generic-device hazards that are not applicable to a software-only MDSW (for example, incompatibility with physical consumables or accessories, maintenance-related hazards, and product-lifetime hazards arising from hardware degradation) are documented as not-applicable in R-TF-013-002 and are not addressed separately in this CEP.

The clinical evaluation addresses the device-specific hazards through assessment of the relevant clinical evidence and confirms whether the observed event rates remain at or below the RMF-anchored and SotA-benchmarked acceptance criteria declared in the "Safety endpoints" section.

Risk mitigation measures

The proposed clinical evaluation strategy, based on a cross-functional approach (Literature, Legacy Data, and Clinical Investigations), ensures that the clinical evidence is sufficient in quality and quantity to demonstrate conformity with GSPR 1, 8, and 17, as required by MDR Article 61(1) and MDCG 2020-6 § 6.4. A primary objective of this evaluation is the clinical validation of the Risk Management File (R-TF-013-002), ensuring a continuous link between clinical data and risk assessment as mandated by Annex I, GSPR 3. The clinical data analyzed during this process will be used to:

  1. Validate Risk Estimations: Confirm that the estimated probabilities of occurrence and severity of hazards remain accurate based on actual clinical use, fulfilling the requirements for "continuous update of the risk management" per Annex I, GSPR 3(e).
  2. Detect New Risks: Monitor for any emerging risks, unforeseen side effects, or deviations in safety profiles to ensure that no new hazards arise, as specified in MDR Article 61(1) and Annex XIV Part B (PMCF).
  3. Verify Mitigation Effectiveness: Provide objective clinical evidence that the implemented risk control measures are effective in practice, ensuring that all residual risks are reduced "as far as possible" (AFAP) and remain acceptable when weighed against the clinical benefits (MDR Annex I, GSPR 4 and GSPR 1).

Upon completion of this evaluation, the results will be integrated into the final Clinical Evaluation Report (R-TF-015-003) and used to conclude the "Verification of effectiveness" within the Risk Management Record, ensuring alignment with ISO 14971:2019 and with the MDCG 2020-5 equivalence framework applied to the legacy predecessor.

| Risk ID | Hazard | Hazardous Situation or Vulnerability |
|---|---|---|
| R-2TP | The endpoints of the device are not compatible with the user's software | The care provider's IT personnel must develop custom code, which in some cases may not be viable. |
| R-A96 | Incompatibility in classification systems | Mismatch between the name or code of the ICD class of the medical device and the ones used by the healthcare provider's software |
| R-HBD | Misrepresentation of magnitude returned by the device | The care provider's system represents a value as if it were representing a different magnitude. |
| R-BDR | Misinterpretation of data returned by the device | The care provider's system represents a value as if it were representing a different clinical endpoint. |
| R-75H | Incorrect clinical information | The care provider receives into their system data that is erroneous |
| R-DAG | Incorrect diagnosis or follow up | The medical device outputs a wrong result |
| R-SKK | Incorrect results shown to patient | The patient sees erroneous results. |
| R-D1I | Unauthorized patient access to clinical data | The patient somehow manages to get access to the clinical endpoints of the device. |
| R-AGQ | Image artefacts or poor resolution | The medical device receives an input that does not have sufficient quality in a way that affects its performance |
| R-E7Z | Inaccessible skin areas | The device cannot analyse certain skin areas |
| R-T8Q | Data transmission failure from healthcare provider's system | The healthcare provider's system cannot connect to the medical device |
| R-3N5 | Data input failure | The medical device cannot receive data from healthcare providers' system |
| R-YF4 | Data accessibility failure | The healthcare provider cannot receive data from the medical device |
| R-LRP | Data transmission failure | The medical device cannot send data to healthcare providers |
| R-MWD | Interruption of service | The device or the healthcare system experiences an unexpected interruption in service leading to inability to use the device |
| R-TLM | An organisation that is not a licensed healthcare provider gets access to the device | Improper use of the device and improper use of the outputs of the device |
| R-4GG | Users outside the intended user definition use the medical device | Other personnel (other than HCP and ITP) directly interact with the medical device |
| R-ZFR | The device is not used under the supervision of an HCP | Improper use of the device and improper interpretation of the outputs of the device |
| R-CYO | The device is integrated by unqualified ITPs | Medical device communication with the user server is not properly established |
| R-QLF | Non-compliance with the General Safety & Performance Requirements (GSPR) | Inadequate safety and performance of the whole device |
| R-ES8 | Non-compliance with GSPR 3 (absence of a risk management process) | Risks are not mitigated as far as possible |
| R-EZZ | Instructions for use not available or separate from the product | Whole device cannot be used |
| R-CGQ | Inadequate specification of the product intended purpose | Whole device is wrongly used or is not used as intended |
| R-TA9 | Inadequate camera usage or settings | Poor image quality due to inadequate resolution, lighting, focus or camera settings |
| R-3YJ | Data breach or unauthorized access | Unauthorized persons have access to confidential data |
| R-C6Q | Non-compliance with GSPR 3 (absence of a PMS & PMCF process) | Unavailability of safety, performance, usability information during product usage needed to improve the device |
| R-8KS | Inadequate instructions for use: product information for clinical safety is not included in the IFU | Use of the device without the necessary safety-related information |
| R-UI5 | Inadequate instructions for use: product information for cybersecurity is not included in the IFU | Presence of vulnerabilities that may compromise the integrity of the system and patient data |
| R-5L4 | Inadequate lighting conditions during image capture | The medical device receives an input that does not have sufficient quality |
| R-U6M | System incompatibility | Integration of our device is not compatible with the user platform |
| R-OM1 | Data overwrite | Critical patient data, such as medical images or diagnostic results, is unintentionally replaced or corrupted |
| R-B63 | Inconsistent or unreliable output | Analysis of the same image generates different results when using the same version of the device |
| R-RAJ | Sensitivity to image variability | Analysis of the same skin structure with images taken with deviations in lighting or orientation generates significantly different results |
| R-2S3 | Integration failure or errors | Failure to communicate with other systems |
| R-GY6 | Inaccurate training data | Image datasets used in the development of the device are not properly labeled |
| R-7US | Biased or incomplete training data | Image datasets used in the development of the device are not properly selected |
| R-1OC | Lack of efficacy or clinical utility | There are no demonstrated product clinical benefits when used as intended by the manufacturer |
| R-VL1 | Device failure or performance degradation | The device is overwhelmed by its use: either not enough storage capacity or unable to handle requests |
| R-HAX | Incorrect interpretation of device outputs | The HCP validates the wrong skin condition, even if the device outputs the correct result |
| R-TBN | Non-compliance with GSPR 23: Inadequate label | Insufficient label information to understand the device intended use, version |
| R-L38 | Non-compliance with GSPR 23: Inadequate Instructions for Use | Integration cannot be properly performed |
| R-O5Y | Complicated instructions for use: the instructions for use are too complicated and more intricate than they need to be | Misinterpretation of IFU |
| R-UK2 | Inadequate warnings in the IFU | Lack of critical safety information required for the correct use of the device |
| R-27M | Inadequate maintenance performed by the manufacturer | Device performance is compromised |
| R-046 | Inadequate or absent maintenance specifications, including performance checks | Device performance is compromised |
| R-7GC | Inadequate maintenance: users do not properly maintain the device | Device performance is compromised |
| R-3OG | Absence of limitation of product lifetime | User does not know the lifetime of the device to stop using it |
| R-G3V | Product requirements are not defined (user, technical and regulatory) | Whole device is wrongly used / is not used as intended |
| R-GTY | Instructions for use are not available at the time of use due to downtime | User cannot consult the IFU |
| R-X93 | The device receives images that do not represent skin structure | The device provides an incorrect diagnosis based on irrelevant or non-clinical input |
| R-HH0 | The electronic data and content are tampered | Medical device's outputs are tampered |
| R-109 | Electronic instructions for use are not compatible with different devices | Intended user cannot consult IFU |
| R-4Z5 | Lack of version control or traceability | The ITP cannot identify the version of the device being used |
| R-72D | SOUP presents an anomaly that makes it incompatible with other SOUPs or with software elements of the device | The overall performance of the device is compromised |
| R-MQ1 | SOUP is not being maintained nor regularly patched | Overall degradation of device's performance |
| R-9SS | SOUP presents cybersecurity vulnerabilities | The SOUP can be attacked and corrupted causing device failure as it may have known vulnerabilities that could be exploited by malicious actors. |
| R-75L | Stagnation of model performance | The AI/ML models of the device become outdated or stagnate due to lack of continuous updates, retraining, or adaptation to new clinical data |
| R-PWK | Degradation of model performance | Automatic re-training of models decreases the performance of the device |
| R-BXD | Insufficient knowledge to display electronic IFU | Fail to properly display the instructions for use |
| R-33B | Electronic IFU are tampered | Incomplete or incorrect information being provided to the users |
| R-ZNA | Electronic IFU and their paper copies are unavailable | Fail to follow instructions for use to integrate the medical device |
| R-K6L | Non-compliance with MDCG 2023-4 (software does not operate correctly with all intended hardware configurations - cameras) | Poor image quality due to inadequate resolution, lighting, focus or camera settings |

Risk Summary

Total identified risks

The device has 62 distinct risk IDs mapped to 5 major hazard categories, catalogued in the Risk Management File (R-TF-013-002) and summarised in the Risk Management Report (R-TF-013-003).

| Risk Category | Count | Mitigation Strategy | Verification Method |
|---|---|---|---|
| Clinical Performance and AI/ML Risks | 11 (R-75H, R-DAG, R-SKK, R-B63, R-RAJ, R-GY6, R-7US, R-1OC, R-HAX, R-75L, R-PWK) | User training, human-in-the-loop workflow, explainability metrics, representative dataset selection, controlled retraining process | Clinical investigation validation, adverse event monitoring, post-market clinical follow-up |
| Image Quality Risks | 6 (R-AGQ, R-E7Z, R-TA9, R-5L4, R-X93, R-K6L) | Automated image quality assessment algorithm, user guidance and training, IFU imaging instructions | Image quality scoring validation, real-world usability testing, summative evaluation |
| System Integration and Interoperability Risks | 13 (R-2TP, R-A96, R-HBD, R-BDR, R-T8Q, R-3N5, R-YF4, R-LRP, R-MWD, R-U6M, R-OM1, R-2S3, R-VL1) | FHIR and ICD-11 standard compliance, error handling, redundancy controls, elastic infrastructure, REST protocol, SOUP management | System integration testing, connectivity validation, summative evaluation |
| Cybersecurity Risks | 7 (R-D1I, R-3YJ, R-UI5, R-HH0, R-MQ1, R-9SS, R-33B) | Authentication controls (OAuth/JWT), data encryption (SSL/TLS), SOUP monitoring and patching, cybersecurity IFU guidance, GPG-signed commits for IFU integrity | Penetration testing, cybersecurity audits, SOUP vulnerability reviews |
| Regulatory, Labeling, and IFU Risks | 25 (R-TLM, R-4GG, R-ZFR, R-CYO, R-QLF, R-ES8, R-EZZ, R-CGQ, R-C6Q, R-8KS, R-TBN, R-L38, R-O5Y, R-UK2, R-27M, R-046, R-7GC, R-3OG, R-G3V, R-GTY, R-109, R-4Z5, R-72D, R-BXD, R-ZNA) | Compliance with MDR 2017/745 and harmonized standards, IFU developed per ISO 15223-1, QMS processes (GP-012, GP-013), PMS and PMCF plans, maintenance procedures, eIFU accessibility via web | Internal and external audits, summative evaluation, customer feedback monitoring |

Risk mitigation effectiveness

All individual residual risks and the overall residual risk (per ISO 14971:2019 §8) have been determined acceptable against the risk acceptability criteria defined in R-TF-013-003 Risk Management Report. Risk control measures will be verified during clinical investigations.

Safety endpoints

In accordance with Section 1.a of Annex XIV of Regulation (EU) 2017/745 on medical devices, this Clinical Evaluation Plan specifies the methods to examine both qualitative and quantitative aspects of clinical safety, with particular focus on the identification and assessment of residual risks and potential side-effects. The subsequent Clinical Evaluation will summarize complication and adverse event rates generated during device validation, and, taking into account measurement uncertainties and clinical variability, may recommend updates to the Risk Management File or the introduction of corrective and preventive actions, which will be documented in both the final Clinical Evaluation Report and the post-market surveillance process. All residual risks recorded in the Risk Management File (R-TF-013-002) will be reviewed from a clinical perspective, since any such risk could harm patients or users and must therefore be addressed during the device's clinical validation activities.

During the clinical evaluation, the effectiveness of the risk control measures identified in the risk management file will be verified and validated under actual conditions of clinical use, as required by Annex XIV, Part A, Section 1(d) of Regulation (EU) 2017/745 (MDR), and in line with the guidance provided in MDCG 2020-6. From these residual risks we have derived specific safety objectives, each aligned with the corresponding identified residual risk, ISO 14971:2019 (Clauses 7.3-7.4 and 8), the General Safety and Performance Requirements of MDR Annex I, and the clinical evidence provisions of Article 61 of the MDR. By Significant Residual Risks we mean those risks that, though acceptable, have a true clinical impact on patients and users and therefore require specific clinical validation to ensure that the benefits of the device outweigh them. For each safety objective, we have defined an outcome measure and an endpoint/acceptance criterion to be used during the clinical evaluation.

Dual anchoring of safety acceptance criteria

Each acceptance criterion is anchored to two thresholds, both of which must be satisfied:

  1. Internal RMF threshold: The observed event rate during the clinical evaluation shall be less than or equal to the residual probability of occurrence specified in the Risk Management File (R-TF-013-002) for the corresponding risk. Each safety row in the table below carries the exact numeric RMF residual-probability value for its risk ID.
  2. External SotA baseline: In parallel, the observed event rate is benchmarked against the corresponding published SotA failure-mode baseline derived from a structured search of vigilance registries (FDA MAUDE, EUDAMED) and the peer-reviewed literature on AI-based medical device software in dermatology (the SotA benchmarking table for safety endpoints is reported in R-TF-015-003 Clinical Evaluation Report §"Safety Benchmarking against State of the Art"). The observed event rate shall be at or below the SotA baseline.

The two thresholds are complementary: the RMF threshold is the internal acceptability bound under ISO 14971; the SotA baseline is the external comparability bound under MEDDEV 2.7/1 Rev 4 §A7.4 and MDCG 2020-6 §6.3. Reporting under both bounds avoids the circularity that would arise if safety acceptability were defined solely by reference to the manufacturer's own RMF figures. Where the observed event count is zero, the rule of three is used to compute an upper one-sided 95% confidence bound on the true event rate (approximately 3/n for n event-free uses), and the upper bound is then compared against both the RMF threshold and the SotA baseline (calculation reported in the CER).
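As an illustration of the zero-event case, the sketch below computes the rule-of-three bound, the exact binomial bound it approximates, and the dual-anchor comparison. The function names are hypothetical, the SotA baseline in the usage lines is an invented placeholder, and comparing the raw point estimate when events are observed is an assumption of this sketch; the calculation of record is the one reported in the CER.

```python
def rule_of_three_upper(n_uses: int) -> float:
    """Approximate one-sided 95% upper bound on the event rate after zero events."""
    if n_uses <= 0:
        raise ValueError("n_uses must be positive")
    return 3.0 / n_uses


def exact_zero_event_upper(n_uses: int, confidence: float = 0.95) -> float:
    """Exact binomial bound: the p that solves (1 - p) ** n_uses = 1 - confidence."""
    if n_uses <= 0:
        raise ValueError("n_uses must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_uses)


def meets_dual_anchor(events: int, n_uses: int,
                      rmf_threshold: float, sota_baseline: float) -> bool:
    """True only if the rate (or its zero-event upper bound) satisfies both thresholds."""
    # Assumption: with events > 0 the point estimate is compared directly.
    rate = events / n_uses if events > 0 else rule_of_three_upper(n_uses)
    return rate <= rmf_threshold and rate <= sota_baseline


RMF_THRESHOLD = 0.0001   # 0.01 % per use, as stated for R-75H and R-DAG
SOTA_PLACEHOLDER = 0.0005  # hypothetical value, not a derived baseline

print(meets_dual_anchor(0, 10_000, RMF_THRESHOLD, SOTA_PLACEHOLDER))
print(meets_dual_anchor(0, 30_000, RMF_THRESHOLD, SOTA_PLACEHOLDER))
```

Under these assumptions, a 0.01 % per-use criterion can only be met with zero observed events once at least 30,000 uses have accrued, because 3 / 30,000 = 0.0001. The exact bound is always slightly tighter than 3/n, so the rule of three errs on the conservative side.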

The following table summarises the safety objectives, their associated risks, potential harms, outcome measures, the RMF-anchored acceptance criteria, and the external SotA baseline derivation reference that the observed event rate is also benchmarked against. The CER reports each observed event rate against both bounds.

Where a SotA baseline is flagged as "to be quantified in the first annual CER update" (because the current vigilance denominator is insufficient to derive a quantitative benchmark at the point of certification), we apply the following discipline for each such row:

  (a) The structured literature and vigilance-registry search already performed to attempt quantification is documented in the Safety Benchmarking section of R-TF-015-003, together with the reason the current corpus does not yield a usable denominator (sparse vigilance reporting on AI-dermatology decision support, absence of AI-specific hazard taxonomy in the relevant registries, or insufficient peer-reviewed post-market studies on comparator devices).
  (b) The refresh trigger for the SotA baseline is pre-specified: any new FDA MAUDE accrual on a comparator decision-support device, any new peer-reviewed AI-dermatology vigilance study, or any new EUDAMED data release triggers a re-derivation attempt at the next literature-search refresh.
  (c) The RMF-anchored bound applies exclusively until the SotA baseline is quantified, with the observed event rate (including the rule-of-three upper one-sided 95 % bound where event count is zero) reported against the RMF threshold alone.
  (d) Once quantified, if the observed event rate (or its rule-of-three upper bound) exceeds the SotA baseline, an unscheduled CER update and benefit-risk re-review is triggered per the unscheduled-update triggers in "Mechanism for Future Updates".

| Safety objective | Risk ID | Identified Residual Risk | Potential harm to the patient | Outcome Measures | RMF-anchored acceptance criterion | SotA baseline (derivation reference) |
|---|---|---|---|---|---|---|
| Specify in the intended purpose of the device that it is a support tool, not a diagnostic one, meaning that it must always be used under the supervision of HCPs, who should confirm or validate the output of the device considering the medical history of the patient and any other symptoms the patient may have, especially those that are not visible or have not been supplied to the device | R-75H | The care provider receives into their system data that is erroneous. | Misdiagnosis or delayed diagnosis or inadequate prioritisation, leading to inappropriate or unnecessary treatment, progression of the underlying condition, or adverse events from incorrect therapy. | Observed event rate of incorrect clinical-information outputs per use, with 95 % upper one-sided bound where event count is zero (rule of three). | ≤ 0.01 % per use (R-75H RMF residual probability) | FDA MAUDE query for comparator CAD/CADx dermatology devices, 2020–2025 refresh cadence, "incorrect output / wrong result" hazard; full query and numerator / denominator documented in the Safety Benchmarking section of R-TF-015-003. If the SotA baseline cannot yet be quantified at the first CER issue (insufficient denominator in public vigilance registries), it is stated as "to be quantified in the first annual CER update" and the RMF bound applies exclusively until then. |
| Demonstrate that the frequency of device-related diagnostic errors and their downstream clinical consequences are lower than that defined in its intended use | R-DAG | The medical device outputs a wrong result | Misdiagnosis or delay in diagnosis, prioritisation or inappropriate clinical management (e.g. unnecessary tests or treatments) and worsening of the patient's health status. | Observed rate of device-output wrong-result events per use, with rule-of-three 95 % upper bound where count is zero. | ≤ 0.01 % per use (R-DAG RMF residual probability) | FDA MAUDE + EUDAMED query for comparator CAD/CADx dermatology devices on "incorrect classification / diagnostic-support error" hazard; derivation reference in R-TF-015-003 Safety Benchmarking. To-be-quantified clause as above if vigilance denominator insufficient. |
| Image acquisition without interferences or artifacts | R-AGQ | The medical device receives an input that does not have sufficient quality in a way that affects its performance | Delay in consultations to take an image with an acceptable quality. Misdiagnosis, delays in treatment | Observed rate of insufficient-quality inputs incorrectly accepted per use. | ≤ 0.1 % per use (R-AGQ RMF residual probability) | Published DIQA false-accept rates in peer-reviewed AI-dermatology image-quality evaluation (supplementary SotA search); derivation reference in R-TF-015-003 Safety Benchmarking. To-be-quantified clause as above if literature denominator insufficient. |
| System interoperability: To detect and minimise failures in connection and bidirectional data transmission that result in data being inaccessible to clinicians, and to quantify any resulting delays or omissions in patient management and care. | R-T8Q, R-3N5, R-YF4 and R-LRP | Failure of interoperability between the medical device and the healthcare provider's system, resulting in an inability to establish a connection or perform bidirectional data exchange. | Delayed or missed diagnostic support leading to postponed or inappropriate clinical decisions, potentially worsening patient outcomes. | Observed rate of system-interoperability failures per use, with rule-of-three bound where count is zero. | ≤ 0.1 % per use (maximum of R-T8Q, R-3N5, R-YF4, R-LRP RMF residual probabilities) | Published HL7 FHIR interoperability-failure rates and vigilance-registry queries on "failure of interoperability" hazard for comparable CE-marked / FDA-cleared decision-support systems; derivation reference in R-TF-015-003 Safety Benchmarking. |
| Ensure that only images meeting the predefined illumination criteria are processed for diagnostic support and quantify the impact of sub-standard lighting on device performance and clinical outcomes. | R-5L4 | The medical device receives an input that does not have sufficient quality | Interferences in the device's performance that can lead to misdiagnosis, delays in proper treatment and worsening of the patient's health status. | Observed rate of insufficient-quality lighting inputs reported per use. | ≤ 0.1 % per use (R-5L4 RMF residual probability) | Published image-acquisition-lighting-failure rates in peer-reviewed AI-dermatology literature and teledermatology workflow studies; derivation reference in R-TF-015-003 Safety Benchmarking. To-be-quantified clause as above if literature denominator insufficient. |

These measurable endpoints allow safety to be assessed quantitatively and shown to be comparable to, or better than, the state of the art.

While these risks are mitigated through technical and procedural controls, Post-Market Surveillance (PMS) will monitor any potential occurrences post-market.

For subsequent revisions of this Clinical Evaluation Plan, this assessment will be completed through monitoring and assessment of the complication rates collected during PMS and PMCF activities, compared against the theoretical residual probability of occurrence defined in the Risk Management File (RMF) for each applicable risk.

By mapping each safety objective directly to the regulatory and standardised requirements, we ensure full compliance and robust justification for the clinical validation of residual risk controls.

Acceptability of the benefit-risk ratio

The benefit-risk ratio for the device shall be deemed acceptable as long as:

  1. All the applicable safety-related GSPRs are met, based on a critical assessment of the relevant safety parameters defined in the previous section, demonstrating a good safety profile for the device.
  2. All the performance-related GSPRs are met, based on the outcome parameters defined in the previous section and assessed through a critical review of the relevant clinical evidence.
  3. Any residual risks with clinical impact identified in the risk assessment of the device are adequately addressed.
  4. All the claimed clinical benefits are achieved and outweigh the residual risks identified in the risk assessment.
  5. PMS and PMCF activities are planned to keep monitoring and assessing the risks and side-effects once the device is on the market.

Benefit-risk determination methodology

The benefit-risk determination methodology for each of the three declared clinical benefits (7GH, 5RB, 3KX) follows the quantification expectations of MEDDEV 2.7/1 Rev 4 §A7.2 (magnitude, variation, clinical relevance and proportion of responders) and the SotA-comparability expectations of MDCG 2020-6 §6.3 and §6.4. For each benefit:

  1. The pooled device-level achieved value for the benefit's acceptance criterion is computed as declared in "Pooled performance-metric methodology" above, pillar-restricted (Pillar 2 metrics pool only with Pillar 2 metrics; Pillar 3 metrics pool only with Pillar 3 metrics; cross-pillar pooling is prohibited).
  2. The achieved value is compared against the per-benefit SotA baseline declared in the "Acceptance-criteria derivation" table and against the clinical-significance margin added on top of the SotA baseline, per the three-stage Extraction → Synthesis → Margin workflow.
  3. The benefit is deemed acceptable if the pooled achieved value meets or exceeds the SotA-anchored acceptance criterion at the pillar-appropriate target (Pillar 2 stand-alone analytical target or Pillar 3 clinician-in-the-loop target).

Residual risk for each benefit is quantified per the Safety Endpoints section and is dual-anchored against (i) the internal RMF threshold under ISO 14971:2019 and (ii) the external SotA baseline from vigilance-registry and peer-reviewed literature derivation; the rule-of-three upper one-sided 95 % confidence bound is applied where the observed event count is zero. The residual risk is accepted if the observed event rate (or its rule-of-three upper bound) is at or below both the RMF threshold and the SotA baseline.

The overall benefit-risk ratio for the device is acceptable if and only if (a) each declared benefit is acceptable under (1)–(3) above at the pillar-appropriate target, and (b) each applicable residual risk is acceptable under the dual-anchored safety methodology. The per-benefit and per-risk quantitative determination and the overall conclusion are reported in the Clinical Evaluation Report (R-TF-015-003).
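The "if and only if" rule above reduces to a conjunction over all declared benefits and all applicable residual risks, which the following sketch makes explicit. The data structures, function names and numeric values are hypothetical. The sketch assumes that higher pooled values are better, and it treats a SotA baseline of `None` as "to be quantified", in which case only the RMF threshold is applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BenefitResult:
    benefit_id: str              # e.g. "7GH", "5RB" or "3KX"
    pooled_value: float          # pillar-restricted pooled achieved value
    acceptance_criterion: float  # SotA baseline plus clinical-significance margin


@dataclass(frozen=True)
class RiskResult:
    risk_id: str
    events: int
    n_uses: int
    rmf_threshold: float
    sota_baseline: Optional[float]  # None while the baseline is "to be quantified"


def benefit_acceptable(b: BenefitResult) -> bool:
    # Assumption: the metric is oriented so that higher is better.
    return b.pooled_value >= b.acceptance_criterion


def risk_acceptable(r: RiskResult) -> bool:
    # Zero events: use the rule-of-three upper bound instead of a rate of 0.
    rate = r.events / r.n_uses if r.events > 0 else 3.0 / r.n_uses
    if rate > r.rmf_threshold:
        return False
    return r.sota_baseline is None or rate <= r.sota_baseline


def benefit_risk_acceptable(benefits, risks) -> bool:
    """Acceptable only if every benefit and every residual risk passes."""
    return all(benefit_acceptable(b) for b in benefits) and all(
        risk_acceptable(r) for r in risks
    )


# Hypothetical figures, for illustration only.
benefits = [BenefitResult("7GH", pooled_value=0.90, acceptance_criterion=0.85)]
risks = [RiskResult("R-DAG", events=0, n_uses=40_000,
                    rmf_threshold=0.0001, sota_baseline=None)]
print(benefit_risk_acceptable(benefits, risks))
```

A single failing benefit or risk makes the overall determination negative, which is why the per-benefit and per-risk results are worth reporting individually alongside the overall conclusion.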

This Clinical Evaluation Plan defines how the benefit-risk ratio of the device will be evaluated against the criteria above, including the review of the Post-market Surveillance Plan and Risk Management documentation. The benefit-risk conclusion itself is reported in the Clinical Evaluation Report (R-TF-015-003).

State of the art

Scope

The State of the Art document is established within the framework of the clinical evaluation of the device. It specifies the clinical background and current knowledge, and establishes the state of the art for current clinical practice and for AI-based medical device software in dermatology.

To establish the current state of the art in the corresponding medical field, the following aspects and information will be checked:

  • Applicable standards and guidance documents.
  • Information relating to the current situation in the medical field in which the device is used.
  • Benchmark devices and other devices available on the market.

The literature search must be based on a literature search protocol.

The literature search protocol used for the review of the state of the art, together with the results of this literature review, is presented in the separate document R-TF-015-011 State of the Art.

This separate document was created to avoid duplication as the current knowledge/state of the art should be present in both the clinical evaluation plan and report.

Accordingly, this section presents the literature search protocol used to identify clinical data on the device under evaluation and the legacy device.

Literature search

Literature search protocol

In accordance with Section A5 of MEDDEV 2.7/1 Rev 4, the objective of the literature search is expressed using the PICO methodology (Problem/patient/population, type of Intervention, Comparator, and relevant Outcomes).

The selection of relevant articles from references identified on the databases is based on the description of the research objective presented in the following sections.

  • Patient
    • Inclusion: Patients with visible skin structure abnormalities; skin diseases listed in ICD-11 Chapter 14; across all age groups, skin types, and demographics. Users: Healthcare Professionals (HCPs) such as dermatologists, General Practitioners (GPs) and IT professionals.
    • Exclusion (wrong type of population): animals; studies focused on non-dermatological pathologies.
  • Intervention/indicator
    • Inclusion: Use of a computational software-only medical device (SaMD) that processes images of skin structures to provide clinical data for aiding practitioners in skin assessments.
    • Exclusion: Interventions not related to the device's intended use or medical indication.
  • Comparator and type of studies
    • Inclusion: other smartphone applications (SkinVision, Molescope, Huvy, DERM); traditional methods of clinical skin examination without software assistance; non-software-based skin assessments by healthcare professionals (Standard of Care).
    • Type of studies: meta-analyses; literature and systematic reviews; case series and cohort studies; clinical studies (randomised or non-randomised, multicentric or not, prospective or retrospective); clinical guidelines and guidelines elaborated by scientific societies.
    • Exclusion (wrong comparator and studies):
      • Non-clinical comparators (e.g., comparison against another algorithm only).
      • Purely in silico or in vitro validation studies without clinical practice data.
      • Case reports that do not provide new information on risks or performance.
      • Non-peer-reviewed literature (e.g., opinion articles, blog posts).
      • Studies providing no clinical results (e.g. protocols).
  • Outcomes
    • Inclusion: Improved efficiency and accuracy in clinical decision-making for skin disease assessment or malignancy detection; support in diagnosis through interpretative data and quantification. Optimisation of clinical workflow through reduction of unnecessary referrals from primary care to dermatology; reduction of cumulative waiting time to see the dermatologist face-to-face. Safety data (e.g. incorrect performance, failure of interoperability, inputs without sufficient quality).
    • Exclusion (wrong objectives): non-clinical outcomes (e.g., technical algorithm testing); datasets not discussing the correct use, safety, performance, or benefits of the device; data focused only on drugs; overly specific topics (datasets on a particular subject deemed irrelevant to the state of the art).

Source of data and search description​

As per the MEDDEV 2.7/1 Rev. 4 guidance document, a comprehensive search strategy normally involves multiple scientific databases. The literature-search programme of this clinical evaluation operates on two parallel streams, in line with MDCG 2020-1 Pillar 1 (Valid Clinical Association), MDCG 2020-6 §6.3, and MEDDEV 2.7/1 Rev. 4 Annex A5:

  1. Stream A (State of the Art and comparator-device search): Performed and maintained in R-TF-015-011 State of the Art. This stream covers the full SotA literature on AI-assisted dermatological image analysis, comparator devices (SkinVision, Molescope, Huvy, DERM, Dermalyser and other CE-marked or FDA-cleared smartphone dermatology tools), standard-of-care dermatology, severity-scoring scales, and field-wide population-representativeness evidence. The Stream A search applies a structured PICO query set, PRISMA flow reporting, and CRIT1-7 appraisal across multiple databases: MEDLINE/PubMed, Embase (via Ovid), Cochrane Library, Google Scholar, ClinicalTrials.gov, FDA MAUDE, FDA Medical Device Recalls and EUDAMED. The full Stream A search protocol, queries, screening, results, and CRIT1-7 scores are documented in R-TF-015-011. This stream is the regulatory backbone of MDCG 2020-1 Pillar 1, of MDCG 2020-6 §6.3 SotA appraisal, and of MEDDEV 2.7/1 Rev. 4 Annex A5.
  2. Stream B (Manufacturer-specific search): Targeted search for clinical publications that report on the device under evaluation or on its legacy predecessor. Stream B is a complement to Stream A, not a substitute for it. Stream B applies the same PICO inclusion/exclusion criteria declared at the start of this section (Patient, Intervention/indicator, Comparator and type of studies, Outcomes) and the same three-stage screening (title screen → abstract screen → full-text screen) with PRISMA-style flow reporting. Stream B uses the manufacturer's product-brand and legal-entity identifiers as the keyword set; the dual-term query (product brand together with legal-entity name) is standard literature-search practice for capturing all manufacturer-associated publications, regardless of whether the authors affiliate with the brand or the legal entity. The literal keyword strings, search dates and records retrieved are recorded in R-TF-015-011 State of the Art, section "Manufacturer-specific search (Stream B): keyword register" — the reproducible reference any future re-run or audit should use; the table below in this CEP shows the generic methodology and the aggregate record counts.

Stream B operates on a narrower database set than Stream A: MEDLINE/PubMed, Google Scholar and ClinicalTrials.gov, rather than the full Stream A set, which additionally includes Embase via Ovid, Cochrane Library, FDA MAUDE, FDA Medical Device Recalls and EUDAMED. The rationale is that Stream B is a manufacturer-identifier search: its keyword set consists of the manufacturer's product-brand and legal-entity strings, and these identifier terms are indexed most completely and reproducibly in MEDLINE/PubMed (biomedical citation coverage), Google Scholar (broad indexing including European journals and grey literature) and ClinicalTrials.gov (device-specific registered investigations). Embase and Cochrane do not materially add sensitivity for manufacturer-identifier queries within the scope of a Class IIb decision-support MDSW: Cochrane is a review-synthesis database whose reviews are already retrieved via Stream A, and Embase coverage of manufacturer-identifier strings is redundant with MEDLINE/PubMed for this device's corpus. An Embase sensitivity check on the manufacturer-identifier keyword set was run at the 2025 refresh and is documented in R-TF-015-011 section "Stream B database-selection sensitivity check"; it returned no records beyond those already captured by Stream A and Stream B. FDA MAUDE, FDA Medical Device Recalls and EUDAMED are vigilance databases covered under Stream A and the "Vigilance databases" subsection below. Stream B databases are therefore:

  • MEDLINE PubMed: comprises over 30 million citations for biomedical articles from MEDLINE, PMC, life-science journals and online books; the most pertinent database for biomedical manufacturer-identifier searches.
  • Google Scholar: comprehensive coverage of clinical research publications, including those from European journals, with full-text search.
  • ClinicalTrials.gov: an online clinical-trials registry maintained by the U.S. National Library of Medicine, used to capture device-specific registered investigations.

Embase is a standing database in the Stream A protocol documented in R-TF-015-011, so Embase coverage is maintained at every Stream A refresh; Stream B relies on this overlap rather than on a separate Stream B Embase query. In addition, an Embase sensitivity check against the Stream B manufacturer-identifier query set is repeated at every refresh (the Stream B refresh cadence mirrors Stream A) to confirm that Embase returns no manufacturer-identifier records beyond those already captured across MEDLINE PubMed, Google Scholar and ClinicalTrials.gov; any additional records surfaced by a sensitivity check are appraised under the same PICO protocol and added to the Stream B appraisal register. This arrangement satisfies the multi-database breadth expectation of MEDDEV 2.7/1 Rev. 4 Annex A5 through the combined Stream A + Stream B coverage and is re-verified at every refresh.

Per MEDDEV 2.7/1 Rev 4, citation chasing is applied to publications identified through either stream: literature found to be relevant is likely to cite other literature of direct interest, which is searched and appraised under the same protocol.

Following the inclusion criteria described above, the Stream B (manufacturer-specific) searches of 15 July 2025 are tabulated below; the Stream A (SotA / comparator) searches are documented in R-TF-015-011. A supplementary Stream A targeted refresh (April 2026) is also documented in R-TF-015-011 and described in the CER's "Supplementary Literature Search: April 2026" section. Stream B is refreshed at the same cadence.

Note on use of manufacturer-identifier strings in the table below: MEDDEV 2.7/1 Rev 4 Annex A5 requires the literature-search protocol to be fully documented and reproducible. Because the Stream B query set is constructed from manufacturer identifiers, the literal product-brand and legal-entity name strings are reproduced verbatim in this section so that the protocol is self-sufficient at CEP level. This reproduces the literal strings already held in the identical register maintained in R-TF-015-011 section "Manufacturer-specific search (Stream B): keyword register" and is an explicit, scope-limited exception to the general QMS convention of avoiding manufacturer / product names in regulatory prose. The exception applies only to Stream B reproducibility.

ID | Database | Keywords/query | Filter | Records retrieved | Records included after PRISMA screening | Stream
01 | MEDLINE PubMed | "Legit.Health" | No limitations | 10 | See R-TF-015-011 Stream B PRISMA flow | Stream B (manufacturer-specific)
02 | Google Scholar | "Legit.Health" AND "AI Labs Group" | No limitations | 3 | See R-TF-015-011 Stream B PRISMA flow | Stream B (manufacturer-specific)
03 | ClinicalTrials.gov | "Legit.Health" | No limitations | 2 | See R-TF-015-011 Stream B PRISMA flow | Stream B (manufacturer-specific)
— | See R-TF-015-011 | PICO queries (Stream A) | See protocol | See R-TF-015-011 | See R-TF-015-011 | Stream A (SotA / comparator)

The Stream B search was conducted by Jordi Barrachina (Clinical Affairs Manager) on 15 July 2025 as described above and without deviation.

Refresh cadence​

The Stream A search is refreshed at every CER update (annual at minimum) and additionally whenever (a) a new condition or indication is added to the intended use, (b) a new SotA gap is identified through PMS/PMCF, or (c) a new comparator device receives CE marking or FDA clearance for an overlapping indication. The Stream B search is refreshed at the same cadence. Each refresh is documented as an additional search run with its own query table and PRISMA flow.

Duplicates were identified using the unique identifiers of each article (PMID, i.e. the PubMed Identifier, and DOI). For publications that have no unique identifier, duplicates were identified mainly by title, authors and source of the document.
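The duplicate-identification rule above can be sketched as follows. This is a minimal illustration, not the tooling actually used for the search: the `Record` class, its field names and the normalisation rule are assumptions introduced for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical bibliographic record; field names are illustrative."""
    title: str
    authors: str
    source: str
    pmid: Optional[str] = None
    doi: Optional[str] = None


def _norm(text: str) -> str:
    # Lowercase and collapse punctuation/whitespace so trivial variants match.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def dedup_keys(rec: Record) -> list:
    """Unique identifiers first; title/authors/source only as a fallback."""
    keys = []
    if rec.pmid:
        keys.append(("pmid", rec.pmid.strip()))
    if rec.doi:
        keys.append(("doi", rec.doi.strip().lower()))
    if not keys:
        keys.append(("meta", _norm(rec.title), _norm(rec.authors), _norm(rec.source)))
    return keys


def deduplicate(records: list) -> list:
    """Keep the first occurrence of each record, in input order."""
    seen = set()
    unique = []
    for rec in records:
        keys = dedup_keys(rec)
        if any(key in seen for key in keys):
            continue
        seen.update(keys)
        unique.append(rec)
    return unique
```

A record carrying only a PMID and a copy of the same article carrying only a DOI share no key in this sketch, so such pairs would still need a manual check.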

Articles can also be added manually if they are deemed relevant and consistent with the research objectives. These publications can be identified within the selected articles.

The protocol also provides for supplementary targeted literature searches to be conducted when acceptance criteria are being established for specific clinical metrics or when MDCG 2020-1 Pillar 1 (Valid Clinical Association) requires condition-specific evidence beyond the scope of the initial systematic review. Such supplementary searches are conducted in accordance with the same PICO methodology and inclusion/exclusion criteria as the initial protocol, documented as additional search runs with their own query tables, and appended to R-TF-015-011.

Vigilance databases​

Vigilance-database searches are performed against (a) the legacy predecessor device identifier and (b) the named comparator devices (SkinVision, Molescope, DERM, Dermalyser, Huvy). These searches cover FDA MAUDE, FDA Medical Device Recalls and EUDAMED and are carried out as part of the Stream A SotA and comparator search documented in R-TF-015-011. No additional device-specific vigilance entry exists for the device under evaluation because the device has not yet been commercialised; Stream A supplies the external safety-benchmarking evidence required by MDCG 2020-6 §6.3 until device-specific entries accrue post-launch.

Registries​

During the analysis of the registry reports for the state of the art presentation, specific clinical data relating to the device under evaluation was also sought.

Selection Methodology and Criteria​

A systematic and objective search and review protocol will be implemented to identify relevant clinical data. While this search is designed to be comprehensive, it is acknowledged that some pertinent studies may be inadvertently omitted due to limitations such as search language or publication indexing.

The screening process will be conducted in three sequential stages:

  • An initial review of article titles.
  • A secondary review of abstracts.
  • A final full-text assessment of the materials and methods.

Records are screened independently by two reviewers drawn from the clinical evaluation team; disagreements at any stage are resolved by discussion, with a third reviewer adjudicating any residual disagreement. A PRISMA 2020 flow diagram is produced for each stream (Stream A and Stream B) and retained in R-TF-015-011. Records excluded at each stage are retained with their exclusion reason in the appraisal register as required by MDCG 2020-13 Section D.

At each stage, publications will be included or excluded based on the predefined PICO criteria (detailed in section Literature Search Protocol). The complete selection process will be documented within the clinical evaluation report to ensure full traceability and reproducibility.
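The three-stage, dual-reviewer screening described above can be sketched as a small bookkeeping routine that also yields the counts needed for a PRISMA-style flow. This is an illustration only: the function name, the data layout of `decisions` and `adjudications`, and the counter labels are assumptions, and the reviewer votes are taken to be those remaining after discussion.

```python
from collections import Counter

STAGES = ("title", "abstract", "full_text")


def screen(record_ids, decisions, adjudications):
    """Run the sequential screening stages.

    decisions[stage][record_id] = (reviewer_a_include, reviewer_b_include, reason)
    adjudications[(stage, record_id)] = third-reviewer decision, consulted only
    when the two reviewers still disagree.
    Returns (included_ids, excluded, flow_counts).
    """
    flow = Counter(identified=len(record_ids))
    excluded = {}  # record_id -> (stage, exclusion reason), kept for the register
    remaining = list(record_ids)
    for stage in STAGES:
        kept = []
        for rid in remaining:
            vote_a, vote_b, reason = decisions[stage][rid]
            include = vote_a if vote_a == vote_b else adjudications[(stage, rid)]
            if include:
                kept.append(rid)
            else:
                excluded[rid] = (stage, reason)
                flow[f"excluded_{stage}"] += 1
        remaining = kept
    flow["included"] = len(remaining)
    return remaining, excluded, dict(flow)
```

Every excluded record keeps its stage and reason, which mirrors the requirement that exclusions be retained in the appraisal register.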

Literature appraisal data​

Appraisal plan​

The relevance and weight of all clinical data will be critically assessed to determine their contribution to the evaluation of the device's clinical performance and safety. The appraisal covers two distinct data streams, each with its own methodology:

  • State of the Art (SotA) literature: Appraised using the unified CRIT1-7 framework described below, which scores each publication for methodological quality and clinical relevance relative to the device's intended use context. This framework is based on IMDRF MDCE WG/N56FINAL:2019 and applies to the heterogeneous SotA corpus where relevance against external comparators must be established.
  • Manufacturer's clinical investigations and published manuscripts: Appraised using design-specific validated tools (QUADAS-2 for diagnostic accuracy studies; MINORS for clinical utility, MRMC, and published severity validation studies), as described in the section "Planned appraisal methodology for clinical investigations and published manuscripts" within the Clinical Development Plan. These data sets do not require external relevance scoring because they were designed specifically for the device under evaluation; they are instead assessed for internal methodological quality in accordance with MEDDEV 2.7/1 Rev. 4 Section 9 and MDCG 2020-6 §6.3.

The entire process is conducted in adherence to the manufacturer's pre-established appraisal plan and the methodology outlined in the MEDDEV 2.7/1 rev. 4 guidance.

Appraisal and weighting criteria for the State of the Art literature (CRIT1-7)​

As set out in MEDDEV 2.7/1 Rev. 4, uncertainty arises from two sources: the methodological quality of the data, and the relevance of the data to the evaluation. Consequently, datasets identified in the SotA literature search have been appraised using the criteria in the table below. These criteria are based on the IMDRF MDCE WG/N56FINAL:2019 (formerly GHTF/SG5/N2R8:2007): Clinical Evaluation.

ID | Criterion | Description | Grading system | Grading criteria | Score
CRIT1 | Study Focus | Do the data relate to a relevant clinical alternative? | Direct Relevance | Data on a similar device (e.g., devices tagged as similar) OR on the standard clinical practice (e.g., accuracy of HCPs, visual inspection in Primary Care). | 2
CRIT1 | Study Focus | Do the data relate to a relevant clinical alternative? | Contextual Relevance | Contextual data (e.g., disease epidemiology, general clinical guidelines) but not on the performance of a specific alternative OR Clinical data including a similar device but which is not specific. | 1
CRIT1 | Study Focus | Do the data relate to a relevant clinical alternative? | No Relevance | Data not related to any clinical alternative in dermatology. | 0
CRIT2 | Clinical Setting or Intended use | Does the study's setting and intended use match the device under evaluation? | Full match | Data focused on devices designed to support healthcare practitioners in the assessment of skin structures OR Same setting (e.g., Primary Care and/or Dermatology clinic). | 2
CRIT2 | Clinical Setting or Intended use | Does the study's setting and intended use match the device under evaluation? | Partial match | Data focused on devices with an intended use not claimed by the manufacturer, but compliant with the intended use of the device group OR Same setting but for a different intended use (e.g., melanoma detection only). | 1
CRIT2 | Clinical Setting or Intended use | Does the study's setting and intended use match the device under evaluation? | No match | Data focused on devices with an intended use not related to the device under evaluation OR Different clinical setting (e.g., specialities different from dermatology). | 0
CRIT3 | Population of patients | Is the study population representative? | Applicable | Target population as per the device's intended use (e.g., patients attending a dermatological consultation across all age groups, skin types, and demographics). | 2
CRIT3 | Population of patients | Is the study population representative? | Partially applicable | Specific sub-population of the target population (e.g., only high-risk patients, only a specific skin phototype, only a pathology). | 1
CRIT3 | Population of patients | Is the study population representative? | Not applicable | Population not related to the target population (e.g., healthy volunteers) or non-relevant or contraindicated population. | 0
CRIT4 | Type of dataset | Appropriate study design/type of document and sufficient data | Yes | Studies with a level of evidence greater than or equal to 4 (as per Level of Evidence scale). | 1
CRIT4 | Type of dataset | Appropriate study design/type of document and sufficient data | No | Studies with a level of evidence lower than 4 (e.g., expert opinions, small case series) OR insufficient data to extract relevant clinical performance or safety information. | 0
CRIT5 | Outcome measurement (Performance/Safety) | Does the study measure objective outcomes related to performance (e.g., diagnostic accuracy) and/or safety (e.g., false negative rate)? | Yes | Provides quantitative performance data (e.g., Sensitivity, Specificity, PPV) and/or safety data (e.g., rate of unnecessary biopsies, false negatives). | 1
CRIT5 | Outcome measurement (Performance/Safety) | Does the study measure objective outcomes related to performance (e.g., diagnostic accuracy) and/or safety (e.g., false negative rate)? | No | Does not provide performance or safety data (e.g., descriptive only). | 0
CRIT6 | Clinical significance | Does the study evaluate if the performance results in a tangible clinical benefit (e.g., reduction in unnecessary biopsies, improved early detection)? | Yes | Provides clinical benefit data (e.g., impact on referral pathways, reduction of benign biopsies) or workflow benefits. | 1
CRIT6 | Clinical significance | Does the study measure clinical significance (e.g., impact on patient management, health outcomes)? | No | Does not provide clinical benefit data (reports pure performance metrics only or descriptive). | 0
CRIT7 | Statistical analysis | Is there a statistical analysis? | Yes | Statistical comparisons are made (e.g., between groups, p-values, confidence intervals). | 1
CRIT7 | Statistical analysis | Is there a statistical analysis? | No | No statistical comparison (descriptive data only). | 0

All included datasets are appraised for their methodological quality and scientific validity (CRIT3 + CRIT4 + CRIT7, scored 0-4) and clinical relevance (CRIT1 + CRIT2 + CRIT5 + CRIT6, scored 0-6). The weight of each dataset is measured by the total score obtained (0-10).
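The arithmetic behind the two sub-scores and the total can be sketched as below. This is an illustrative aid rather than the appraisal tool itself: the function name `appraise` and the dictionary layout are assumptions; the per-criterion maxima and the grouping of criteria are taken from the table and the paragraph above.

```python
# Maximum score per criterion, as listed in the CRIT1-7 table.
MAX_SCORES = {
    "CRIT1": 2, "CRIT2": 2, "CRIT3": 2,
    "CRIT4": 1, "CRIT5": 1, "CRIT6": 1, "CRIT7": 1,
}
QUALITY = ("CRIT3", "CRIT4", "CRIT7")             # methodological quality, 0-4
RELEVANCE = ("CRIT1", "CRIT2", "CRIT5", "CRIT6")  # clinical relevance, 0-6


def appraise(scores: dict) -> dict:
    """Validate one publication's criterion scores and compute its weight."""
    for crit, max_score in MAX_SCORES.items():
        if crit not in scores:
            raise ValueError(f"missing score for {crit}")
        if not 0 <= scores[crit] <= max_score:
            raise ValueError(f"{crit} must be between 0 and {max_score}")
    quality = sum(scores[c] for c in QUALITY)
    relevance = sum(scores[c] for c in RELEVANCE)
    return {"quality": quality, "relevance": relevance, "total": quality + relevance}
```

For example, a publication scored CRIT1=2, CRIT2=1, CRIT3=1, CRIT4=1, CRIT5=1, CRIT6=0, CRIT7=1 has a quality score of 3, a relevance score of 4 and a total weight of 7.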

Inclusion threshold and rationale​

Articles with a total CRIT1-7 score strictly greater than 4 (that is, 5 or more out of 10) are included in the State of the Art evaluation (R-TF-015-011 State of the Art), where the per-publication appraisal results are also documented. The 4-out-of-10 threshold is a deliberate calibration point: it requires an included publication to demonstrate, in aggregate, methodological adequacy and topical relevance to the device's intended use, without being so restrictive that legitimate field-defining publications with one weak criterion (for example, a study with sound design and statistics but lower clinical-significance reporting, or a recent guideline summary with limited statistical content) are mechanically excluded.

A sensitivity analysis is performed at thresholds of 3, 4 (primary), and 5 to confirm that the conclusions of the SotA evaluation are robust to small changes in the threshold. The sensitivity analysis has been performed and documented in R-TF-015-011, and its results confirm that the SotA conclusions are unchanged at thresholds 3 and 5 relative to the primary threshold of 4. Where a publication scores ≤ 4 but is judged essential to the SotA narrative on substantive grounds (for example, a regulatory guideline document that does not generate quantitative outcomes but defines the standard of care), the publication is included with an explicit per-publication justification recorded in the appraisal table. The exception is bounded by the following rule: at any given refresh of R-TF-015-011, no more than five publications may be carried below threshold; each such inclusion is flagged in the appraisal table with its own justification, the total count is reported in the CER alongside the threshold sensitivity analysis, and any below-threshold inclusion is re-reviewed at the next annual CER update.
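The inclusion rule, the bounded below-threshold exception and the threshold sensitivity check can be sketched as follows. The function names and publication identifiers are hypothetical, and the sketch only compares which publications are included at each threshold; whether the SotA conclusions change remains a matter of appraisal judgement documented in R-TF-015-011.

```python
PRIMARY_THRESHOLD = 4
SENSITIVITY_THRESHOLDS = (3, 4, 5)
MAX_BELOW_THRESHOLD = 5  # cap on justified below-threshold inclusions per refresh


def select(totals: dict, threshold: int = PRIMARY_THRESHOLD, justified=frozenset()) -> set:
    """Return the publication IDs included at the given threshold.

    totals maps publication ID -> total CRIT1-7 score (0-10).
    justified lists IDs carried below threshold with a recorded justification.
    """
    above = {pid for pid, total in totals.items() if total > threshold}
    exceptions = {pid for pid in justified if pid in totals and totals[pid] <= threshold}
    if len(exceptions) > MAX_BELOW_THRESHOLD:
        raise ValueError("more than five below-threshold inclusions")
    return above | exceptions


def sensitivity(totals: dict) -> dict:
    """Included set at each sensitivity threshold, without exceptions."""
    return {t: sorted(select(totals, t)) for t in SENSITIVITY_THRESHOLDS}
```

Publications that appear at threshold 3 but not at 4, or at 4 but not at 5, are the ones whose contribution to the conclusions would need to be examined in the sensitivity analysis.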

Clinical Development Plan​

The Clinical Development Plan (CDP) indicates progression from exploratory investigations, such as first-in-man studies and pilot studies, to confirmatory investigations, such as pivotal clinical investigations and a PMCF with an indication of milestones and a description of potential acceptance criteria.

Purpose​

To establish the roadmap of clinical evidence required for the device, ensuring compliance with the General Safety and Performance Requirements (GSPR).

Phased progression of the clinical evaluation​

The evidence strategy adopted by this plan follows MDR Annex XIV Part A §1(a) ("clinical development plan indicating progression from exploratory investigations, such as first-in-man studies, feasibility and pilot studies, to confirmatory investigations, such as pivotal clinical investigations, and a PMCF") and organises every source into one of six populated phases (Phases 0 and 2 to 6; Phase 1 is intentionally not populated, as explained below):

  • Phase 0 captures the non-clinical prerequisites that enable the three-pillar framework: the systematic state-of-the-art review (R-TF-015-011) anchors the Pillar 1 Valid Clinical Association; the AI model verification-and-validation (R-TF-028-005) and software V&V (R-TF-012-038) anchor the Pillar 2 Technical Performance claims; usability engineering (R-TF-025-004/005/006/007) discharges GSPR 5 and IEC 62366-1 (a prerequisite to the validity of the confirmatory investigations, not a pillar contribution in itself).
  • Phase 2 covers the pre-market confirmatory clinical investigations (prospective real-patient pivotal studies) that constitute the primary Pillar 3 Clinical Performance evidence.
  • Phase 3 covers the MRMC simulated-use reader studies that contribute Rank 11 Pillar 3 §4.4 supporting evidence.
  • Phase 4 captures the equivalence route by which the legacy predecessor's pivotal, passive PMS and post-market observational data (R-TF-015-012) enter the device's clinical evaluation under MDR Article 61(5)-(6) and MDCG 2020-5.
  • Phase 5 is the post-market clinical follow-up programme (R-TF-007-002), which confirms, rather than fills, the pre-market evidence base per MDCG 2020-6 §6.4.
  • Phase 6 captures the peer-reviewed publications that contribute Pillar 2 (APASI_2025, AUAS_2023, AIHS4_2023, ASCORAD_2022: severity-algorithm validation against expert consensus) and Pillar 3 (NMSC_2025: malignancy detection in a specialist clinic population) evidence in support of the three-pillar causal chain.

Phase 1 (exploratory / first-in-man / feasibility) is intentionally not populated for this device: exploratory clinical evaluation is discharged via (i) the Phase 0 non-clinical verification-and-validation of the AI model against the curated labelled image database (R-TF-028-005), (ii) the Phase 6 peer-reviewed severity-algorithm validation literature on the same algorithm family, and (iii) the Phase 4 equivalence-derived clinical experience of the legacy predecessor — consistent with MDCG 2020-1 §4.3/§4.4 and MDR Article 61(6)(b) on the use of pre-existing clinical data for software devices whose exploratory phase is adequately addressed by analytical and equivalence routes.

The Phase 3 MRMC studies are explicitly positioned as supporting Pillar 3 evidence layered onto the Phase 2 primary Pillar 3 evidence, consistent with MDCG 2020-1 §4.4. The Phase 1 (exploratory / first-in-man) rationale for non-population is given in the preceding paragraph.

Legend note: the Phase grouping in the diagram above denotes evidence genre and MDCG 2020-1 pillar mapping; the MDCG 2020-6 Appendix III evidence rank is an independent, orthogonal axis that is NOT encoded in the diagram's node colours. Rank is stated per study in the "Planned evidence classification per study" table below. A node's Phase (and colour) should therefore not be read as implying a specific rank — for example, a Phase 2 node may be Rank 2 or Rank 4, and a Phase 4 node may be Rank 5, Rank 7 or Rank 8. Pillar, Phase and Rank are three independent dimensions and the table below is authoritative for each source's per-dimension classification.

Current State of the Evidence​

Concise summary of available evidence justifying the initiation of the CDP phases.

Non-clinical test results: bench testing​

The technical performance evaluation aims to demonstrate the MDSW's ability to accurately, reliably and precisely generate the intended output (assessment of all diseases of the skin incorporating conditions affecting the epidermis, its appendages (hair, hair follicle, sebaceous glands, apocrine sweat gland apparatus, eccrine sweat gland apparatus and nails) and associated mucous membranes (conjunctival, oral and genital), the dermis, the cutaneous vasculature and the subcutaneous tissue (subcutis)) from the input data (images of visible skin structure abnormalities).

The clinical evaluation will take into consideration the relevant pre-clinical and performance data listed below.

Software Design Verification​

Design Verification and Validation confirm the device's compliance with Design Requirements under real and simulated conditions in accordance with the principles of IEC 62304:2006 + Amd 1:2015 (Medical device software: Software life cycle processes) and IEC 82304-1:2016 (Health software: Part 1: General requirements for product safety). The complete results and traceability of all Verification activities are documented in the Report, reference R-TF-012-038.

AI Model Validation and Testing​

The Artificial Intelligence (AI) model, which constitutes the core component of the device, has undergone a dedicated validation and testing process to ensure its accuracy, robustness, reliability, and cybersecurity in line with its intended medical purpose.

This validation was performed in alignment with state-of-the-art principles for AI/ML medical software. The principal regulatory basis for the conformity of the device is the Medical Device Regulation (EU) 2017/745. We additionally note voluntary alignment with the principles set out in Regulation (EU) 2024/1689 (Artificial Intelligence Act) for high-risk AI systems, where applicable to the device's lifecycle, in preparation for the AI Act's staged transitional periods; this voluntary alignment does not constitute the conformity basis for CE marking under MDR. The process adheres to the software lifecycle requirements of IEC 62304:2006 + Amd 1:2015 and is further guided by the principles of Good Machine Learning Practice (GMLP).

The validation process included a rigorous assessment of the training, validation, and testing datasets; performance testing on an independent (held-out) test dataset to evaluate key metrics (e.g., sensitivity, specificity, accuracy); and an analysis of the model's generalisability and robustness against potential biases. The complete AI Model Validation and Testing protocol, results, and traceability are documented in the R-TF-028-005 AI Development Report.

Usability Engineering​

The formative evaluation of the device's usability has been completed in accordance with the principles of IEC 62366-1:2015 (Application of usability engineering to medical devices) and is documented in the Formative Evaluation Report (R-TF-012-008).

The summative usability validation was conducted in October 2025 as a standalone human factors validation study in accordance with IEC 62366-1:2015 §5.9 and the FDA Final Guidance on Applying Human Factors and Usability Engineering to Medical Devices (February 2016). The decision to conduct the summative evaluation as an independent study rather than as an integral part of a clinical investigation was taken to ensure that usability endpoints were evaluated under controlled conditions with dedicated test scenarios for each intended user group (HCP and ITP), without confounding by clinical investigation variables. This approach is consistent with IEC 62366-1:2015 §5.9, which does not require summative evaluation to be embedded within a clinical investigation. The summative evaluation protocol, observation forms, questionnaires, and results are documented in R-TF-025-004, R-TF-025-005, R-TF-025-006, and R-TF-025-007 respectively.

The evidence gathered from this complete usability engineering process has been incorporated into the Clinical Evaluation Report. These data demonstrate that risks associated with use error have been minimised and that the device can be used safely and effectively by the intended users, thereby supporting compliance with the relevant General Safety and Performance Requirements (GSPRs).

Existing clinical data​

The systematic literature review, documented in the State of the Art report (R-TF-015-011 State of the Art), concluded that while existing data supports the clinical background and the general state of the art, it provides insufficient direct evidence to fully confirm the clinical performance, safety, and benefits of the device itself. A gap was identified in demonstrating the device's performance in a real-world clinical environment with its intended user population.

To address this evidence gap and to generate the necessary clinical data, pivotal investigations were designed and conducted. The primary objective of these studies was to definitively evaluate the device's clinical safety and performance and to provide robust evidence supporting its intended clinical benefits, thereby demonstrating conformity with the relevant General Safety and Performance Requirements (GSPRs).

Accordingly, the following pivotal investigations have been conducted (see table in Confirmatory phase below).

The pre-market pivotal investigations address the controlled-study dimension of the identified gap. The routine-clinical-practice dimension is addressed through the equivalent legacy device's post-market surveillance programme, specifically via a protocolled cross-sectional observational study conducted under MDR Article 83 (R-TF-015-012), whose Protocol is nested inside the legacy umbrella PMS Plan (R-TF-007-005). This study is the instrument by which the plan closes the real-world-use gap acknowledged above, consistent with MDCG 2020-6 §6.2.2 (use of post-market data from an equivalent legacy device as clinical evidence) and MDCG 2020-6 §6.5.e (gap-bridging through "clinically relevant scientifically sound questionnaires"). Its quantitative outcomes and Likert professional-opinion items are both classified at Rank 8 by primary classification per MDCG 2020-6 Appendix III (proactive PMS data), with a supplementary case for Rank 4 classification of the quantitative endpoints presented in "Summary of the Combined Strategy". The study's results are consolidated in the legacy umbrella PMS Report (R-TF-007-003) and presented in the CER (R-TF-015-003) as post-market clinical evidence per §6.2.2.

Clinical evidence assessment strategy​

This section describes the planned regulatory framework, evidence quality hierarchy, and assessment methodology for evaluating the clinical evidence portfolio. It establishes the basis for the study design rationale and the expected evidence classification of each pivotal investigation.

Regulatory framework and applicable guidance​

Per MDR Article 61(1), the manufacturer shall specify and justify the level of clinical evidence necessary to demonstrate conformity with the relevant general safety and performance requirements (GSPRs). That level of clinical evidence shall be appropriate in view of the characteristics of the device and its intended purpose.

Regulatory pathway for the device and Article 120 status of the legacy device​

The clinical evaluation of the device is conducted under the following framing, confirmed as applicable at the date of this plan:

  • The device is a new Class IIb MDR device undergoing first CE-marking under Regulation (EU) 2017/745. Article 120 does not apply to the device itself, as the device has never held an MDD certificate.
  • The legacy predecessor device (CE-marked Class I under MDD since 2020) remains on the market under MDR Article 120(3) transition provisions. The legacy predecessor has undergone no significant changes since MDD CE-marking; a standing significant-change assessment per MDCG 2020-3 is held as a QMS record.
  • All differences between the device and the legacy predecessor are device-only deployments — no changes have been pushed to the legacy predecessor. The device diverges from a fixed legacy reference point by a specifiable set of additions (see the equivalence section).
  • Clinical data from the legacy predecessor — including the pivotal investigations conducted with the legacy predecessor, its passive post-market surveillance corpus, and the post-market observational study R-TF-015-012 — enters the device's clinical evaluation via the MDCG 2020-5 equivalence framework and MDR Article 61(5)-(6). The equivalence demonstration is the load-bearing route for this transfer.

MDR Article 83 framing for the legacy predecessor's post-market observational study

The post-market observational study R-TF-015-012 is executed under the manufacturer's standing MDR Article 83 post-market surveillance obligation applicable to the legacy predecessor via MDR Article 120(3). The study protocol, dated 7 November 2025, is part of the legacy predecessor's 2025–2026 Article 83 post-market surveillance cycle; it is scoped in the legacy umbrella PMS Plan (R-TF-007-005) and is executed on its pre-declared schedule, with outcomes consolidated in the legacy umbrella PMS Report (R-TF-007-003). The protocol's pre-specified endpoints, MCIDs derived from published SotA, Holm-Bonferroni multiplicity correction for three co-primary endpoints, pre-specified data-source sensitivity analysis, and integrated safety-data collection reflect the manufacturer's ongoing proactive post-market surveillance obligation under Article 83. The study's outcomes enter the device's clinical evaluation via MDCG 2020-5 equivalence and MDR Article 61(5)-(6) at Rank 8 (primary classification, applied to both quantitative endpoints and Likert professional-opinion items) per MDCG 2020-6 Appendix III. A supplementary case for Rank 4 classification of the quantitative endpoints, under the Appendix III "high quality surveys may also fall into this category" note, is presented in "Summary of the Combined Strategy" but does not alter the Pillar 3 sufficiency determination.
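
The Holm-Bonferroni step-down procedure named above can be sketched as follows for three co-primary endpoints. This is a minimal illustration and not the study's statistical analysis code: the p-values and the family-wise alpha of 0.05 are hypothetical assumptions chosen for the example, and the function name `holm_bonferroni` is introduced here.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where the null is rejected.

    Step-down rule: sort p-values ascending and compare the k-th smallest
    (k = 0, 1, ...) against alpha / (m - k). Stop at the first failure;
    every hypothesis from that point on is retained.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break
    return reject


# Hypothetical p-values for three co-primary endpoints (illustrative only).
all_pass = holm_bonferroni([0.010, 0.040, 0.020])
one_pass = holm_bonferroni([0.030, 0.010, 0.040])
print(all_pass)  # [True, True, True]
print(one_pass)  # [False, True, False]
```

In the second call the smallest p-value clears its threshold of 0.05 / 3, but the next one (0.030) exceeds 0.05 / 2, so testing stops and the remaining two endpoints are not declared significant.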

Guidance framework applied

The clinical evaluation follows a combined methodology drawing on the following guidance documents:

  • MDCG 2020-1 (Guidance on clinical evaluation of medical device software) — primary guidance for the clinical evaluation of the device, defining the three-pillar evidence framework (Valid Clinical Association, Technical Performance, Clinical Performance) specifically applicable to medical device software (MDSW) and harmonised with IMDRF/SaMD WG/N41FINAL:2017.
  • MEDDEV 2.7/1 Rev 4 (Clinical evaluation: a guide for manufacturers and notified bodies): process template for the clinical evaluation, structured as Stages 0 through 4, from scoping (Stage 0) through identification of pertinent data (Stage 1), appraisal of pertinent data (Stage 2), and analysis of clinical data (Stage 3), to the ongoing clinical evaluation through post-market surveillance (Stage 4). Sections 6.4, 8, 9, 10, and Annexes A3, A4, A5, A6, A7.2, A7.3, A7.4, and A10 remain applicable under MDR per MDCG 2020-6 Appendix I. Where MEDDEV Section 10 references MDD Essential Requirements, these are substituted with the corresponding MDR GSPRs.
  • MDCG 2020-5 (Clinical evaluation — equivalence) — equivalence framework for the demonstration of equivalence between the device and the legacy predecessor, carried out against technical, biological and clinical characteristics per §A2.1, and used together with MDR Article 61(5)-(6) to bring the legacy predecessor's clinical data into scope for the device.
  • MDCG 2020-6 (Clinical evidence needed for medical devices previously CE marked under Directives 93/42/EEC or 90/385/EEC) — referenced for (i) its Appendix III evidence-rank hierarchy used to tier every source in the evidence portfolio, and (ii) the appraisal of the legacy predecessor's post-market surveillance data within the equivalence context. MDCG 2020-6 is not the primary guidance for the device's clinical evaluation, because the device is a new MDR device rather than a legacy device transitioning to MDR.
  • MDCG 2020-3 (Significant change assessment) — referenced only in the context of the legacy predecessor's Article 120(3) status (standing QMS record confirming no significant changes to the legacy predecessor). MDCG 2020-3 does not apply to the classification of the device, because the device is a new device rather than a reclassified legacy.

Evidence quality hierarchy (MDCG 2020-6 Appendix III)

MDCG 2020-6 Appendix III establishes a 12-level hierarchy of clinical evidence, ranked from strongest (Rank 1) to weakest (Rank 12). For this Class IIb MDSW, sufficiency of clinical evidence is determined per MDCG 2020-6 §6.4 by combined weight-of-evidence closure of each MDCG 2020-1 three-pillar requirement (Pillar 1 Valid Clinical Association, Pillar 2 Technical / Analytical Performance, Pillar 3 Clinical Performance), with evidence appropriate to the risk of the device and its intended purpose per MDR Article 61(1). No single minimum-rank threshold applies to Class IIb MDSW under this framework; sufficiency is the cumulative achievement of the three-pillar closure across the full evidence portfolio. For the device under evaluation — a Class IIb MDSW with a melanoma indication (ICD-11 2C30) that drives clinical management — the manufacturer additionally adopts Rank 4 as an internal floor for pivotal real-patient evidence on the malignancy indication, driven by the IMDRF 5.2.1 melanoma Critical cell and by the GSPR 1 / 8 / 17 risk profile for the malignancy sub-benefit 7GH(c). This internal floor is a self-imposed engineering commitment, not the regulatory threshold. The clinical evidence portfolio is planned to include evidence at the following ranks:

  • Rank 2 (high-quality clinical investigations with some gaps): prospective studies conducted in real clinical settings with protocol-driven methodology, ethics committee approval, and formal clinical investigation plans, where the study itself is methodologically sound but does not cover all indications or populations. Gaps must be justified / addressed with other evidence in line with an appropriate risk assessment, and clinical safety, performance, benefit and device claims. Assuming the gaps can be justified, there should be an appropriate PMCF plan to address residual risks.
  • Rank 4 (studies with methodological limitations but data still quantifiable): applies to pre-market investigations with design constraints that limit the evaluability of one or more endpoints, but where quantitative performance data remains extractable and clinically meaningful (retrospective arm of IDEI_2023, DAO_Derivación_PH_2022, NMSC_2025). The quantitative endpoints of the legacy-device post-market observational study R-TF-015-012 are carried at Rank 8 by primary classification (see Rank 8 bullet below); a supplementary Rank 4 case for those quantitative endpoints, under the Appendix III "high quality surveys may also fall into this category" note, is presented in "Summary of the Combined Strategy" and does not alter the Pillar 3 sufficiency determination.
  • Rank 7 (complaints and vigilance data; curated quality management system data): the legacy device's passive post-market surveillance data, consolidated in the legacy umbrella PMS Report (R-TF-007-003) — the Report paired with the legacy umbrella PMS Plan (R-TF-007-005) — which constitutes clinical data per MDR Article 2(48). Its contribution is Safety confirmation cross-cutting the MDCG 2020-1 pillars rather than primary clinical-performance evidence; Rank and Pillar are orthogonal and are mapped separately in the per-study classification table below.
  • Rank 8 (proactive PMS data, such as surveys and professional opinion): the post-market cross-sectional observational study of the equivalent legacy device (R-TF-015-012) is classified at Rank 8 as its primary classification, applied to both the quantitative endpoints and the Likert professional-opinion items (questionnaire items B1, B3, B5, C1-C3, D1, D3, D5, E1, F3 captured across 21 client institutions). Rank 8 is the conservative reading of the study's cross-sectional, physician-recall design under Appendix III. A supplementary case for Rank 4 classification of the quantitative endpoints is presented in "Summary of the Combined Strategy" but the primary classification is Rank 8. The study is complementary to — not a substitute for — the pre-market Pillar 3 evidence from Route C, consistent with MDCG 2020-6 §6.2.2.
  • Rank 11 (simulated use testing with healthcare professionals): multi-reader multi-case (MRMC) studies using clinical images in a simulated assessment environment. These are not "clinical data" under the strict MDR Article 2(48) definition (no live-patient data collection) and therefore sit below the prospective clinical studies in the evidence hierarchy. Per MDCG 2020-1 §4.4 they contribute to Pillar 3 Clinical Performance because they demonstrate that intended users achieve clinically relevant outputs on images representative of the intended patient population.

Three-pillar evidence framework for MDSW (MDCG 2020-1)

MDCG 2020-1 establishes three evidence pillars that must be addressed for medical device software. These pillars are parallel requirements, each satisfied by different types of evidence:

  • Valid Clinical Association (VCA): The software's output correlates with a real clinical condition, accepted by the medical community and described in peer-reviewed literature. Each specific claimed output requires separate VCA establishment. Planned source: systematic literature review (R-TF-015-011 State of the Art), reinforced by the targeted surrogate-endpoint validity anchoring layer documented in R-TF-015-011 §"Surrogate endpoint validity", which establishes that the three surrogate-endpoint families underlying benefits 7GH, 5RB and 3KX are accepted proxies for patient-relevant outcomes in peer-reviewed dermatology and regulator-accepted clinical-endpoint history.
  • Technical Performance: The software reliably and accurately generates its intended outputs from its inputs, across the full range of real-world input variability. Pillar 2 carries the device's stand-alone analytical claim: the device correctly classifies across all 346 ICD-11 dermatological categories at its stand-alone output, measured on curated input-output pairs without a clinician in the loop. This is an analytical-performance claim about the device's stand-alone output, distinct from the clinician-in-the-loop Pillar 3 claim measured on the Top-5 prioritised differential view. Planned source: AI model verification and validation documented in R-TF-028-005 AI Development Report; algorithm validation against the manufacturer's curated labelled image database (the proprietary dermatology training and test dataset, documented in R-TF-028-003 Data Collection Instructions); and the published severity validation literature (APASI_2025, AUAS_2023, AIHS4_2023, ASCORAD_2022), which assesses the device's severity scoring outputs against independent expert dermatologist consensus.
  • Clinical Performance: The software produces clinically relevant outputs when used in the real intended-use context by the intended users on the intended patient population. Pillar 3 is measured on the device's mandated integration outputs — the Top-5 prioritised differential view, the malignancy-prioritisation gauge and the referral recommendation — not on the device's 346-category stand-alone output. The device is validated only when integrated against these mandated user-interface-presentation requirements (see "Integrator integration requirements as risk controls" under Intended clinical benefits). Planned source: prospective clinical studies conducted in real clinical settings (MC_EVCDAO_2019, IDEI_2023, COVIDX_EVCDAO_2022, DAO_Derivación_O_2022, DAO_Derivación_PH_2022); the post-market cross-sectional observational study of the equivalent legacy device (R-TF-015-012) contributing routine-practice clinical-performance evidence per MDCG 2020-6 §6.2.2; the legacy device's passive PMS data; and the MRMC simulated-use reader studies (BI_2024, PH_2024, SAN_2024, MAN_2025), which provide supporting Pillar 3 evidence at Rank 11 by demonstrating that intended users achieve clinically relevant outputs on images representative of the intended patient population (MDCG 2020-1 §4.4).

The MRMC studies are not "clinical data" under the strict MDR Article 2(48) definition (no live-patient data collection) and therefore sit at a lower evidence rank than the prospective real-patient studies. They nevertheless contribute to Pillar 3 Clinical Performance because what they measure — intended users (HCPs) achieving clinically relevant outputs when using the device — matches the MDCG 2020-1 §4.4 definition of clinical performance. The prospective studies, the R-TF-015-012 post-market observational study, and the legacy PMS data provide the primary Pillar 3 evidence; the MRMC studies provide supporting Pillar 3 evidence.

Planned tiered evidence assessment

The clinical evaluation adopts a risk-proportionate, tiered evidence structure for the analysis of clinical data (MEDDEV Stage 3):

  • Tier 1 (Malignant conditions, individual analysis): The clinical consequence of misclassification is highest — delayed cancer diagnosis can lead to disease progression and mortality. Performance is assessed with individual acceptance criteria per condition or condition group. Primary evidence: MC_EVCDAO_2019 (melanoma), plus malignancy prediction endpoints across further studies, such as IDEI_2023.
  • Tier 2 (Rare diseases, grouped analysis): Rare diseases are frequently misdiagnosed; delayed diagnosis leads to prolonged suffering and inappropriate treatment. Performance for the rare-disease subgroup is framed as a surfacing claim — the device's pre-market rare-disease Pillar 3 evidence supports the presence of the correct low-prevalence ICD-11 category in the Top-5 prioritised differential view (rather than a Top-1 standalone-accuracy claim). This scoping keeps the CE-marking claim within what the pre-market evidence base can genuinely support. The evidence base for this tier is layered, with Rank appropriate to each contribution: MDCG 2020-1 Pillar 1 literature evidence (Valid Clinical Association for AI-assisted image-based recognition of rare dermatoses); MDCG 2020-1 Pillar 2 algorithm performance metrics extracted from the curated labelling dataset for rare-disease presentations; Rank 7 legacy post-market surveillance data for rare-disease presentations via MDCG 2020-5 equivalence; the BI_2024 and PH_2024 MRMC simulated-use reader studies contributing Rank 11 Pillar 3 §4.4 supporting evidence — demonstrating that intended users (HCPs) achieve clinically relevant outputs on images representative of the rare-disease subgroup, while explicitly not being "clinical data" under MDR Article 2(48); and a pre-specified PMCF activity with enrolment targets and diagnostic-accuracy thresholds. 
Because real-patient prospective recruitment at sufficient volume is impractical for these very-low-prevalence conditions, per-ICD-11-category Pillar 3 real-patient clinical-performance evidence is, for a subset of the rare-disease categories in scope, an acceptable evidence gap declared per MDCG 2020-6 §6.5(e): the declaration (a) enumerates the rare-disease ICD-11 categories for which pre-market real-patient Pillar 3 evidence is absent, (b) scopes the CE-marking claim for those categories to the Top-5 surfacing claim above, and (c) specifies the PMCF confirmation activity (PMCF Activity D/E/F in R-TF-007-002, with pre-specified accrual triggers and a pre-specified re-opening condition if post-market accrual falls persistently below threshold). The §6.5(e) declaration and the per-ICD-11-category enumeration are restated in the CER (R-TF-015-003, section "Rare-disease indication scoping and MDCG 2020-6 §6.5(e) declaration"). The IFU and Intended Purpose language for the rare-disease subgroup are aligned with this scoping — no Top-1 standalone-accuracy claim is made for the rare-disease subgroup from pre-market evidence alone.
  • Tier 3 (General conditions, pooled with risk-based justification): For non-malignant, non-rare conditions, the clinical consequence of an incorrect ranking is comparable — delayed or modified treatment, not mortality. Performance is assessed as a pooled aggregate with explicit risk-based justification documented in the CER. Primary evidence: the prospective real-patient clinical investigations (Ranks 2 and 4); supported by MRMC Rank 11 Pillar 3 §4.4 evidence where applicable to a specific sub-indication. For Fitzpatrick V–VI coverage specifically, the CEP declares a pairing of pre-market and post-market evidence: MAN_2025 (Pillar 3 §4.4 at Rank 11; pre-market supporting evidence) paired with PMCF Activity F in R-TF-007-002 (post-market confirmation, age-stratified and phototype-stratified Top-N performance tracking with pre-specified thresholds). MAN_2025 and PMCF Activity F together constitute the §3.4 representativeness response: a pre-market supporting signal at Rank 11 followed by post-market confirmation of real-world Fitzpatrick V–VI performance. Pre-market Pillar 3 Ranks 2–4 evidence stratified on Fitzpatrick V–VI is not generated within this Plan, because real-patient prospective recruitment at sufficient phototype V–VI volume is impractical in the pre-market window; MAN_2025 at Rank 11 carries the pre-market signal as supporting evidence and is explicitly not a substitute for Ranks 2–4 pre-market evidence. The pre-market Pillar 3 Ranks 2–4 gap stratified on Fitzpatrick V–VI is therefore declared an acceptable evidence gap under MDCG 2020-6 §6.5(e), discharged post-market by PMCF Activity F under pre-specified phototype-stratified Top-N performance thresholds and a pre-specified re-opening condition if post-market accrual falls persistently below the threshold. The §6.5(e) declaration is restated in the CER (R-TF-015-003, section "Representativeness of the Study Populations").

This tiered structure ensures that evidence assessment is proportionate to clinical risk: high-risk conditions receive individual scrutiny, while lower-risk conditions are validly pooled with documented justification per MDCG 2020-6 Appendix III.
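
The difference between the Top-5 surfacing claim used for Tier 2 and a Top-1 standalone-accuracy claim can be illustrated with a small calculation. This is a sketch under stated assumptions: the category labels ("A" to "F"), the four cases, and the function name `top_k_accuracy` are hypothetical and do not come from any study dataset.

```python
def top_k_accuracy(ranked_predictions, truths, k):
    """Fraction of cases whose true category appears in the first k ranked outputs."""
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_predictions, truths)
    )
    return hits / len(truths)


# Four hypothetical cases, all with true category "A" (placeholder labels).
ranked = [
    ["A", "B", "C", "D", "E", "F"],  # correct category ranked first
    ["B", "C", "A", "D", "E", "F"],  # correct category ranked third
    ["B", "C", "D", "E", "A", "F"],  # correct category ranked fifth
    ["B", "C", "D", "E", "F", "A"],  # correct category outside the Top-5
]
truths = ["A", "A", "A", "A"]

top1 = top_k_accuracy(ranked, truths, k=1)
top5 = top_k_accuracy(ranked, truths, k=5)
print(top1, top5)  # 0.25 0.75
```

On the same outputs, the surfacing metric (Top-5) credits the second and third cases while the Top-1 metric does not, which is why the two claims are scoped separately.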

Planned paediatric subgroup analysis

Where study data permit, exploratory subgroup analyses stratified by age group (infant, child, and adult) are planned for the MRMC pivotal investigations (BI_2024 and PH_2024), in which the image sets include paediatric cases. These analyses compare device-assisted versus unaided diagnostic accuracy for each age subgroup and are reported in the respective Clinical Investigation Reports. The analyses are exploratory — sample sizes within each paediatric subgroup are not sufficient to power confirmatory statistics — and are documented in the CER together with the overall per-study results.

Discharge of the GSPR obligation for the paediatric subpopulation

The intended use of the device covers all age groups, including paediatric patients (under 18 years of age, per Regulation (EC) No 1901/2006). The MDR Annex XIV §1(a) and GSPR 1 obligation for clinical performance evidence in this subpopulation is discharged through a multi-source approach that does not rely on a single dedicated paediatric clinical investigation pre-CE-marking:

  • (i) Algorithm-level VCA — the deep-learning classifier was trained on a curated dataset that includes paediatric presentations, documented in R-TF-028-003 Data Collection Instructions; the same model architecture and weights are applied for all age groups (no separate paediatric algorithm path).
  • (ii) Pillar 2 Technical Performance — the published severity-validation literature (APASI_2025, AUAS_2023, AIHS4_2023, ASCORAD_2022) includes paediatric cases within its image sets; algorithm-level performance generalises across age groups.
  • (iii) Pre-market exploratory paediatric subgroup analysis — BI_2024 child subgroup (2-12 years) shows +11.27 pp accuracy improvement; PH_2024 infant subgroup (1 month-2 years) shows +33.33 pp; PH_2024 child subgroup shows +11.11 pp. Reported in the CER for transparency.
  • (iv) Equivalence-derived legacy PMS — the legacy predecessor's commercial deployment includes paediatric cases within the ≈ 250,000-report denominator with zero serious incidents (rule-of-three upper one-sided 95% bound applied per the Safety Benchmarking section of the CER).
  • (v) PMCF Activity F.1 — pre-specified paediatric case-proportion monitoring and age-stratified Top-N performance tracking (0-2, 2-12, 12-18 years) within Activities C.1 and C.2 in R-TF-007-002, with an unscheduled CER update triggered if Top-3 accuracy in any paediatric band falls below the overall cohort by more than the per-band threshold defined in R-TF-007-002.
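
The rule-of-three bound referenced in item (iv) can be worked through as follows. This sketch assumes the ≈ 250,000-report figure is used as the denominator; that figure covers all reports, the paediatric share is not stated here, and a bound computed on the paediatric subset alone would be wider. The function names are introduced for illustration, and the exact binomial bound is shown only for comparison.

```python
import math


def rule_of_three_upper_bound(n_reports):
    """Approximate one-sided 95% upper bound on the event rate after zero events."""
    return 3.0 / n_reports


def exact_upper_bound(n_reports, confidence=0.95):
    """Exact one-sided bound: solve (1 - p)^n = 1 - confidence for p."""
    alpha = 1.0 - confidence
    return -math.expm1(math.log(alpha) / n_reports)


N_REPORTS = 250_000  # approximate denominator quoted in item (iv)

approx = rule_of_three_upper_bound(N_REPORTS)
exact = exact_upper_bound(N_REPORTS)
print(f"rule of three: {approx:.3e}")
print(f"exact bound:   {exact:.3e}")
```

With n = 250,000 the rule of three gives 3 / 250,000 = 1.2 × 10⁻⁵, that is, about 1.2 serious incidents per 100,000 reports as the upper one-sided 95% bound. The exact bound is marginally smaller, so the rule of three is slightly conservative.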

The paediatric coverage is declared as an acceptable evidence gap per MDCG 2020-6 §6.5(e) at the pre-market stage on the basis of (a) field-wide SotA limitation (no AI-dermatology paediatric-dedicated study identified in the supplementary literature search), (b) the device's role as decision support (not autonomous diagnosis) for an HCP user, and (c) the multi-source pre-market evidence above plus the committed PMCF F.1 activity. The §6.5(e) declaration is made in the CER's "Pediatric population" subsection.
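
The unscheduled-update trigger described for PMCF Activity F.1 reduces to a per-band comparison against the overall cohort. The sketch below assumes hypothetical accuracy values and a hypothetical 0.05 threshold for every band; the actual per-band thresholds are defined in R-TF-007-002 and are not reproduced here, and the function name is illustrative.

```python
def bands_triggering_update(overall_top3, band_top3, band_thresholds):
    """Return the age bands whose Top-3 accuracy trails the overall cohort
    by more than that band's threshold."""
    triggered = []
    for band, accuracy in band_top3.items():
        shortfall = overall_top3 - accuracy
        if shortfall > band_thresholds[band]:
            triggered.append(band)
    return triggered


# Hypothetical figures for illustration only (not study results).
overall = 0.80
observed = {"0-2": 0.70, "2-12": 0.78, "12-18": 0.81}
thresholds = {"0-2": 0.05, "2-12": 0.05, "12-18": 0.05}  # placeholder values

flagged = bands_triggering_update(overall, observed, thresholds)
print(flagged)  # ['0-2']
```

Only the 0-2 band is flagged in this example, because its shortfall of about 0.10 exceeds the placeholder threshold while the other two bands stay within it.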

The paediatric population is defined as the segment of the population under 18 years of age, in accordance with the European criteria established by Regulation (EC) No 1901/2006 of the European Parliament and of the Council of 12 December 2006 on medicinal products for paediatric use and amending Regulation (EEC) No 1768/92, Directive 2001/20/EC, Directive 2001/83/EC and Regulation (EC) No 726/2004, available at https://eur-lex.europa.eu/eli/reg/2006/1901/oj/eng.

Planned evidence classification per study

| Study | Design | Planned MDCG 2020-6 Rank | MDCG 2020-1 Pillar | Evidence tier |
|---|---|---|---|---|
| MC_EVCDAO_2019 | Prospective analytical observational | Rank 2 | Clinical Performance | Tier 1 (melanoma) + Tier 3 |
| COVIDX_EVCDAO_2022 | Prospective observational longitudinal | Rank 2 | Clinical Performance | Tier 3 |
| DAO_Derivación_O_2022 | Prospective observational (referral pathway) | Rank 2 | Clinical Performance | Tier 1 + Tier 3 |
| IDEI_2023 | Prospective + retrospective | Rank 2 (prospective) / Rank 4 (retrospective) | Clinical Performance | Tier 1 (malignancy) + Tier 3 |
| DAO_Derivación_PH_2022 | Prospective analytical observational | Rank 4 | Clinical Performance | Tier 1 + Tier 3 |
| BI_2024 | Prospective MRMC | Rank 11 | Clinical Performance (MDCG 2020-1 §4.4; supporting Pillar 3 evidence at Rank 11) | Tier 2 (rare diseases) + Tier 3 |
| PH_2024 | Prospective MRMC | Rank 11 | Clinical Performance (MDCG 2020-1 §4.4; supporting Pillar 3 evidence at Rank 11) | Tier 3 |
| SAN_2024 | Prospective MRMC | Rank 11 | Clinical Performance (MDCG 2020-1 §4.4; supporting Pillar 3 evidence at Rank 11) | Tier 3 |
| MAN_2025 | Prospective MRMC | Rank 11 | Clinical Performance (MDCG 2020-1 §4.4; supporting Pillar 3 evidence at Rank 11) | Tier 3 (Fitzpatrick V-VI) |
| AIHS4_2025 | Retrospective observational (proof-of-concept / pilot feasibility study, 2 patients, 16 assessments) | Rank 9 (individual case reports on the subject device) per MDCG 2020-6 Appendix III: the study's n = 2 / 16 assessments profile does not support a CE-marking severity claim on its own and is carried as hypothesis-generating pilot feasibility data for Benefit 5RB | Clinical Performance, early-stage feasibility (proof-of-concept dataset: insufficient sample size and design to support a CE-marking severity claim on its own; carried as hypothesis-generating data for Benefit 5RB per MDCG 2020-6 Appendix III feasibility framing) | Severity assessment only (pilot) |
| Legacy PMS data (passive) | Vigilance and curated QMS data | Rank 7 | Safety confirmation, cross-cut to the three MDCG 2020-1 MDSW pillars (MDR Article 61(1); Annex I §§1, 3, 4, 8; MDCG 2020-6 §§6.1 and 6.3; MEDDEV 2.7/1 Rev 4 §§A7.2 and A7.4; ISO 14971:2019 §§7, 8, 10; MDR Articles 83, 86, 87, 88; see §Definitions) | All tiers |
| R-TF-015-012 (legacy device cross-sectional observational study, Protocol; nested inside the legacy umbrella PMS Plan R-TF-007-005; conclusions consolidated in the legacy umbrella PMS Report R-TF-007-003) | Cross-sectional observational with retrospective recall; physician-reported outcomes; formal protocol with pre-specified endpoints, MCIDs, SotA comparators, Holm-Bonferroni correction, sensitivity analysis | Rank 8 (primary classification, applied to both quantitative endpoints and Likert professional-opinion items) per MDCG 2020-6 Appendix III (proactive PMS data). A supplementary case for Rank 4 classification of the quantitative endpoints under the Appendix III "high quality surveys may also fall into this category" note is presented in "Summary of the Combined Strategy → Rank-8 primary classification of R-TF-015-012, with supplementary Rank-4 case"; the Pillar 3 sufficiency determination is unchanged under either classification | Pillar 3 Clinical Performance (post-market real-world confirmation under MDCG 2020-6 §6.2.2; not the primary Pillar 3 load, which is carried by Route C Ranks 2-4); also contributes Safety confirmation via Section F pre-specified safety items F1-F4 (see §Definitions) | All tiers |
| APASI_2025 | Peer-reviewed severity-validation publication (PASI; algorithm output vs. expert consensus; retrospective non-comparative with MINORS adaptation) | Rank 6 (evaluation of state of the art, including clinical data from similar devices; here, peer-reviewed evidence on the device's own severity-scoring algorithm) with MINORS ≥12 methodological-quality appraisal | Technical Performance (Pillar 2) | Severity assessment only |
| AUAS_2023 | Peer-reviewed severity-validation publication (UAS; algorithm output vs. expert consensus; retrospective non-comparative with MINORS adaptation) | Rank 6 with MINORS ≥12 methodological-quality appraisal | Technical Performance (Pillar 2) | Severity assessment only |
| AIHS4_2023 | Peer-reviewed severity-validation publication (IHS4; algorithm output vs. expert consensus; retrospective non-comparative with MINORS adaptation) | Rank 6 with MINORS ≥12 methodological-quality appraisal | Technical Performance (Pillar 2) | Severity assessment only |
| ASCORAD_2022 | Peer-reviewed severity-validation publication (SCORAD; algorithm output vs. expert consensus; retrospective non-comparative with MINORS adaptation) | Rank 6 with MINORS ≥12 methodological-quality appraisal | Technical Performance (Pillar 2) | Severity assessment only |
| NMSC_2025 | Peer-reviewed clinical-performance publication (BCC / cSCC detection in a specialist head-and-neck clinic setting; retrospective observational) | Rank 4 | Clinical Performance (Pillar 3 Tier 1) | Tier 1 (malignancy: BCC/cSCC) |

The per-study appraisal, including detailed methodological quality assessment, key results, limitations, and acceptance criteria status, is presented in the Clinical Evaluation Report (R-TF-015-003, section "Per-study evidence appraisal").

Summary distribution of planned evidence across Rank and Pillar

The planned classification above is summarised below as a cross-tabulation of MDCG 2020-6 Appendix III rank against MDCG 2020-1 pillar, together with a rank-ordered listing that includes every planned clinical investigation, every peer-reviewed publication cited in the clinical evaluation, and the post-market surveillance corpus of the equivalent legacy device. The same source data populates both views; the cross-tab provides the distributional summary, while the rank-ordered listing documents the Pillar, Tier, Phase, Benefit and Role assigned to each source.

The cross-tab below has four columns on the pillar axis. The first three (Pillar 1 Valid Clinical Association, Pillar 2 Technical / Analytical Performance, and Pillar 3 Clinical Performance) are the three MDSW evidence pillars of MDCG 2020-1. The fourth column is a cross-cut labelled "Safety confirmation"; it is not a fourth MDCG 2020-1 pillar. The cross-cut captures the evidentiary contribution of sources that pre-specify safety-relevant outcome collection and report those outcomes with denominators, and is anchored in MDR Article 61(1) and Annex I §§1, 3, 4 and 8; MDCG 2020-6 §§6.1 and 6.3; MEDDEV 2.7/1 Rev 4 §A7.2 and §A7.4; and ISO 14971:2019 §§7, 8 and 10. The full definition is given in the §Definitions entry for "Safety confirmation" above. Sources may contribute to more than one column and more than one rank row: for example, the post-market observational study of the equivalent legacy device (R-TF-015-012) contributes to Pillar 3 Clinical Performance via its pre-specified quantitative endpoints and professional-opinion Likert items (both Rank 8 by primary classification, with the quantitative endpoints additionally listed under Rank 4 to reflect the supplementary Rank 4 case), and to the Safety-confirmation cross-cut via the Section F safety items F1–F4 (Rank 8).

| MDCG 2020-6 Rank ↓ / MDCG 2020-1 Pillar → | Pillar 1: Valid Clinical Association | Pillar 2: Technical Performance | Pillar 3: Clinical Performance | Safety confirmation (cross-cut, not a fourth pillar) | Row total |
|---|---|---|---|---|---|
| Rank 2: High-quality clinical investigations with some gaps | - | - | 6 | - | 6 |
| Rank 4: Studies with methodological limitations, data still quantifiable | - | - | 14 | - | 14 |
| Rank 5: Equivalence data (reliable / quantifiable) | - | 4 | 1 | - | 5 |
| Rank 6: Evaluation of state of the art | 1 | - | - | - | 1 |
| Rank 7: Complaints and vigilance data; curated QMS data | - | - | - | 1 | 1 |
| Rank 8: Proactive PMS data (surveys / professional opinion) | - | - | 1 | 1 | 2 |
| Rank 11: Simulated-use testing with healthcare professionals (MRMC) | - | - | 5 | - | 5 |
| Column total | 1 | 4 | 27 | 2 | 34 |

Rank 2: High-quality clinical investigations with some gaps

| Source | Kind | Pillar | Tier | Phase | Benefit | Role |
|---|---|---|---|---|---|---|
| MC_EVCDAO_2019 | investigation | Pillar 3 | Tier 1: Malignancy (individual) | P2 | 7GH 3KX | primary |
| MC_EVCDAO_2019 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P2 | 7GH | primary |
| COVIDX_EVCDAO_2022 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P2 | 7GH 3KX | primary |
| DAO_Derivación_O_2022 | investigation | Pillar 3 | Tier 1: Malignancy (individual) | P2 | 7GH 3KX | primary |
| DAO_Derivación_O_2022 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P2 | 3KX | primary |
| IDEI_2023 | investigation | Pillar 3 | Tier 1: Malignancy (individual) | P2 | 7GH 3KX | primary |

Rank 4: Studies with methodological limitations, data still quantifiable

| Source | Kind | Pillar | Tier | Phase | Benefit | Role |
|---|---|---|---|---|---|---|
| IDEI_2023 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P2 | 7GH 5RB | primary |
| DAO_Derivación_PH_2022 | investigation | Pillar 3 | Tier 1: Malignancy (individual) | P2 | 7GH 3KX | primary |
| DAO_Derivación_PH_2022 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P2 | 7GH | primary |
| AIHS4_2025 | investigation | Pillar 3 | Severity assessment | P2 | 5RB | supporting |
| R-TF-015-012: cross-sectional observational study of the equivalent legacy device | pms-corpus | Pillar 3 | All tiers | P4 | 7GH 3KX | supporting |
| NMSC_2025 | publication | Pillar 3 | Tier 1: Malignancy (individual) | P6 | 7GH | supporting |
| triaje_VH_2025 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P5 | 7GH 3KX | supporting |
| CVCSD_VC_2402 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P5 | 7GH 3KX | supporting |
| clinical_VH_2025 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P5 | 7GH 5RB | supporting |
| AFF_EVCDAO_2021 | investigation | Pillar 3 | Severity assessment | P5 | 5RB | supporting |
| acne | investigation | Pillar 3 | Severity assessment | P5 | 5RB | supporting |
| aEASI_HVN | investigation | Pillar 3 | Severity assessment | P5 | 5RB | supporting |
| AGM_2026 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P5 | 7GH | supporting |
| PMCF-ICD-DXP-2026 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P5 | 7GH | supporting |

Rank 5: Equivalence data (reliable / quantifiable)

| Source | Kind | Pillar | Tier | Phase | Benefit | Role |
|---|---|---|---|---|---|---|
| Equivalence with legacy device (MDCG 2020-5) | pms-corpus | Pillar 3 | All tiers | P4 | 7GH 5RB 3KX | supporting |
| APASI_2025 | publication | Pillar 2 | Severity assessment | P6 | 5RB | primary |
| AUAS_2023 | publication | Pillar 2 | Severity assessment | P6 | 5RB | primary |
| AIHS4_2023 | publication | Pillar 2 | Severity assessment | P6 | 5RB | primary |
| ASCORAD_2022 | publication | Pillar 2 | Severity assessment | P6 | 5RB | primary |

Rank 6: Evaluation of state of the art

| Source | Kind | Pillar | Tier | Phase | Benefit | Role |
|---|---|---|---|---|---|---|
| R-TF-015-011: Systematic State of the Art review (Valid Clinical Association) | pms-corpus | Pillar 1 | All tiers | P0 | - | primary |

Rank 7: Complaints and vigilance data; curated QMS data

Source | Kind | Pillar | Tier | Phase | Benefit | Role
Legacy predecessor passive PMS corpus (2020–present) | pms-corpus | Safety | All tiers | P4 | - | supporting

Rank 8: Proactive PMS data (surveys / professional opinion)

Source | Kind | Pillar | Tier | Phase | Benefit | Role
R-TF-015-012: cross-sectional observational study of the equivalent legacy device | pms-corpus | Pillar 3 | All tiers | P4 | 7GH 3KX | supporting
R-TF-015-012: cross-sectional observational study of the equivalent legacy device | pms-corpus | Safety | All tiers | P4 | - | supporting

Rank 11: Simulated-use testing with healthcare professionals (MRMC)

Source | Kind | Pillar | Tier | Phase | Benefit | Role
BI_2024 | investigation | Pillar 3 | Tier 2: Rare diseases (grouped) | P3 | 7GH | supporting
BI_2024 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P3 | 7GH | supporting
PH_2024 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P3 | 7GH 3KX | supporting
SAN_2024 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P3 | 7GH 3KX | supporting
MAN_2025 | investigation | Pillar 3 | Tier 3: General conditions (pooled) | P3 | 7GH | supporting

Phase legend

  • P0: Non-clinical prerequisite
  • P1: Exploratory / proof-of-concept
  • P2: Pre-market confirmatory (real-patient pivotal)
  • P3: Pre-market supporting (MRMC simulated-use)
  • P4: Equivalence route (legacy PMS + observational)
  • P5: Post-market clinical follow-up (PMCF)
  • P6: Published literature supporting the three pillars

Reading the summary

The MDCG 2020-6 Appendix III rank and the MDCG 2020-1 pillar are independent dimensions. A clinical investigation may contribute Pillar 3 Clinical Performance evidence at Rank 2 (prospective real-patient primary evidence) and a simulated-use reader study may contribute Pillar 3 Clinical Performance evidence at Rank 11 (MDCG 2020-1 §4.4 supporting evidence). The rank expresses the methodological strength of the source; the pillar expresses the evidentiary role of the source within the MDSW three-pillar framework. Both are required to appraise a source's fit to a given acceptance criterion.
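
As a minimal sketch of how the two dimensions can be held side by side, the snippet below records a few sources from the summary tables with their Appendix III rank and their pillar as separate fields, then groups the ranks seen under each pillar. The `EvidenceSource` class and the `ranks_by_pillar` helper are hypothetical names used for illustration only; they are not part of the QMS tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSource:
    """One row of the evidence summary (illustrative structure only)."""
    name: str
    rank: int    # MDCG 2020-6 Appendix III rank: methodological strength
    pillar: str  # MDCG 2020-1 pillar: evidentiary role


# Rows copied from the summary tables in this section.
SOURCES = [
    EvidenceSource("NMSC_2025", 4, "Pillar 3"),
    EvidenceSource("APASI_2025", 5, "Pillar 2"),
    EvidenceSource("R-TF-015-011", 6, "Pillar 1"),
    EvidenceSource("BI_2024", 11, "Pillar 3"),
]


def ranks_by_pillar(sources):
    """Return the sorted ranks observed under each pillar."""
    grouped = defaultdict(set)
    for source in sources:
        grouped[source.pillar].add(source.rank)
    return {pillar: sorted(ranks) for pillar, ranks in grouped.items()}


print(ranks_by_pillar(SOURCES))
```

In this sample, Pillar 3 appears at both Rank 4 and Rank 11, which is the point of keeping the two fields separate: knowing the pillar does not determine the rank.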

Causal chain across the three pillars

The Pillar 1 Valid Clinical Association evidence (systematic literature review, R-TF-015-011) anchors the plausibility of the Pillar 2 Technical Performance claims. The Pillar 2 analytical performance evidence (the device's stand-alone 346-ICD-11 analytical claim against curated input-output pairs, AI model verification-and-validation, and the published severity-validation literature: APASI_2025, AUAS_2023, AIHS4_2023, ASCORAD_2022) establishes the device's outputs across the ICD-11 analytical space. The Pillar 3 Clinical Performance evidence (prospective real-patient pivotal investigations, MRMC simulated-use supporting studies, equivalence-derived legacy clinical data, and the post-market observational study R-TF-015-012) demonstrates that clinical users achieve clinically relevant outputs when using the device on the intended patient population.

The Pillar 3 evidence is measured on the device's mandated integration outputs: the Top-5 prioritised differential view, the malignancy-prioritisation gauge and the referral recommendation, all specified in the Instructions for Use integration requirements and tracked as risk controls in R-TF-013-002. The device is validated only when integrated per those user-interface-presentation requirements. The device does not delegate user-interface-presentation responsibility to the integrator: the integrator is a co-controlled risk-control agent whose implementation of the mandated user-interface-presentation requirements is a precondition of the CE-marking clinical-benefit claim.

The surrogate endpoints reported in the CER (diagnostic accuracy, referral optimisation, severity scoring) are positioned as clinically meaningful in the context of a clinical decision-support system per MEDDEV 2.7.1 Rev 4 §A7.2 and MDCG 2020-6 §6.4, anchored at pre-market by the surrogate-to-patient-outcome literature sub-stream declared under Route A (see "Route A: Systematic Literature Review", sub-stream 4) and documented in R-TF-015-011 section "Surrogate endpoint validity — by benefit domain", and the surrogate-to-patient-outcome link (diagnostic-accuracy improvement → correct clinical management decision → avoided delayed diagnosis, avoided unnecessary referral, objective severity monitoring) is continuously re-demonstrated across the PMCF programme. The overall assessment is articulated in the CER's "Assessment of the benefit-risk profile" section.

Published severity validation literature (MDCG 2020-1, Pillar 2)

In addition to the pivotal clinical investigations, the literature search identified four peer-reviewed publications describing the validation of the device's severity assessment algorithms against expert dermatologist consensus on internationally validated clinical severity scales (PASI, SCORAD, IHS4, UAS). These publications — APASI_2025, AUAS_2023, AIHS4_2023, and ASCORAD_2022 — are classified as Technical Performance evidence per MDCG 2020-1 (Pillar 2), at Rank 6 of MDCG 2020-6 Appendix III (peer-reviewed evaluation of the device's own algorithms), with MINORS ≥12 layered as the methodological-quality appraisal. Their appraisal and analysis are documented in the CER (R-TF-015-003, section "Analysis of published severity validation studies"). This evidence establishes algorithm-level validity for the device's severity scoring across four conditions; prospective Pillar 3 Clinical Performance confirmation in the clinician-in-the-loop workflow is discharged via PMCF Activities B.1-B.5 in R-TF-007-002.

Published clinical performance literature (MDCG 2020-1, Pillar 3)

In addition to the pivotal clinical investigations and published severity validation manuscripts, the literature search identified one peer-reviewed publication by the manufacturer reporting the clinical performance of the device's malignancy detection capability in a specialist head and neck clinic setting. This publication, NMSC_2025 (Medela et al., European Archives of Oto-Rhino-Laryngology, 2025; 135 patients; BCC/cSCC detection, head and neck clinic), is classified as Clinical Performance evidence per MDCG 2020-1, contributing to Pillar 3 of the three-pillar evidence framework, in the Tier 1 (malignant conditions) category. The study was conducted in a specialist clinic population where malignancy prevalence was substantially higher than in the intended-use primary care setting; this contextual difference is documented and accounted for in the per-study appraisal in the CER (R-TF-015-003, section "Per-study evidence appraisal").

Planned appraisal methodology for clinical investigations and published manuscripts

For the manufacturer's own clinical investigations and published manuscripts identified in the literature search, design-specific validated tools are applied in accordance with MEDDEV 2.7.1 Rev 4 Section 9 (Stage 2 appraisal) and MDCG 2020-6 § 6.3. This approach differs from the CRIT1-7 framework used for the heterogeneous SotA corpus: the manufacturer's data sets are not scored against external relevance comparators but appraised for their internal methodological quality and appropriateness as evidence for the device under evaluation. Four study design families are present in the evidence portfolio, each appraised with the tool most appropriate to its design. Full appraisal tables and per-study interpretive commentary are documented in the Clinical Evaluation Report (R-TF-015-003), section "Validated methodological quality appraisal."

QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies): Applied to studies where the device functions as an index test evaluated against a clinical reference standard. QUADAS-2 assesses four domains — Patient Selection, Index Test, Reference Standard, and Flow and Timing — for both risk of bias and applicability concerns, with ratings of LOW, HIGH, or UNCLEAR. Applied to: MC_EVCDAO_2019, IDEI_2023, AIHS4_2025, and NMSC_2025. Pre-specified decision rules: a study with LOW ratings on all four domains for both bias and applicability is retained at full weight; a study with a single HIGH rating but with the HIGH concern mitigated by study design or supported by complementary evidence is retained at reduced weight (the reduction is narratively documented in the per-study appraisal); a study with two or more HIGH ratings across the four domains is downgraded to supporting evidence; UNCLEAR ratings are resolved by seeking clarification from the study record or, where unresolved, treated as HIGH for scoring purposes.
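
The pre-specified QUADAS-2 decision rules above can be sketched as a small function. This is an illustrative reading of the rules, not the appraisal tool itself: the function name, the `mitigated` flag and the returned labels are hypothetical, and the outcome for a single unmitigated HIGH rating is not stated in the text, so the sketch reports it as unspecified rather than guessing.

```python
from enum import Enum


class Rating(Enum):
    LOW = "LOW"
    HIGH = "HIGH"
    UNCLEAR = "UNCLEAR"


def quadas2_weight(ratings, mitigated=False):
    """Map QUADAS-2 ratings to an evidence weight (illustrative only).

    ratings: list of Rating values, bias and applicability ratings pooled.
    mitigated: True when a single HIGH concern is mitigated by study design
               or supported by complementary evidence.
    """
    # Unresolved UNCLEAR ratings are treated as HIGH for scoring purposes.
    high_count = sum(1 for r in ratings if r in (Rating.HIGH, Rating.UNCLEAR))
    if high_count == 0:
        return "full weight"
    if high_count >= 2:
        return "supporting evidence"
    if mitigated:
        return "reduced weight"
    # Assumption: the rules do not cover a single unmitigated HIGH rating.
    return "not specified"


all_low = [Rating.LOW] * 8
one_high = [Rating.LOW] * 7 + [Rating.HIGH]
print(quadas2_weight(all_low))
print(quadas2_weight(one_high, mitigated=True))
```

Pooling bias and applicability ratings into one list is a simplification made for this sketch; the per-study appraisal tables in the CER remain the authoritative record.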

MINORS (Methodological Index for Non-Randomised Studies): Applied to non-randomised observational and comparative studies. MINORS scores each of 12 items from 0 (not reported) to 2 (reported and adequate), with non-comparative studies using items 1-8 (maximum 16) and comparative studies using all 12 items (maximum 24). Pre-specified decision rules: the thresholds of ≥12 (non-comparative) and ≥20 (comparative) are applied as soft advisory thresholds at this CEP, not as hard exclusion rules. A study scoring at or above threshold is retained at full weight. A study scoring 8–11 (non-comparative) or 15–19 (comparative) is retained at reduced weight with a narrative justification documented in the per-study appraisal. A study scoring below 8 (non-comparative) or below 15 (comparative) is excluded from the evidence base unless an explicit exception is recorded in the appraisal. MINORS is applied to three study design families:

  • Clinical utility and referral pathway studies (comparative, maximum 24): COVIDX_EVCDAO_2022, DAO_Derivación_O_2022, and DAO_Derivación_PH_2022, assessing whether the device delivers sustained clinical utility in real-world clinical settings.
  • MRMC simulated-use studies (comparative, maximum 24, with adaptation): BI_2024, PH_2024, SAN_2024, and MAN_2025. In MRMC designs, healthcare professionals evaluate a standardised pre-specified image set in a controlled environment, first without device assistance and then with device assistance; MINORS item 2 (consecutive patients) is adapted to assess the representativeness and completeness of the image set and HCP cohort rather than sequential patient enrolment. These studies provide supporting Pillar 3 Clinical Performance evidence per MDCG 2020-1 §4.4 (Rank 11 of the MDCG 2020-6 Appendix III hierarchy — simulated-use testing with healthcare professionals) and do not constitute clinical data under MDR Article 2(48).
  • Published severity validation studies (non-comparative, maximum 16, with adaptation): APASI_2025, AUAS_2023, AIHS4_2023, and ASCORAD_2022. These retrospective, non-comparative studies apply the device's severity scoring algorithm to a curated image dataset and compare its output against independent expert dermatologist consensus; MINORS item 2 is adapted to assess the representativeness of the image dataset rather than consecutive patient enrolment. These publications provide Technical Performance evidence per MDCG 2020-1 Pillar 2.
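
The MINORS thresholds described above lend themselves to a short worked sketch. The function name, the `exception_recorded` flag and the returned labels are hypothetical; the numeric bands are the ones stated in this CEP (≥12 / 8–11 / below 8 for non-comparative studies, ≥20 / 15–19 / below 15 for comparative studies).

```python
def minors_weight(score, comparative, exception_recorded=False):
    """Map a MINORS total score to an evidence weight (illustrative only).

    score: total MINORS score (0-16 non-comparative, 0-24 comparative).
    comparative: True for comparative designs scored on all 12 items.
    exception_recorded: True when an explicit exception is documented
                        in the per-study appraisal.
    """
    max_score = 24 if comparative else 16
    if not 0 <= score <= max_score:
        raise ValueError(f"score must be between 0 and {max_score}")

    full_threshold, reduced_threshold = (20, 15) if comparative else (12, 8)
    if score >= full_threshold:
        return "full weight"
    if score >= reduced_threshold:
        return "reduced weight"
    # Below the lower band: excluded unless an exception is recorded.
    return "retained by exception" if exception_recorded else "excluded"


print(minors_weight(13, comparative=False))
print(minors_weight(17, comparative=True))
print(minors_weight(6, comparative=False))
```

Because the thresholds are advisory at this CEP, a result of "reduced weight" or "retained by exception" still requires the narrative justification in the per-study appraisal; the function only encodes the numeric bands.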

Confirmatory phase (Pivotal Investigations)

The confirmatory phase aims to demonstrate compliance with the GSPR regarding clinical performance, safety, and clinical benefit, with statistical significance where applicable.

Study-level vs. device-level acceptance criteria: reconciliation

The Acceptance Criteria column of the table below lists the study-specific pre-specified endpoints for each confirmatory investigation — the criteria that were pre-declared for that individual investigation at the time of its CIP and that the study itself was powered against. These are not the same as the device-level acceptance criteria listed in the "Intended Clinical Benefits" table above (7GH/5RB/3KX). The device-level criteria are pooled / combined-evidence thresholds derived across the full evidence portfolio using the meta-analytic Extraction → Synthesis → Margin workflow documented in the "Acceptance-criteria derivation" appendix above. For example, MC_EVCDAO_2019 was powered against a study-level AUC ≥ 0.80 for melanoma detection as its own pre-specified primary endpoint; the device-level 7GH(c) melanoma pooled AUC criterion (AUC ≥ 0.81) is a separate meta-analytic target derived from the combined real-patient evidence. Each study's criteria serve as the per-study pass/fail determinant; aggregation to the device-level criteria follows the per-benefit SotA-derivation appendix above.

Study Identification | Study type | Main objectives | Milestones | Acceptance Criteria

Reference of the protocol: AIHS4_2025
Type of study: Retrospective pivotal study
State of process: Completed
Study design: Retrospective, observational, longitudinal pivotal study
Primary objective: To evaluate the accuracy and reliability of the AIHS4 system, integrated into the device, by comparing it with clinical experts using IHS4 and a gold standard in the context of the phase 1 clinical trial M-27134-01 for Hidradenitis Suppurativa (HS).

Secondary objectives:
- To compare AIHS4 performance with interobserver agreement levels reported in the literature.
- To assess temporal variability in AIHS4 scoring across consecutive visits.
- To analyse AIHS4 performance by anatomical region.
Inclusion period:
June 4, 2024 to July 11, 2024

This study included 2 patients with 16 severity assessments.

Completion date:
July 11, 2024

Date of the study report: February 28, 2025.

Acceptance criteria:
- AIHS4 improves severity assessment compared to interobserver agreement.
- AIHS4 maintains temporal consistency across visits, with temporal variability within the acceptable limit of ≤15%.
- AIHS4 achieves high agreement in lesion classification across anatomical regions, ICC=70%.

Reference of the protocol: BI_2024
Type of study: Supporting pre-market investigation (Pillar 3 §4.4 at Rank 11; MRMC simulated-use reader study)
State of process: Completed
Study design: Prospective, observational, cross-sectional MRMC simulated-use reader study.
Primary objective: Validate that the device improves diagnostic accuracy for generalised pustular psoriasis (GPP).

Secondary objectives: Validate that the information provided by the device increases the true accuracy of healthcare professionals (HCPs) in the diagnosis of other dermatological skin conditions, such as hidradenitis suppurativa.
Initiation date: June 1, 2024

Inclusion period:
June 1, 2024 to September 15, 2024

This study included 15 practitioners (4 dermatologists and 11 PCPs) to assess 100 images of skin conditions.

Completion date:
September 15, 2024

Date of the study report: September 15, 2024

Acceptance criteria:
- An improvement of at least 10% in diagnostic accuracy for generalised pustular psoriasis (GPP) when used by primary care physicians, and at least 5% when used by dermatologists, compared to standard clinical practice.
- An improvement of at least 10% in diagnostic accuracy for skin conditions when used by general practitioners, and at least 5% when used by dermatologists, compared to standard clinical practice.
- An improvement of at least 14% in diagnostic sensitivity for skin conditions when used by general practitioners, and at least 7% when used by dermatologists, compared to standard clinical practice.
- An improvement of at least 11% in diagnostic specificity for skin conditions when used by general practitioners, and at least 10% when used by dermatologists, compared to standard clinical practice.
- An improvement in diagnostic accuracy, sensitivity and specificity of both primary care practitioners and dermatologists in the diagnosis of rare dermatological conditions.

Reference of the protocol: COVIDX_EVCDAO_2022
Type of study: Pivotal study
State of process: Completed
Study design: Prospective, observational, cross-sectional and pivotal study.
Primary objective: To ascertain the validity of the device in objectively and reliably tracking the progression of chronic dermatological conditions. This validation is deemed successful if the tool achieves a score of 8 or higher on the Clinical Utility Questionnaire (CUS).

Secondary objectives:
- Confirming that the utilization of the device elicits a high level of patient satisfaction, particularly in its remote application.
- Demonstrating that the implementation of the device leads to a reduction in face-to-face consultations, thereby optimizing healthcare resources and patient convenience.
- Validating the device's ability to consistently generate reliable condition monitoring, thereby establishing its trustworthiness as a monitoring system.
Initiation date: April 13, 2022

Inclusion period:
April 13, 2022 to October 23, 2023

This study included 5 practitioners to assess 160 patients with different skin conditions.

Completion date:
October 17, 2023

Date of the study report: October 23, 2023.

Acceptance criteria:
- A score of 8 or higher on the Clinical Utility Score (CUS), as completed by the medical staff.
- At least 75% of experts (dermatologists) must answer positively to the questions of the Clinical Utility Questionnaire (CUS), Data Utility Questionnaire (DUQ), and Usability Questionnaire (SUS) (Experts' consensus criteria).

Reference of the protocol: DAO_Derivación_O_2022
Type of study: Pivotal study
State of process: Completed
Study design: Prospective, observational, longitudinal and pivotal study.
Primary objective: To validate that the device is a valid tool for improving the adequacy of referrals to dermatology.

Secondary objectives:
- To validate that the device reduces costs in secondary care.
- To validate that the device reduces dermatology waiting lists.
- To validate that the device optimizes clinical flow in Osakidetza.
Initiation date: November 23, 2022

Inclusion period:
November 23, 2022 to May 6, 2025

This study included 127 patients with different skin conditions.

Completion date:
May 6, 2025

Date of the study report: May 22, 2025.

Acceptance criteria:
- Improve the adequacy of referrals to dermatology.
- A reduction of unnecessary referrals to dermatology (at least 15%).
- A sensitivity and a specificity equal to or superior to those of the primary care physician to identify necessary referrals.
- A sensitivity and a specificity equal to or superior to those of the primary care physician to identify necessary referrals in teledermatology.
- A reduction of waiting lists of at least 30% (Warshaw et al. 2011).
- A reduction of the costs in secondary care.
- An AUC (Area Under the Curve) of at least 0.8 in the ROC curve for the device detecting malignancy.
- A PPV (Positive Predictive Value) of at least 0.4 and an NPV (Negative Predictive Value) of at least 0.8 for the device detecting malignancy.

Reference of the protocol: DAO_Derivación_PH_2022
Type of study: Pivotal study
State of process: Completed
Study design: Prospective, observational, longitudinal and pivotal study.
Primary objective: To validate that the information provided by the device increases the true accuracy of healthcare professionals (HCPs) in the diagnosis of multiple dermatological conditions.

Secondary objectives:
- Reduce and correct the referral of patients with skin pathologies from primary care to dermatology.
- Individualize and improve the ongoing training of general practitioners in the area of dermatology.
- Offer healthcare adapted to technological innovations.
- Measure the satisfaction of general practitioners with the device.
- Measure the satisfaction of dermatologists with the device.
Initiation date: June 24, 2022

Inclusion period:
June 24, 2022 to January 10, 2024

This study included 127 patients with different skin conditions.

Completion date:
January 10, 2024

Date of the study report: January 10, 2024.

Acceptance criteria:
- An improvement of diagnostic accuracy of at least 10% (Ferri et al. 2020) in general practitioners and dermatologists.
- An AUC of 0.8 detecting malignancy.
- A reduction of unnecessary referrals of at least 15% to dermatology (Warshaw et al. 2011).
- A positive view of at least 70% of the medical device as a useful tool to gather more patient information.

Reference of the protocol: IDEI_2023
Type of study: Pivotal study
State of process: Completed
Study design: Prospective, observational pivotal study combining longitudinal and retrospective case series.
Primary objective: To validate that the device optimizes clinical flow and patient care processes, reducing the time and cost of care per patient, through greater precision in medical diagnosis and determination of the degree of malignancy or severity.

Secondary objectives:
- To demonstrate that the device improves the ability of healthcare professionals to detect malignant or suspected malignant pigmented lesions.
- Demonstrate that the device improves the ability and accuracy of healthcare professionals in measuring the degree of involvement of patients with female androgenic alopecia.
- Automate the initial triage/assessment process in patients consulting for pigmented lesions.
- To evaluate the reduction in the use of healthcare resources by the centre by reducing the number of triage consultations and direct referral of the patient to the appropriate consultation (aesthetic or dermatological).
- Evaluate the degree of usability of the device by the patient.
- Demonstrate that the device increases specialist satisfaction.
Initiation date: January 25, 2024

Inclusion period:
January 25, 2024 to August 23, 2024

This study included 204 patients with different skin conditions.

Completion date:
August 23, 2024

Date of the study report: October 20, 2024.

Acceptance criteria:
- An improvement of diagnostic accuracy of 10%.
- Scores equal to or greater than 70 on the System Usability Scale (SUS).
- An AUC equal to or greater than 0.8 detecting malignancy.
- A sensitivity equal to or greater than 80% and a specificity equal to or greater than 70% in detecting malignancy.
- An agreement coefficient (unweighted Kappa) equal to or greater than 0.5 between the investigator's assessment of the severity of androgenic alopecia and the device's assessment.

Reference of the protocol: MC_EVCDAO_2019
Type of study: Pivotal study
State of process: Completed
Study design: Prospective, observational, cross-sectional and pivotal study.
Primary objective: To validate that the device, when identifying cutaneous melanoma in images of lesions taken with a dermatoscopic camera, achieves the following values:
- AUC greater than 0.8
- Sensitivity of 80% or higher
- Specificity of 70% or higher

Secondary objectives:
- Validate the usefulness and feasibility of the device developed by the manufacturer in adverse environments with severe technical limitations, such as a lack of instrumentation or a lack of internet connection.
Initiation date: February 10, 2020

Inclusion period:
February 10, 2020 to November 13, 2023

This study included 105 patients with different skin conditions suspicious of malignancy.

Completion date:
November 13, 2023

Date of the study report: May 31, 2024.

Acceptance criteria:
- An AUC greater than 0.8
- A Sensitivity of 80% or higher
- A Specificity of 70% or higher

Reference of the protocol: PH_2024
Type of study: Supporting pre-market investigation (Pillar 3 §4.4 at Rank 11; MRMC simulated-use reader study)
State of process: Completed
Study design: Prospective, observational, cross-sectional MRMC simulated-use reader study.
Primary objective: To validate that the information provided by the device increases the true accuracy of general practitioners in the diagnosis of multiple dermatological conditions.

Secondary objectives:
- To validate the percentage of cases that should be referred according to the HCP with the information provided by the device.
- To validate the percentage of cases that could be handled remotely with the information provided by the device.
Initiation date: June 04, 2024

Inclusion period:
June 04, 2024 to September 13, 2024

This study included 9 PCPs to assess 30 images of skin conditions.

Completion date:
September 13, 2024

Date of the study report: October 13, 2024.

Acceptance criteria:
- An improvement of at least 10% in diagnostic accuracy for skin conditions when used by general practitioners compared to standard clinical practice.
- An improvement of at least 14% in diagnostic sensitivity for skin conditions when used by general practitioners compared to standard clinical practice.
- An improvement of at least 11% in diagnostic specificity for skin conditions when used by general practitioners compared to standard clinical practice.
- An improvement in diagnostic accuracy, sensitivity and specificity of primary care practitioners in the diagnosis of rare dermatological conditions.

Reference of the protocol: SAN_2024
Type of study: Supporting pre-market investigation (Pillar 3 §4.4 at Rank 11; MRMC simulated-use reader study)
State of process: Completed
Study design: Prospective, observational, cross-sectional MRMC simulated-use reader study.
Primary objective: To validate that the information provided by the device increases the true accuracy of healthcare professionals (HCPs) in the diagnosis of multiple dermatological conditions.

Secondary objectives:
- To validate what percentage of cases should be referred according to the HCP with the information provided by the device.
- To validate what percentage of cases could be handled remotely with the information provided by the device.
- To confirm that the use of the medical device is perceived by specialists as being of great clinical utility.
Initiation date: June 01, 2024

Inclusion period:
June 01, 2024 to October 10, 2024

This study included 16 practitioners (6 dermatologists and 10 PCPs) to assess 29 images of skin conditions.

Completion date:
October 10, 2024

Date of the study report: October 18, 2024

Acceptance criteria:
- An improvement of at least 10% in diagnostic accuracy for skin conditions when used by general practitioners, and at least 5% when used by dermatologists, compared to standard clinical practice.
- An improvement of at least 14% in diagnostic sensitivity for skin conditions when used by general practitioners, and at least 7% when used by dermatologists, compared to standard clinical practice.
- An improvement of at least 11% in diagnostic specificity for skin conditions when used by general practitioners, and at least 10% when used by dermatologists, compared to standard clinical practice.

Reference of the protocol: MAN_2025
Type of study: Supporting pre-market investigation (Pillar 3 §4.4 at Rank 11; MRMC simulated-use reader study)
State of process: Completed
Study design: Prospective, observational, cross-sectional MRMC simulated-use reader study (Fitzpatrick V–VI phototype representativeness).
Primary objective: To validate that the information provided by the device increases the true accuracy of healthcare professionals (HCPs) in the diagnosis of multiple dermatological conditions presented on Fitzpatrick phototype V-VI skin.

Secondary objectives:
- To validate the percentage of cases that should be referred according to the HCP with the information provided by the device.
- To confirm that the use of the device is perceived by specialists as being of great clinical utility across skin phototypes.
Initiation date: January 21, 2026

Inclusion period:
January 21, 2026 to April 17, 2026

This study included 16 healthcare professionals (primary analysis cohort; 19 enrolled, 3 screen failures) to assess 149 images of Fitzpatrick phototype V-VI skin presentations.

Sample-size rationale: two-sided McNemar paired-binary power calculation (baseline 60 %, target 70 %, α = 0.05, 1 − β = 0.80, 25 % discordant proportion) requiring approximately 200 paired observations; the conservative 5-reader × 149-image floor yields 745 paired observations (effective sample size ≈ 460 after accounting for an assumed reader-level intraclass correlation coefficient (ICC) of 0.15, with the primary-endpoint conclusion confirmed robust across an ICC sensitivity range of 0.05 – 0.30). Full derivation in R-TF-015-004 §Sample size.

Reference standard: published atlas diagnosis encoded as ICD-11, established prior to and independently of the investigation, consistent with the methodology of SAN_2024, BI_2024 and PH_2024. Atlas diagnoses may not all be histopathologically confirmed; the limitation and three-point mitigation narrative (image-set size, self-controlled paired design, cross-study methodological consistency) are documented in R-TF-015-004 §Reference standard (ground truth).

Completion date:
April 17, 2026

Date of the study report: as recorded on the signature block of the signed Clinical Investigation Report (R-TF-015-006 MAN_2025 instance).

Acceptance criteria:
- An improvement of at least 10% in diagnostic accuracy for skin conditions presented on Fitzpatrick phototype V-VI skin when used by dermatologists, compared to standard clinical practice.
- An improvement of at least 14% in diagnostic sensitivity for skin conditions presented on Fitzpatrick phototype V-VI skin when used by dermatologists, compared to standard clinical practice.
- An improvement of at least 11% in diagnostic specificity for skin conditions presented on Fitzpatrick phototype V-VI skin when used by dermatologists, compared to standard clinical practice.

Reference of the protocol: NMSC_2025
Type of study: Retrospective manufacturer-authored peer-reviewed clinical-performance publication (external specialist-clinic deployment context; Pillar 3 Tier 1; Rank 4)
State of process: Completed (peer-reviewed publication)
Study design: Retrospective observational diagnostic-accuracy study in an external specialist head-and-neck clinic setting.
Primary objective: To evaluate the device's clinical performance in detecting basal cell carcinoma (BCC) and cutaneous squamous cell carcinoma (cSCC) against histopathology ground truth on images acquired in a specialist head-and-neck clinic population, as published in Medela et al., European Archives of Oto-Rhino-Laryngology, 2025.

Secondary objectives:
- To characterise the device's sensitivity and specificity for BCC and cSCC detection in a specialist population.
- To provide external peer-reviewed Pillar 3 Clinical Performance evidence on the malignancy-detection capability.
Inclusion period:
As reported in Medela et al. 2025.

This publication reports the device's performance on 135 patients with BCC or cSCC from a specialist head-and-neck clinic.

Date of the publication: Medela et al., European Archives of Oto-Rhino-Laryngology, 2025.

Acceptance criteria:
- Publication-level diagnostic-accuracy metrics (sensitivity, specificity, AUC) for BCC and cSCC detection against histopathology ground truth.
- Contextual difference from the intended-use primary-care setting (specialist-clinic malignancy prevalence) is documented and accounted for in the per-study appraisal in the CER (R-TF-015-003, section "Per-study evidence appraisal").
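
The MAN_2025 sample-size rationale in the table above can be checked with a short calculation. The sketch below is not the derivation in R-TF-015-004: it assumes a common normal-approximation formula for the paired (McNemar) design and the standard design-effect correction 1 + (m − 1) × ICC with a cluster size of m = 5 readers per image. Both formulas and the function names are assumptions of this sketch; the input values (60 % baseline, 70 % target, 25 % discordant proportion, α = 0.05, power 0.80, 5 readers × 149 images, ICC range 0.05 to 0.30) are taken from the text.

```python
from math import ceil, sqrt
from statistics import NormalDist


def mcnemar_pairs(delta, discordant, alpha=0.05, power=0.80):
    """Pairs needed for a two-sided McNemar test (normal approximation).

    delta: difference in paired proportions (target minus baseline).
    discordant: assumed proportion of discordant pairs.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * sqrt(discordant)
                 + z_beta * sqrt(discordant - delta ** 2)) ** 2
    return ceil(numerator / delta ** 2)


def effective_sample_size(n_obs, cluster_size, icc):
    """Shrink n_obs by the design effect 1 + (cluster_size - 1) * icc."""
    return n_obs / (1 + (cluster_size - 1) * icc)


required = mcnemar_pairs(delta=0.10, discordant=0.25)
paired_obs = 5 * 149  # conservative reader x image floor from the text

print("required pairs:", required)
for icc in (0.05, 0.15, 0.30):
    ess = effective_sample_size(paired_obs, cluster_size=5, icc=icc)
    print(f"ICC={icc:.2f}: effective n = {ess:.0f}")
```

Under these assumptions the required number of pairs comes out just under 200, and the effective sample size stays above that requirement across the whole ICC sensitivity range. At ICC = 0.15 the sketch gives roughly 466 rather than the ≈ 460 quoted, so the source derivation may use slightly different inputs or rounding.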

PMS aspects that need regular updating in the clinical evaluation report

According to Section 7 of MEDDEV 2.7/1 Rev 4, the clinical evaluation plan should make it possible to identify PMS aspects that need to be updated in the clinical evaluation report. The following table identifies the PMS aspects that need to be updated in the CER.

This CEP supports the first CE-marking submission of the device under MDR (first commercialisation). The PMS aspects listed below are evaluated at this revision: the "device under evaluation" row is not yet applicable (no post-market data exist for the new device before first commercialisation), while the "equivalent device" row is applicable because the legacy predecessor's passive PMS data (≈ 250,000 diagnostic reports over ≥ 4 years of continuous deployment) enters this evaluation via MDCG 2020-5 equivalence at Rank 7 per MDCG 2020-6 §6.2.2.

PMS Aspects | Yes | No | N/A
New clinical data available for the device under evaluation (pre-market pivotal portfolio: the ten Route C pivotal investigations and the NMSC_2025 peer-reviewed publication; post-market new clinical data is not yet applicable because the device under evaluation has not been commercialised) | x | - | -
New clinical data available for the equivalent device, if equivalence is claimed | x | - | -
New knowledge about known and potential hazards, risks, performance, benefits and claims, including: (i) data on clinical hazards seen in other products (hazard due to substances and technologies); (ii) changes concerning current knowledge / the state of the art, such as changes to applicable standards and guidance documents, new information relating to the medical condition managed with the device and its natural course, medical alternatives available to the target population; (iii) other aspects identified during PMS. | - | - | x

The legacy-predecessor equivalent-device PMS corpus — 21 active contracts, ≈ 250,000 diagnostic reports as denominator, 0 MDR Article 87 serious incidents, 0 FSCAs, 0 Article 88 trend reports triggered, 7 non-serious complaints — is appraised at Rank 7 per MDCG 2020-6 Appendix III and contributes safety-confirmation evidence that cross-cuts the three MDCG 2020-1 pillars. Full appraisal methodology, hazard distribution and trend analysis are documented in the legacy device PMS Report (R-TF-007-003) and summarised in the CER (R-TF-015-003) section "Post-market surveillance of the equivalent legacy device".

Post-Market Clinical Follow-up (PMCF)

We have established a comprehensive Post-Market Clinical Follow-up (PMCF) programme as part of the continuous clinical evaluation process mandated by MDR Annex XIV Part B.

As detailed in R-TF-007-002 Post-Market Clinical Follow-up (PMCF) Plan, the programme includes ten specific clinical activities (proactive data collection and targeted clinical studies) designed to confirm the long-term safety, performance, and clinical benefits of the device throughout its expected lifetime. These activities address three pre-certification evidence gaps identified during the clinical evaluation (Gaps 1, 2 and 3 below). Item 4 below is different in kind: it records a §6.3 sufficient-evidence determination for two low-prevalence sub-indication categories (autoimmune dermatoses and genodermatoses). Activities D.1 and D.2 confirm and strengthen that determination in real-world deployment and are not invoked to fill a pre-certification evidence gap.

  1. Gap 1 (benefits 7GH sub-criterion c + 3KX sub-criterion a): Triage and Malignancy Prioritization — real-world operational confirmation that malignancy prioritization translates into reduced waiting times for high-risk patients (Activities A.1, A.2, A.3).

  2. Gap 2 (benefit 5RB): Automated Severity Assessment — prospective Clinical Performance validation of severity assessment algorithms across additional conditions (Activities B.1, B.2, B.3, B.4, B.5).

  3. Gap 3 (benefit 7GH, all sub-criteria): Core Diagnostic Performance and Stability Monitoring — ongoing verification that core diagnostic algorithms maintain accuracy and reliability post-market (Activities C.1, C.2).

  4. Low-prevalence sub-indication categories (autoimmune dermatoses and genodermatoses; benefit 7GH sub-criterion a — indication coverage): pre-certification evidence for these two low-prevalence categories (~3 % autoimmune + ~1 % genodermatoses of real-world dermatological presentations) is triangulated under MDCG 2020-6 §6.3 and is judged sufficient on the four-test analysis below. These categories are not declared as §6.5(e) acceptable gaps.

    1. Test 1 (Narrow and bounded scope): the two categories account for approximately 4 % of real-world dermatological presentations combined (autoimmune ~3 %, genodermatoses ~1 %). The remaining ~96 % carry the core benefit-risk determination independently.
    2. Test 2 (Core benefit-risk independence): the three declared clinical benefits (7GH Diagnostic Accuracy, 5RB Objective Severity Assessment, 3KX Care-Pathway Optimisation) are independently evidenced on the remaining ~96 % of presentations by the pre-market confirmatory clinical investigations (MC_EVCDAO_2019, COVIDX_EVCDAO_2022, DAO_Derivación_O_2022, DAO_Derivación_PH_2022, IDEI_2023, AIHS4_2025) and by the legacy-predecessor real-world evidence corpus (R-TF-015-012; legacy passive PMS Report R-TF-007-003).
    3. Test 3 (Adequate residual evidence for the two categories — two independently-scoring anchors):
      • Pillar 1 Valid Clinical Association: a dedicated structured literature review (22 load-bearing anchors scoring CRIT1–7 ≥ 15/21; 4 supporting-context anchors) appended to R-TF-015-011 State of the Art as a per-sub-category section establishes that image-based clinical recognition is an accepted clinical standard for the named conditions: discoid lupus erythematosus, lichen planus (cutaneous and nail; oral lichen planus retains a residual coverage note, see paragraph below), dermatomyositis, pemphigus vulgaris, bullous pemphigoid, mucous membrane pemphigoid, morphea, cutaneous vasculitis (autoimmune); ichthyoses, neurofibromatosis type 1 and type 2, tuberous sclerosis complex, epidermolysis bullosa (genodermatoses).
      • Pillar 2 Technical Performance: a per-epidemiological-group sub-analysis of the v27.5.1 ICD classifier on the held-out V&V test set (n = 36,321 images; 346 ICD-11 categories), measured on the device's stand-alone analytical output without a clinician in the loop and published in R-TF-028-006 AI Release Report §Per-Epidemiological-Group Performance, reports AUC 0.948 (95 % CI 0.941 – 0.954, N = 2,040 images across 38 autoimmune classes) and AUC 0.905 (95 % CI 0.886 – 0.924, N = 391 images across 31 genodermatoses classes). Both AUCs exceed the pre-specified ≥ 0.80 acceptance criterion inherited from the binary-indicator threshold in R-TF-028-002 AI Development Plan and both sit within the range of the six binary-indicator AUCs on the same test set (0.863 – 0.959).
      • MRMC is deliberately not invoked as the Test 3 anchor for these categories: the MDCG 2020-6 Appendix III appraisal set out elsewhere in this Plan (§Clinical Evidence, MDCG 2020-6 Appendix III appraisal — Rank 11) records that simulated-use multi-reader multi-case (MRMC) reader studies "do not constitute 'clinical data' under the strict MDR Article 2(48) definition". Pillar 1 literature + Pillar 2 analytical performance are the two independently-scoring Test 3 anchors; MRMC evidence for these sub-categories, where applicable, is Pillar 3 §4.4 supporting evidence only and is not load-bearing.
      • Scope boundary with adjacent coverage gaps: the MAN_2025 MRMC supporting investigation (Pillar 3 §4.4 at Rank 11; Fitzpatrick V–VI phototype representativeness) and the legacy-device real-world-evidence study R-TF-015-012 (Pillar 3 at Rank 8 primary with a supplementary case at Rank 4 for the quantitative endpoints) address different sub-indication and coverage considerations and are therefore not invoked as Test 3 anchors for autoimmune dermatoses or genodermatoses. Both studies remain part of the broader evidence envelope that carries the three declared benefits for the remaining ~96 % of presentations (Test 2 above) and are cited in the relevant sections of this Plan and of R-TF-015-003.
    4. Test 4 (PMCF confirms and strengthens): post-certification confirmation is pre-specified in the R-TF-007-002 PMCF Plan under Activities D.1 (autoimmune, prospective surveillance with interim analyses at defined case counts; primary safety-floor acceptance criterion Top-3 ≥ 60 % on the in-scope cohort together with a non-inferiority secondary criterion against the V&V-demonstrated Top-3; safety and surveillance triggers) and D.2 (genodermatoses, passive surveillance with safety and coverage triggers, with an early Pillar 3-equivalent performance readout from the legacy-predecessor post-market report corpus at the first PMS Update Report). The PMCF Plan additionally commits to analysing the autoimmune-dermatoses and genodermatoses slices of the ≈ 250,000 legacy-predecessor post-market report corpus in the first PMS Update Report R-TF-007-003, consistent with MDCG 2020-6 §6.3 under which PMCF confirms and strengthens an adequately-evidenced pre-certification base and is not invoked to fill or close pre-market evidence gaps.

    Why pre-certification Pillar 3 evidence is not required for these two sub-indications: Pillar 1 and Pillar 2 are two different kinds of test and neither substitutes for Pillar 3. The acceptability of the §6.3 sufficient-evidence determination for these sub-categories in the absence of pre-certification Pillar 3 evidence rests on three cumulative elements: (a) the device's output for these sub-categories is supporting-information-only — the CLAIM is qualified by the Device Output Warning (see R-TF-015-003 §Consolidated limitations of the device and the IFU Device Output Warnings) to the effect that the device provides a probability ranking within its broader ICD-11 output distribution, to be interpreted in the HCP's differential-diagnosis workup; (b) final-diagnosis responsibility for these sub-categories rests wholly with the healthcare professional and is confirmed by histopathological, serological or genetic testing outside the device's loop, consistent with the current standard of care for autoimmune dermatoses and genodermatoses; (c) Pillar 3 real-world clinical performance is pre-specified for post-certification confirmation in PMCF Activities D.1 and D.2, and the pre-certification determination does not depend on pre-certification Pillar 3 data because of (a) and (b). This narrowed-CLAIM-plus-HCP-responsibility-plus-PMCF-confirmation construction is the defensible §6.3 pathway for low-prevalence sub-indications whose definitive diagnosis relies on non-imaging modalities.

    The remaining residual Pillar 1 coverage item is oral lichen planus: the abstract-only anchor identified in the supplementary literature search was excluded because the full text was not publicly available for appraisal at the load-bearing threshold. This residual coverage item is routed to PMCF Activity D.1 for post-market confirmation and is recorded in the State of the Art appendix (R-TF-015-011 §Autoimmune and genodermatoses).

The ten PMCF activities are summarised at CEP level in the table below (per-activity methodology, target sample size, acceptance criterion, milestone / timeline, and contingency), with full details in R-TF-007-002.

| Activity ID | Gap addressed | Methodology | Target sample size | Acceptance criterion | Milestone / timeline | Contingency if acceptance criterion breached |
| --- | --- | --- | --- | --- | --- | --- |
| A.1 | Gap 1: triage & malignancy prioritisation (7GH(c), 3KX(a)) | Prospective registry-based collection of malignancy-prioritisation time-to-specialist data in deployed primary-care sites | ≥ 300 malignancy-prioritised referrals | Median time-to-specialist ≤ published SotA baseline × 0.50 (≥ 50 % reduction, mirroring 3KX(a) device-level criterion) | First interim readout 12 months post-certification; full readout at 24 months | Unscheduled CEP and CER update; root-cause analysis; Article 88 trend-report assessment |
| A.2 | Gap 1: referral-appropriateness confirmation (3KX(b)) | Real-world observational cohort in primary care with specialist adjudication of referral necessity | ≥ 500 referral events | Reduction in unnecessary referrals ≥ 25 % vs. pre-deployment baseline (lower bound of 3KX(b) range) | Interim at 12 months; final at 24 months | As per A.1 |
| A.3 | Gap 1: remote-care adequacy (3KX(c)) | Prospective remote-care adequacy registry across teledermatology deployments | ≥ 200 teledermatology consultations | ≥ 50 % of patients handled remotely with no downstream escalation within 30 days (lower bound of 3KX(c)) | Interim at 12 months; final at 24 months | As per A.1 |
| B.1 | Gap 2: HS severity (5RB) | Prospective multi-centre ICC-agreement study against expert HS consensus | ≥ 50 patients, ≥ 200 longitudinal assessments | ICC ≥ 0.7270 maintained in real-world workflow | Interim at 18 months; final at 30 months | Targeted follow-up study; CEP/CER update |
| B.2 | Gap 2: PASI severity (5RB) | Prospective psoriasis-cohort RMAE study against expert PASI consensus | ≥ 60 patients, ≥ 180 longitudinal assessments | RMAE ≤ 15 % in real-world workflow | Interim at 18 months; final at 30 months | As per B.1 |
| B.3 | Gap 2: UAS severity (5RB) | Prospective urticaria-cohort RMAE study | ≥ 40 patients, ≥ 120 longitudinal assessments | RMAE ≤ 15 % in real-world workflow | Interim at 18 months; final at 30 months | As per B.1 |
| B.4 | Gap 2: SCORAD severity (5RB) | Prospective atopic-dermatitis RMAE study | ≥ 50 patients, ≥ 150 longitudinal assessments | RMAE ≤ 15 % in real-world workflow | Interim at 18 months; final at 30 months | As per B.1 |
| B.5 | Gap 2: androgenetic alopecia severity (5RB) | Prospective androgenetic-alopecia kappa-agreement study | ≥ 50 patients, ≥ 100 assessments | Unweighted kappa ≥ 0.60; correlation ≥ 0.65 | Interim at 18 months; final at 30 months | As per B.1 |
| C.1 | Gap 3: core diagnostic performance monitoring (7GH) | Continuous algorithmic-performance monitoring against deployed reference labels across the 346 ICD-11 analytical space | Rolling, all deployed sites | Top-5 ≥ 70 %; Top-3 ≥ 55 %; Top-1 ≥ 40 %; AUC > 0.8 for malignancy (thresholds per R-TF-007-002) | Rolling quarterly readouts post-certification | Unscheduled CEP / CER update; algorithmic-performance threshold-breach procedure in R-TF-007-002 |
| C.2 | Gap 3: stability monitoring across paediatric / phototype strata | Age-stratified (0-2, 2-12, 12-18 years) and Fitzpatrick-phototype-stratified (I-II / III-IV / V-VI) Top-N performance tracking | Rolling, all deployed sites | Per-stratum Top-3 accuracy within the per-band threshold defined in R-TF-007-002 relative to the overall cohort | Rolling quarterly readouts | As per C.1 |
| D.1 / D.2 (autoimmune + genodermatoses) | Low-prevalence sub-indication categories (§6.3 triangulated pre-certification evidence; PMCF confirms and strengthens) | Prospective surveillance (D.1, autoimmune) and passive surveillance (D.2, genodermatoses) of the autoimmune and genodermatoses ICD-11 subsets, with retrospective analysis of the device output for each identified case; D.2 additionally draws an early Pillar 3-equivalent performance readout from the legacy-predecessor post-market report corpus in the first PMS Update Report (R-TF-007-003) | D.1: target 50 confirmed autoimmune cases within 36 months post-certification; D.2: governed by surveillance and coverage triggers plus the early legacy-corpus readout (no prospective enrolment target) | D.1 primary safety-floor Top-3 ≥ 60 % with non-inferiority secondary (Top-3 ≥ 0.67; 15 pp below V&V-demonstrated 0.820); D.2 primary safety zero-harm with per-case Top-5 concordance reporting and the early legacy-corpus readout | D.1: first interim at 12 months post-certification (or at 15 cases, whichever comes first); final at 36 months or 50 cases. D.2: rolling, annual PMCF Evaluation Report integration; early legacy-corpus readout at approximately 6 months post-certification in R-TF-007-003 | Unscheduled CER update; protocol-driven re-review of the §6.3 sufficient-evidence determination if any primary, non-inferiority, safety or surveillance trigger breaches. No claim-scope review required (Pillar 1 + Pillar 2 anchors carry the pre-certification determination independently of the D-series readouts) |
| E / F (Fitzpatrick V–VI + paediatric) | §3.4 phototype representativeness; paediatric representativeness (§6.5(e) acceptable gaps declared in the CER §Representativeness of the Study Populations and §Pediatric population) | Phototype-stratified real-world performance tracking (including Fitzpatrick V–VI post-MAN_2025 transition to real-world deployment); age-stratified paediatric performance tracking (0–2, 2–12, 12–18 years) within the C-series activities | Driven by real-world case accrual in deployed sites | Declared acceptable-gap thresholds per R-TF-007-002 (§6.5(e) MDCG 2020-6) | Rolling quarterly readouts; trigger-based review | CEP / CER update; claim-scope review if accrual persistently below threshold |
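
The D.1 acceptance criteria in the table can be illustrated with a short Python sketch. It assumes a hypothetical data layout (one ranked list of predicted categories per confirmed case), uses placeholder labels rather than real ICD-11 codes, and compares point estimates only. The function names and that simplification are assumptions for illustration; the statistical method governing the actual readout is the one defined in R-TF-007-002.

```python
from typing import Sequence

# Values quoted in the D.1 row of the table above.
SAFETY_FLOOR_TOP3 = 0.60        # primary safety-floor criterion
NON_INFERIORITY_TOP3 = 0.67     # 15 pp below the V&V-demonstrated 0.820


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    confirmed: Sequence[str],
    k: int,
) -> float:
    """Fraction of cases whose confirmed label appears in the first k predictions."""
    if len(ranked_predictions) != len(confirmed):
        raise ValueError("one ranked list is required per confirmed case")
    if not confirmed:
        raise ValueError("at least one confirmed case is required")
    hits = sum(
        1 for ranks, truth in zip(ranked_predictions, confirmed) if truth in ranks[:k]
    )
    return hits / len(confirmed)


def evaluate_d1(top3: float) -> dict[str, bool]:
    """Compare an observed Top-3 point estimate with the two D.1 criteria."""
    return {
        "safety_floor_met": top3 >= SAFETY_FLOOR_TOP3,
        "non_inferiority_met": top3 >= NON_INFERIORITY_TOP3,
    }


if __name__ == "__main__":
    # Placeholder labels only; these are not real cases or ICD-11 codes.
    predictions = [["A", "B", "C", "D"], ["B", "A", "C"], ["C", "D", "E"], ["D", "E", "F", "A"]]
    truth = ["A", "C", "A", "A"]
    observed = top_k_accuracy(predictions, truth, k=3)
    print(observed, evaluate_d1(observed))
```

A readout between the two values (for example 0.65) would meet the safety floor while missing the non-inferiority criterion, which is why the table lists them as separate triggers.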

The PMCF programme is designed to ensure that the benefit-risk profile remains acceptable and that the device continues to meet state-of-the-art requirements in real-world clinical use.

Clinical Evidence

Appendix III of the MDCG 2020-6 guidance document provides a hierarchy of clinical evidence and the considerations to apply, ranked roughly from strongest to weakest. The table below applies this hierarchy to the types of clinical data used in this clinical evaluation.

| Rank | Types of clinical data and evidence | Used? | MDCG 2020-1 Pillar the rank populates | Type of data used |
| --- | --- | --- | --- | --- |
| 1 | Results of high quality clinical investigations covering all device variants, indications, patient populations, duration of treatment effect, etc. | No | N/A | Rank 1 is not achievable at first CE-marking for any new device of this breadth; per MDR Article 61(1) the level of evidence is proportionate to the device's risk class and intended purpose. The risk-proportionate evidence strategy draws from Ranks 2, 4, 5, 6, 7, 8 and 11 as documented below |
| 2 | Results of high quality clinical investigations with some gaps | Yes | Pillar 3 Clinical Performance (primary) | Prospective clinical studies in real clinical settings: MC_EVCDAO_2019, COVIDX_EVCDAO_2022, DAO_Derivación_O_2022, and the prospective arm of IDEI_2023. Gaps justified by risk assessment and tiered evidence structure; addressed by PMCF |
| 3 | Outcomes from high quality clinical data collection systems such as registries | No | N/A | N/A |
| 4 | Outcomes from studies with potential methodological flaws but where data can still be quantified and acceptability justified | Yes | Pillar 3 Clinical Performance (primary, with methodological limitations quantifiable) | Retrospective arm of IDEI_2023 (mixed prospective/retrospective design), DAO_Derivación_PH_2022 (protocol deviation affecting primary endpoint), NMSC_2025 (BCC/cSCC detection in specialist head-and-neck clinic). AIHS4_2025 (2 patients, 16 assessments) is carried separately as a proof-of-concept / pilot feasibility study for Benefit 5RB; it does not support a CE-marking severity claim on its own and is not included in the Rank 4 pivotal set. The quantitative endpoints of R-TF-015-012 (legacy-device post-market observational study) are carried at Rank 8 by default, with a case for the Appendix III Rank 4 "high quality surveys may also fall into this category" reading presented as a supplementary position (see "Summary of the Combined Strategy → Rank-8 primary classification of R-TF-015-012"). Data remains quantifiable and clinically meaningful. |
| 5 | Equivalence data (reliable / quantifiable) | Yes | Pillar 3 Clinical Performance (equivalence-derived) | Equivalence assessment with the legacy predecessor; carries legacy pivotal-investigation clinical data into the device's evaluation per MDCG 2020-5 and MDR Article 61(5)-(6). The per-study appraisal tables are maintained in the CER (R-TF-015-003) section "Per-study evidence appraisal". |
| 6 | Evaluation of state of the art, including evaluation of clinical data from similar devices; peer-reviewed publications on the device's own algorithms appraised with MINORS ≥12 | Yes | Pillar 1 Valid Clinical Association and Pillar 2 Technical Performance (SotA literature and peer-reviewed severity-validation publications) | Published studies in the literature and collected in the state of the art (R-TF-015-011), plus the peer-reviewed severity-validation publications APASI_2025, AUAS_2023, AIHS4_2023 and ASCORAD_2022 which supply external peer-reviewed Pillar 2 evidence on the device's severity-scoring outputs. Each of the four severity-validation publications is appraised with MINORS ≥12 as the methodological-quality appraisal layered on the Rank 6 classification. The per-study appraisal tables are maintained in the CER (R-TF-015-003). |
| 7 | Complaints and vigilance data; curated data | Yes | Safety confirmation: cross-cut to the three MDCG 2020-1 MDSW pillars (MDR Article 61(1); Annex I §§1, 3, 4, 8; MDCG 2020-6 §§6.1 and 6.3; MEDDEV 2.7/1 Rev 4 §§A7.2 and A7.4; ISO 14971:2019 §§7, 8, 10; MDR Articles 83, 86, 87, 88); see §Definitions "Safety confirmation" | Vigilance and post-market surveillance data from the legacy predecessor device (2020-present): 21 active contracts, ≈ 250,000 diagnostic reports as denominator, 0 MDR Article 87 serious incidents (rule-of-three upper one-sided 95% bound ≤ 0.0012 %), 0 FSCAs (upper one-sided 95% bound ≤ 0.0012 %), 0 Article 88 trend reports triggered, 7 non-serious complaints (3 Category 3a customer-reported events including 1 clinical-output accuracy feedback and 2 API-availability events; 2 Category 4 non-safety complaints; 2 additional non-serious complaints; all closed). Counts are presented with denominators, hazard distribution and trend analysis as required by MDCG 2020-6 §6.3. An inline IMDRF MDCE WG/N56 Appendix F vigilance-denominator appraisal is summarised in "Appendix F appraisal of the legacy predecessor vigilance denominator" immediately below; full appraisal methodology in the legacy device PMS Report (R-TF-007-003). |
| 8 | Proactive PMS data, such as that derived from surveys | Yes | Pillar 3 Clinical Performance (supporting; Likert professional-opinion) | Proactive PMS data (CUS, DUQ, SUS) collected during clinical investigations (e.g., COVIDX_EVCDAO_2022); and the Likert professional-opinion items (questionnaire items B1, B3, B5, C1–C3, D1, D3, D5, E1, F3) of the post-market cross-sectional observational study of the equivalent legacy device (R-TF-015-012), classified at Rank 8 per MDCG 2020-6 Appendix III |
| 9 | Individual case reports on the subject device | No | N/A | N/A |
| 10 | Compliance to non-clinical elements of common specifications considered relevant to device safety and performance | No | N/A | N/A |
| 11 | Simulated use / animal / cadaveric testing involving healthcare professionals or other end users | Yes | Pillar 3 Clinical Performance (MDCG 2020-1 §4.4 supporting) | MRMC studies: BI_2024 (15 HCPs × 100 images), PH_2024 (9 PCPs × 30 images), SAN_2024 (16 HCPs × 29 images), MAN_2025 (16 HCPs × 149 images in the primary analysis cohort; 19 enrolled with 3 documented screen failures, data-lock 17 April 2026; Fitzpatrick V–VI). These do not constitute "clinical data" under the strict MDR Article 2(48) definition (no live-patient data collection) and therefore sit below the prospective real-patient studies in the evidence hierarchy. Per MDCG 2020-1 §4.4 they contribute to Pillar 3 Clinical Performance as supporting evidence, by demonstrating that intended users achieve clinically relevant outputs on images representative of the intended patient population |
| 12 | Pre-clinical and bench testing / compliance to standards | Yes | Pillar 2 Technical Performance | Verification and validation tests (usability, cybersecurity, software development); standards-compliance evidence. The rank-12 placement reflects MDCG 2020-6 Appendix III's residual categorisation of non-clinical bench testing and standards-compliance evidence; it does not diminish the Pillar 2 analytical-evidentiary weight of the AI model verification-and-validation against the manufacturer's curated labelled image database (R-TF-028-005), which is the primary Pillar 2 analytical anchor for the 346-ICD-11 stand-alone analytical claim and is supplemented at Rank 6 by the peer-reviewed severity-validation publications (APASI_2025, AUAS_2023, AIHS4_2023, ASCORAD_2022). |

Appendix F appraisal of the legacy predecessor vigilance denominator

MDCG 2020-6 §6.3 cautions that complaint/incident ratios alone are not sufficient for safety conclusions. The Rank 7 legacy predecessor vigilance denominator declared above is accordingly appraised inline at the CEP level against IMDRF MDCE WG/N56 Appendix F, with full methodology retained in the legacy device PMS Report (R-TF-007-003):

  • Denominator construction: the ≈ 250,000-report denominator is constructed as one record per diagnostic assessment executed by the legacy predecessor across its 21 active commercial contracts over the 2020–present surveillance window. The unit is the individual diagnostic-assessment event (not the patient, the contract or the calendar month), aligned with the legacy predecessor's clinical-use model.
  • De-duplication: repeated assessments on the same patient within the same clinical episode are de-duplicated at the contract level using the contract's internal assessment identifier where available; where contract-side identifiers are unavailable, timestamp-and-case-identifier hashing is applied. The de-duplication procedure is documented in the legacy umbrella PMS Plan (R-TF-007-005).
  • Completeness of complaint capture: completeness is verified across all 21 active contracts via the standing post-market vigilance channel mandated by the legacy umbrella PMS Plan (R-TF-007-005), which requires each contract holder to route any device-related complaint, incident or field safety action to the manufacturer's vigilance desk within the notification windows specified by MDR Articles 87 and 88. Completeness checks are executed at each PMS cycle and retained in R-TF-007-003.
  • Outcome classification (MDR Article 87 / Category 3a / Category 4): each complaint is triaged against the MDR Article 87 serious-incident definition; non-serious events are further classified as Category 3a (customer-reported events with a potential clinical or performance-related connection to the device) or Category 4 (non-safety complaints with no clinical or performance implication). The triage methodology and the classification rubric are documented in the legacy umbrella PMS Plan and reapplied consistently across the ≈ 250,000-report denominator.
  • Scope of conditions covered: the denominator covers the legacy predecessor's deployed ICD-11 range as declared in its MDD indication scope. Coverage is not uniform across the 346 ICD-11 categories within scope of the device under evaluation: high-prevalence categories (e.g., inflammatory, infectious, general other) are over-represented in the denominator relative to low-prevalence categories (e.g., autoimmune, genodermatoses). This distributional skew is documented per category in R-TF-007-003 and is a declared limitation of the Rank 7 safety-confirmation evidence for low-prevalence subgroups; low-prevalence coverage is addressed separately by the §6.3 triangulated pre-certification evidence declared under "PMS aspects" (autoimmune + genodermatoses sub-indication coverage) and by PMCF Activities D.1 and D.2.
  • Consequence for safety-confirmation sufficiency: the Appendix F appraisal confirms that the Rank 7 denominator is a defensible safety-confirmation source under MDCG 2020-6 §6.3 with the declared coverage skew, and that the rule-of-three zero-event upper bound (≤ 0.0012 %) applies to the denominator as constructed. Post-market confirmation is continued under the PMCF programme at the per-activity cadence declared above.
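
As a worked check of the zero-event bound quoted above, the following Python sketch computes the rule-of-three approximation alongside the exact one-sided 95 % binomial bound for 0 observed events. It treats the ≈ 250,000 figure as exactly 250,000 independent assessments, which is a simplifying assumption, and the function name is illustrative only.

```python
import math


def zero_event_upper_bounds(n: int) -> tuple[float, float]:
    """One-sided 95 % upper bounds on the event rate when 0 events occur in n trials.

    Returns (rule_of_three, exact), both as fractions rather than percentages.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of trials")
    alpha = 0.05
    rule_of_three = 3.0 / n  # approximates -ln(0.05) / n, since -ln(0.05) is about 2.996
    # Exact bound solves (1 - p) ** n = alpha; expm1 keeps precision for tiny p.
    exact = -math.expm1(math.log(alpha) / n)
    return rule_of_three, exact


if __name__ == "__main__":
    approx, exact = zero_event_upper_bounds(250_000)
    print(f"rule of three: {approx * 100:.4f} %")
    print(f"exact bound:   {exact * 100:.4f} %")
```

Both figures round to 0.0012 %, so the approximation and the exact bound agree at the precision reported in this Plan. The bound describes the denominator as a whole and does not transfer to low-prevalence subgroups, consistent with the coverage skew declared above.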

Clinical Concerns

Identification and Evaluation Process

Clinical concerns are systematically identified through:

  1. Literature Review: Critical appraisal of published clinical evidence for the device and similar technologies
  2. Risk Management Analysis: Evaluation of known and foreseeable hazards from R-TF-013-002
  3. Regulatory Monitoring: Review of applicable standards (ISO 14971, MEDDEV 2.7/1) and MDCG guidance documents
  4. Stakeholder Feedback: Ongoing engagement with clinical users and subject matter experts
  5. Post-Market Surveillance: Monitoring of adverse events and performance data from commercial devices

Current Status

To date, no unmanaged clinical concerns have been raised that would impede CE-marking submission. All identified residual risks have been:

  • Adequately characterized in the Risk Management File.
  • Mitigated through design controls, labeling, and training requirements.
  • Included in planned clinical investigations for validation.

Mechanism for Future Updates

The clinical evaluation is updated annually, in alignment with the Periodic Safety Update Report (PSUR) cycle that Article 86 of the MDR mandates for this Class IIb device, as formally defined in our Clinical Evaluation procedure (GP-015). This frequency keeps the evaluation current with the clinical data obtained from the implementation of the PMCF plan and the post-market surveillance plan, per Article 61(11) of Regulation (EU) 2017/745 (MDR).

This annual frequency is also consistent with the tiered update options provided in Section 6.2.3 of MEDDEV 2.7/1 Rev 4, as endorsed by MDCG 2020-6 Appendix I, while ensuring full compliance with the primary MDR requirements for Class IIb devices.

The first update is scheduled for one year after initial CE marking, ensuring alignment with the first PSUR and the results of the initial PMCF cycle.

Unscheduled updates of this CEP and of the corresponding CER are triggered whenever any of the following occurs — mirroring the trigger set declared in the CER's "PMCF update triggers" table. Triggers are grouped by lifecycle phase: pre-certification triggers prompt a CEP and SotA revision before the current submission is closed; post-certification triggers prompt an unscheduled CEP and CER update after CE marking.

Pre-certification triggers (applicable before the first CE mark is issued):

  • Any new peer-reviewed publication that changes the SotA baseline used to derive a device-level acceptance criterion in this CEP; such a publication triggers re-derivation of the affected acceptance criterion and a corresponding SotA and CEP revision before submission.
  • Any new competitor device receiving CE marking or FDA clearance in an overlapping indication between the submission and the decision, prompting a comparator-register refresh in R-TF-015-011 and an impact review in this CEP.
  • Any new guideline document redefining standard of care in the device's indication scope, prompting a SotA re-anchoring and a corresponding CEP revision before submission.

Post-certification triggers (applicable after CE marking, alongside the annual PSUR-aligned cadence):

  • Any serious incident per MDR Article 87.
  • Any Field Safety Corrective Action (FSCA) issued under MDR Article 87 / Article 95 for the device or for any device identified as equivalent or comparable.
  • Any trend-report threshold breach in post-market surveillance under MDR Article 88.
  • Any algorithmic-performance threshold breach per R-TF-007-002 (AUC >0.8 for malignancy; Top-5 ≥70%; Top-3 ≥55%; Top-1 ≥40%), or any other PMS/PMCF acceptance-criterion breach.
  • Any new SotA evidence emerging after certification with the potential to change the current evaluation (new peer-reviewed publication, new guideline, or new competitor approval).
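
The algorithmic-performance trigger in the list above amounts to comparing observed metrics with the quoted thresholds. A minimal Python sketch follows, assuming metrics are reported as fractions in a dictionary with hypothetical key names; the breach procedure itself is defined in R-TF-007-002 and is not reproduced here.

```python
# Thresholds quoted in the trigger list (per R-TF-007-002).
# Each entry maps a hypothetical metric key to (threshold, strict).
# strict=True means the metric must be strictly greater than the threshold.
THRESHOLDS: dict[str, tuple[float, bool]] = {
    "malignancy_auc": (0.80, True),   # AUC > 0.8 for malignancy
    "top5": (0.70, False),            # Top-5 >= 70 %
    "top3": (0.55, False),            # Top-3 >= 55 %
    "top1": (0.40, False),            # Top-1 >= 40 %
}


def find_breaches(metrics: dict[str, float]) -> list[str]:
    """Return the metric keys whose observed value fails its threshold."""
    breaches = []
    for name, (threshold, strict) in THRESHOLDS.items():
        if name not in metrics:
            raise KeyError(f"missing metric: {name}")
        value = metrics[name]
        passed = value > threshold if strict else value >= threshold
        if not passed:
            breaches.append(name)
    return breaches


if __name__ == "__main__":
    # Illustrative values only; not observed device performance.
    readout = {"malignancy_auc": 0.85, "top5": 0.72, "top3": 0.54, "top1": 0.41}
    print(find_breaches(readout))
```

Note the asymmetry in the source thresholds: the malignancy AUC criterion is strict, so an AUC of exactly 0.8 counts as a breach, whereas the Top-N criteria are met at equality.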

When triggered, the emergent concern will be:

  1. Documented in this CEP as an updated revision.
  2. Assessed for impact on benefit-risk determination.
  3. Addressed through additional clinical investigations or risk-mitigation measures, if necessary.
  4. Reflected in an updated Clinical Evaluation Report (R-TF-015-003).

CEP Completeness Verification

The following checklist confirms that this Clinical Evaluation Plan addresses all mandatory requirements from MDR Annex XIV Part A, Section 1:

| MDR Annex XIV Requirement | Addressed in CEP | Section/Document |
| --- | --- | --- |
| 1(a) Clinical development plan | ✓ | "Clinical Development Plan" section of this CEP; R-TF-015-004 (CIPs) for each pivotal investigation |
| 1(b) State of the art analysis | ✓ | "State of the art" section of this CEP; R-TF-015-011 (State-of-the-Art record) |
| 1(c) Risk analysis and risk management | ✓ | "Risk management" section of this CEP; R-TF-013-001 (Risk Management Plan); R-TF-013-002 (Risk Management File); R-TF-013-003 (Risk Management Report) |
| 1(d) Identification and assessment of clinical data | ✓ | "State of the art → Literature search" and "Literature appraisal data" sections of this CEP; R-TF-015-011 |
| 1(e) Benefit-risk determination | ✓ | "Acceptability of the benefit-risk ratio" section of this CEP; R-TF-015-003 (benefit-risk conclusion) |
| 1(f) Evaluation team qualifications | ✓ | "Responsibilities: competence of the clinical evaluation team" section of this CEP; Annex I (CVs and Declarations of Interest) |
| 1(g) PMS/PMCF plan | ✓ | "Post-Market Clinical Follow-up (PMCF)" section of this CEP; R-TF-007-001 (PMS Plan); R-TF-007-002 (PMCF Plan) |
| 1(h) Method for updating evaluation | ✓ | "Mechanism for Future Updates" section of this CEP |

Conclusion

This CEP fully satisfies MDR Article 61(1) and Annex XIV Part A requirements for Stage 0 of the clinical evaluation process. All mandatory elements have been documented and evidence has been compiled to support conformity assessment with the General Safety and Performance Requirements #1, #8, and #17.

Annexes

  • Annex I: CVs and Declarations of Interest.

Signature meaning

The signatures for the approval process of this document can be found in the verified commits of the QMS repository. For reference, the team members expected to participate in this document and their roles in the approval process, as defined in the Annex I Responsibility Matrix of GP-001, are:

  • Author: Team members involved
  • Reviewer: JD-018 Clinical Research Coordinator
  • Approver: JD-022 Medical Manager
All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, either directly or through third parties, by any means, without the prior written permission of Legit.Health (AI Labs Group S.L.)