Response

The sufficiency of the clinical evidence is justified through a structured, risk-proportionate analysis addressing quantity, quality, coverage of indications, coverage of patient populations, and identified gaps. The justification framework follows MDCG 2020-6 (evidence hierarchy and gap management), MEDDEV 2.7.1 Rev 4 Sections 9–10 (appraisal and analysis methodology), and MDCG 2020-1 (MDSW clinical evidence requirements). We have updated R-TF-015-003 (Clinical Evaluation Report) with a dedicated "Justification of sufficiency of clinical evidence" section that presents this analysis in full.

Per MDCG 2020-1, sufficiency for medical device software is demonstrated across three pillars: Valid Clinical Association (VCA), established through the systematic literature review in R-TF-015-011 linking the visual morphological features processed by the AI model to the ICD-11 dermatological categories; Technical Performance, demonstrated through algorithm validation studies and the AI development report (R-TF-028-005); and Clinical Performance, demonstrated through the clinical investigations described below and through the post-market observational study of the equivalent legacy device under R-TF-015-012, both conducted in real-world clinical settings with real patients. The evidence base, comprising Rank 2–4 pre-market clinical investigations, Rank 4 post-market observational evidence from the equivalent legacy device (R-TF-015-012, conducted under MDCG 2020-6 § 6.2.2), Rank 7 passive legacy market surveillance data consolidated in R-TF-007-003, Rank 8 Likert professional-opinion data from the same post-market study, Rank 11 simulated-use reader studies, and published Technical Performance evidence (MDCG 2020-1 Pillar 2), is sufficient to support initial CE marking. PMCF activities provide essential prospective confirmation per MDCG 2020-6 § 6.4, with targeted activities for the two declared acceptable gaps (Section 3 below) as explicitly permitted under MDCG 2020-6 § 6.5(e).

1. Evidence quantity and quality: characterised by MDCG 2020-6 Appendix III ranking

The clinical evidence portfolio comprises nine pre-market clinical investigations involving over 800 patients and 60+ healthcare professionals across multiple clinical settings, plus a post-market cross-sectional observational study of the equivalent legacy device (R-TF-015-012, a study-specific Protocol nested inside the legacy umbrella PMS Plan R-TF-007-005) conducted at 21 clinical sites with 60 responses collected and an analysis set of N = 56 after application of the pre-specified evidence-quality substantiation principle stated in the protocol's Section 10.7 (requiring Section F safety responses to be substantiated by descriptive elaboration), plus the consolidated passive legacy-market surveillance data reported in the legacy umbrella PMS Report (R-TF-007-003). We have updated the CER to characterise each evidence source by its MDCG 2020-6 Appendix III rank, rather than applying a uniform quality label:

Rank 2 (high-quality prospective clinical investigations): MC_EVCDAO_2019 (105 patients, melanoma-suspected lesions, prospective, single-centre), COVIDX_EVCDAO_2022 (160 patients, prospective, 6 dermatologists), and DAO_Derivación_O_2022 (117 patients, prospective, primary care referrals). These constitute the primary pre-market clinical evidence.
Rank 4 (pre-market clinical investigations with acknowledged methodological limitations, data quantifiable and acceptability justifiable): IDEI_2023 (202 patients, pigmented lesions and alopecia, prospective and retrospective) and DAO_Derivación_PH_2022 (131 patients, prospective, primary care referrals). These studies were conducted in real clinical workflows and constitute clinical data per MDR Article 2(48).
Rank 4 (post-market cross-sectional observational study of the equivalent legacy device, classified as a "high quality survey" under MDCG 2020-6 Appendix III): R-TF-015-012, a retrospective cross-sectional observational study conducted under a formal PMS Study Protocol at all 21 legacy device client institutions, with 60 responses collected and an analysis set of N = 56 (34 dermatologists, 13 primary care physicians, 9 hospital managers) after four responses were excluded as unsubstantiated safety flags under the protocol's Section 10.7 evidence-quality substantiation principle. The study's methodological limitations are acknowledged in full in the CER and in the companion study report — physician-reported perceived outcomes rather than independently measured patient outcomes, potential recall bias partially mitigated by a pre-specified data-source sensitivity analysis (aggregate records-consulted proportion 36.0%, exceeding the protocol's ≥ 30% threshold), and a non-randomised cross-sectional design partially mitigated by pre-specified published SotA comparators. The study protocol was adopted on 7 November 2025 under the manufacturer's standing MDR Article 83 proactive PMS programme for the equivalent legacy device. The study defines three co-primary endpoints — one per declared clinical benefit (B2, C4, D4) — pre-specified MCID thresholds derived from published SotA, published SotA comparators per endpoint, a Holm-Bonferroni multiplicity correction for family-wise error control at α = 0.05, a sensitivity analysis stratifying results by data source reliability (record-consulted vs. professional estimate), safety surveillance (Section F of the instrument), and transparently acknowledged methodological limitations. All three co-primary endpoints exceed their MCIDs after Holm-Bonferroni correction. 
The quantitative study outcomes qualify for Rank 4 per the MDCG 2020-6 Appendix III considerations column, which explicitly states that "high quality surveys may also fall into this category." The study is introduced into the CER as post-market clinical evidence per MDCG 2020-6 § 6.2.2 and complies with the gap-bridging expectations of MDCG 2020-6 § 6.5(e).
Rank 11 (simulated use with healthcare professionals): BI_2024 (100 images, 15 HCPs), PH_2024 (30 images, 9 PCPs), SAN_2024 (29 images, 16 practitioners). These multi-reader, multi-case (MRMC) studies are classified as simulated use per MDCG 2020-6 Appendix III. They do not constitute "clinical data" under the strict MDR Article 2(48) definition (no live-patient data collection) and therefore sit at a lower evidence rank than the prospective real-patient studies; however, per MDCG 2020-1 §4.4 they contribute to Pillar 3 Clinical Performance because they demonstrate that intended users achieve clinically relevant outputs on images representative of the intended patient population.
Rank 7 (complaints and vigilance data): The equivalent legacy device has been in clinical use since 2020, generating approximately 250,000 clinical reports across 21 contracts. The consolidated legacy umbrella PMS Report (R-TF-007-003, prepared per MDR Article 85 applicable via MDR Article 120(3); paired with the legacy umbrella PMS Plan R-TF-007-005) records zero Article 87 serious incidents, zero Article 88 trend reports, zero FSCAs and zero product recalls across the reporting period; three customer-reported events (one clinical-output accuracy feedback and two API availability events) and two non-safety complaints, all closed, with no patient harm reported in any case. This data is clinical data per MDR Article 2(48), appraised using IMDRF MDCE WG/N56 Appendix F quality criteria as endorsed by MDCG 2020-6 Appendix I. The use of this legacy market data as clinical evidence is grounded in MDCG 2020-6 § 6.2.2, which addresses the use of post-market data from legacy devices in the clinical evaluation of the transitioning device.
Rank 8 (proactive PMS data, such as surveys and professional opinion): The Likert professional-opinion items of R-TF-015-012 (B1, B3, B5, C1–C3, D1, D3, D5, E1, F3) contribute as supporting evidence alongside the study's Rank 4 quantitative outcomes. The Likert data is interpreted alongside, not in place of, the objective performance metrics generated by the pre-market clinical investigations and the post-market quantitative endpoints.
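The Holm-Bonferroni correction named for the three co-primary endpoints of R-TF-015-012 can be illustrated with a minimal sketch. The raw p-values below are hypothetical placeholders, not study results, and the function name `holm_adjust` is an assumption introduced for illustration; only the endpoint labels (B2, C4, D4) and α = 0.05 come from the text above.

```python
from typing import List


def holm_adjust(p_values: List[float]) -> List[float]:
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the raw p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is scaled by (m - k + 1); rank is 0-based.
        scaled = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, scaled)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for the three co-primary endpoints (NOT study results).
raw = {"B2": 0.004, "C4": 0.010, "D4": 0.030}
adj = dict(zip(raw, holm_adjust(list(raw.values()))))

ALPHA = 0.05  # family-wise error rate stated in the protocol
for endpoint, p in adj.items():
    verdict = "significant" if p < ALPHA else "not significant"
    print(endpoint, round(p, 4), verdict)
```

An endpoint is reported as significant only if its adjusted p-value stays below α, which is what controls the family-wise error rate across the three benefits.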

The sufficiency argument rests on the portfolio as a whole: primary pre-market evidence from Rank 2–4 real-world clinical studies, reinforced by Rank 4 post-market observational evidence from routine clinical practice at 21 sites, corroborated by Rank 7 passive surveillance, and supported by Rank 8 professional-opinion data and Rank 11 simulated-use studies. Even if a clinical evaluator were to reclassify R-TF-015-012 from Rank 4 to Rank 8, the sufficiency conclusion would be preserved because the study's quantitative findings converge with the Rank 2–4 prospective clinical investigations and the Rank 11 MRMC supporting evidence. This characterisation is now explicit in R-TF-015-003 and in the CEP evidence-hierarchy table of R-TF-015-001.

2. Coverage of indications: 3-tier evidence structure with 7-category epidemiological framework

We have updated the CER to justify coverage of all relevant indications using a risk-proportionate, 3-tier evidence structure and a 7-category epidemiological framework based on the Global Burden of Skin Disease (Karimkhani et al. 2017).

The 7 epidemiological categories and their evidence coverage are: infectious diseases (57% of global burden, 4 studies), other conditions (19%, 7 studies), inflammatory diseases (15%, 7 studies), malignant and pre-malignant neoplasms (5%, 7 studies), autoimmune diseases (3%, 2 studies), genodermatoses (1%, 0 studies), and vascular conditions (1%, 4 studies). The evidence coverage matrix, showing which studies cover which categories, is now documented in R-TF-015-003.

The 3-tier evidence structure assesses the clinical evidence at a depth proportionate to the clinical risk of misclassification:

Tier 1: Malignant conditions (individual analysis). Individual acceptance criteria per condition or condition group. MC_EVCDAO_2019 provides dedicated melanoma evidence (AUC 0.8482, Top-3 sensitivity 0.9032 for melanoma; AUC 0.8983 for overall malignancy detection). Six additional studies contribute malignancy data across melanoma, BCC, SCC, and actinic keratosis. This satisfies MEDDEV 2.7.1 Rev 4 Annex A7.3 (per-indication sensitivity/specificity for major clinical indications) and MDCG 2020-6 (no unsupported pooling for high-risk indications).
Tier 2: Rare diseases (grouped analysis). Grouped analysis with a dedicated acceptance criterion. BI_2024 and PH_2024 provide evidence for rare disease diagnosis: +26.77% improvement in Top-1 accuracy for conditions including GPP, acne conglobata, palmoplantar pustulosis, subcorneal pustular dermatosis, and pemphigus vulgaris. This satisfies MDCG 2020-6 § 6.5(e) by demonstrating specific evidence for a higher-risk subgroup.
Tier 3: General conditions (pooled with risk-based justification). Pooled analysis across non-malignant, non-rare categories. The risk-based justification for pooling, now documented in the CER, rests on four factors: (a) comparable clinical consequence of misclassification within these categories (delayed or modified treatment, not mortality, with the physician's clinical assessment providing the safety net); (b) a single probability distribution output across all ICD-11 categories simultaneously, so pooled assessment reflects how the device actually works; (c) representative sampling across epidemiological categories, demonstrated by the coverage matrix covering 5 of 7 categories (97% of presentations); and (d) a consistent Vision Transformer architecture that processes all inputs through the same feature extraction pipeline.
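The Top-1 and Top-3 metrics quoted in the tiers above are rank-based readings of the device's probability distribution. The following is a minimal sketch of how such a Top-k metric can be computed, assuming each case is represented as a mapping from category label to probability. The helper names, category labels, and probabilities are hypothetical and do not reproduce the manufacturer's validation code or data.

```python
from typing import Dict, List, Sequence


def top_k_labels(probs: Dict[str, float], k: int) -> List[str]:
    """Return the k categories with the highest predicted probability."""
    return sorted(probs, key=probs.get, reverse=True)[:k]


def top_k_accuracy(
    predictions: Sequence[Dict[str, float]], truths: Sequence[str], k: int
) -> float:
    """Fraction of cases whose reference diagnosis is among the top-k categories."""
    if len(predictions) != len(truths) or not truths:
        raise ValueError("predictions and truths must be non-empty and of equal length")
    hits = sum(truth in top_k_labels(p, k) for p, truth in zip(predictions, truths))
    return hits / len(truths)


# Hypothetical outputs for three cases; labels and probabilities are illustrative only.
cases = [
    {"melanoma": 0.55, "nevus": 0.30, "seborrhoeic keratosis": 0.15},
    {"melanoma": 0.20, "nevus": 0.45, "seborrhoeic keratosis": 0.35},
    {"melanoma": 0.25, "nevus": 0.60, "seborrhoeic keratosis": 0.15},
]
reference = ["melanoma", "seborrhoeic keratosis", "melanoma"]

print(top_k_accuracy(cases, reference, k=1))
print(top_k_accuracy(cases, reference, k=2))
```

Because every case is scored against the same distribution over all categories, the same function applies whether results are reported per condition (Tier 1), per group (Tier 2), or pooled (Tier 3); only the subset of cases passed in changes.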

3. Declared acceptable gaps: MDCG 2020-6 § 6.5(e)

Per MDCG 2020-6 § 6.5(e), when evidence is insufficient for an indication, the gap must be declared acceptable with justification and addressed via PMCF. Two epidemiological categories are declared as acceptable evidence gaps in the CER (R-TF-015-003, "Need for more clinical evidence"):

Gap 4: Autoimmune diseases (3% of dermatological presentations): Only bullous pemphigoid (5 cases in DAO_Derivación_O_2022) provides autoimmune-specific evidence not already counted in Tier 2. Gap declared acceptable because autoimmune conditions typically require serological confirmation beyond visual assessment, the device is a decision-support tool, and misranking does not carry acute mortality risk comparable to malignancy. Addressed by PMCF Activity D.1 (prospective surveillance, 50-case target, Top-3 accuracy >= 60%).
Gap 5: Genodermatoses (1% of presentations): Zero direct representation. Gap declared acceptable because these conditions are diagnosed through genetic testing and clinical history, extreme rarity makes prospective recruitment impractical, and the device's role is supportive. Addressed by PMCF Activity D.2 (passive surveillance, safety-trigger-based).

Coverage of the remaining 5 categories (97% of presentations) is demonstrated by the coverage matrix. The coverage is strong for other conditions, inflammatory diseases, and malignant neoplasms (7 studies each); moderate for infectious diseases (4 studies); and adequate for vascular conditions (4 studies, predominantly benign presentations).
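The 97% figure follows directly from the burden shares quoted in Section 2 once the two declared gaps are set aside. A minimal sketch of that arithmetic is given below; the data structure and variable names are assumptions introduced for illustration, while the shares and study counts are the ones stated above.

```python
# Burden share (%) and number of contributing studies, as quoted in Section 2.
categories = {
    "infectious diseases": (57, 4),
    "other conditions": (19, 7),
    "inflammatory diseases": (15, 7),
    "malignant and pre-malignant neoplasms": (5, 7),
    "autoimmune diseases": (3, 2),
    "genodermatoses": (1, 0),
    "vascular conditions": (1, 4),
}

# Categories declared as acceptable evidence gaps in Section 3.
declared_gaps = {"autoimmune diseases", "genodermatoses"}

covered = {name: v for name, v in categories.items() if name not in declared_gaps}
covered_share = sum(share for share, _ in covered.values())
gap_share = sum(categories[name][0] for name in declared_gaps)

print(len(covered), "of", len(categories), "categories covered")
print("covered share:", covered_share, "% ; declared gaps:", gap_share, "%")
# Note: the quoted shares total 101, which suggests they are rounded values,
# so covered and gap shares need not sum to exactly 100.
```

Coverage here is defined by the gap declaration rather than by a non-zero study count, which is why autoimmune diseases are excluded even though two studies contribute some autoimmune cases.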

4. Coverage of patient populations

The CER has been updated with a comprehensive demographic analysis. Study populations span paediatric (6.3%), adult (69.5%), and geriatric (19.6%) patients, with balanced gender distribution (45.4% male, 54.6% female), and Fitzpatrick skin phototypes I through V. Phototype V is present but in limited numbers; phototype VI has minimal representation. Both are identified as monitoring priorities in the PMCF plan. This underrepresentation is communicated to users in the IFU (Important Safety Information, "Population and performance variability"), which instructs clinicians to exercise particular judgment for underrepresented populations.

To strengthen the pre-market evidence for Fitzpatrick V–VI at CE-marking, a dedicated multi-reader, multi-case (MRMC) simulated-use study, MAN_2025, has been conducted. The study evaluates whether the device improves the Top-1 diagnostic accuracy of healthcare professionals on 149 clinical images representing Fitzpatrick V–VI presentations of multiple dermatological conditions, sourced from the same case pool as the three source MRMC studies (SAN_2024, BI_2024, PH_2024) so that results are directly comparable across phototype groups. Nineteen healthcare professionals spanning dermatology, primary care and nursing were enrolled; the primary analysis cohort comprises the sixteen readers who met the CIP §Inclusion criteria and substantially completed the study at the 17 April 2026 data lock. MAN_2025 is classified as Rank 11 (simulated use with healthcare professionals) per MDCG 2020-6 Appendix III, contributes supporting Pillar 3 Clinical Performance evidence per MDCG 2020-1 §4.4 (intended users achieving clinically relevant outputs on images representative of Fitzpatrick V–VI presentations of the intended patient population), and is referenced in the updated CER §Sufficiency determination. Full documentation is provided in the Clinical Investigation Plan (R-TF-015-004) and Clinical Investigation Report (R-TF-015-006) for MAN_2025.

5. Coverage of clinical benefits

The three consolidated clinical benefits are all supported by the evidence portfolio, with all acceptance criteria achieved. Each acceptance criterion was derived from published SotA performance benchmarks identified and appraised in R-TF-015-011 (State of the Art); the derivation methodology and specific SotA articles underpinning each threshold are documented in R-TF-015-003, "Acceptance Criteria Derivation from State of the Art":

Benefit 7GH: Diagnostic Accuracy (all presentations). Sub-criterion (a) general conditions: +18.5% aggregate Top-1 accuracy improvement (criterion >= 15%), 70 supporting claims. Sub-criterion (b) rare diseases: 54.8% aggregate Top-1 accuracy (criterion >= 54%), 24 claims. Sub-criterion (c) malignant lesions: AUC 0.97 (criterion >= 0.90), 20 claims. Supported by 7 studies spanning Ranks 2–4, 7–8, and 11. Routine-practice confirmation from the post-market observational study R-TF-015-012: across 21 clinical sites, the N = 56 analysis set reported a mean diagnostic-assessment change rate of 18.77% (co-primary endpoint B2; MCID 5%; Holm-adjusted p < 0.05), with supportive endpoints B4 (rare-disease identification, 7.30 cases per year) and B6 (malignancy detection, 14.68 cases per year) both exceeding their MCIDs.
Benefit 5RB: Objective Severity Assessment. The evidence for severity assessment is structured across two distinct evidence types per MDCG 2020-1. Technical Performance evidence (Pillar 2) is provided by 4 published peer-reviewed validation studies that demonstrate algorithm-level concordance with expert dermatologist consensus on internationally validated severity scales. Clinical Performance evidence (Pillar 3) is provided by the AIHS4_2025 study (2 patients, 16 longitudinal assessments, ICC 0.727), by the post-market observational study R-TF-015-012 (co-primary endpoint C4: 36.23 treatment decisions per year directly informed by the device's severity scores, MCID 10/yr, Holm-adjusted p < 0.05; supportive endpoint C5: longitudinal-monitoring rate 30.53%, MCID 5%), and will be substantially expanded through prospective PMCF activities. Per-condition acceptance criteria and results from the published Technical Performance literature are: Sub-criterion (a) HS severity (IHS4): ICC >= 0.70; observed ICC 0.727 from AIHS4_2025 (real-world clinical setting, limited sample). Sub-criterion (b) psoriasis severity (PASI): visual sign classification accuracy >= human annotator consensus; observed 60.6% vs 52.5% for erythema, from Mac Carthy et al., JEADV Clinical Practice, 2025 (2,857 images, 4 expert dermatologists). Sub-criterion (c) urticaria severity (UAS): Krippendorff alpha >= 0.60; observed alpha 0.826 for hive counting, from Mac Carthy et al., JID Innovations, 2024 (313 images, 5 dermatologists). Sub-criterion (d) AD severity (SCORAD): RMAE <= 15%; observed RMAE 13.0%, from Medela et al., JID Innovations, 2022. An additional published validation (Hernández Montilla et al., Skin Research and Technology, 2023; 221 images, 6 dermatologists) demonstrates HS severity assessment comparable to the most expert physician. This Technical Performance evidence establishes algorithm-level validity across 4 conditions using retrospective atlas-based datasets. 
Prospective validation with device-captured clinical images is a primary objective of PMCF activities B.1–B.5 (target: 100+ patients per condition), which will provide essential Clinical Performance evidence confirming real-world severity scoring performance. Clinical investigations contributing severity data: AIHS4_2025, COVIDX_EVCDAO_2022, IDEI_2023.
Benefit 3KX: Care Pathway Optimisation. Sub-criterion (a) waiting times: 56% reduction (criterion >= 50%), 14 claims. Sub-criterion (b) referral adequacy: 38% reduction in unnecessary referrals (criterion >= 30%), 8 claims. Sub-criterion (c) remote care: +30% improvement in referral sensitivity and 100% expert consensus (criterion >= 75%), 5 claims. Supported by 5 studies. Routine-practice confirmation from the post-market observational study R-TF-015-012: co-primary endpoint D4 (referral-adequacy improvement) observed at 15.56% (MCID 5%; Holm-adjusted p < 0.05), which aligns with the published SotA range of 14–24% reduction in unnecessary referrals for medical-device-assisted triage; supportive endpoints D2 (waiting-time reduction 14.53%), D6 (remote-assessment adequacy 47.76%) and D7 (remote-volume increase 24.64%) all exceed their MCIDs.
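The acceptance-criterion comparisons listed for the three benefits can be tabulated mechanically. The sketch below uses only observed values and thresholds quoted above; the `Criterion` structure and `passes` helper are assumptions introduced for illustration. Sub-criterion 3KX (c) is omitted because the text reports two metrics against a single threshold, and the PASI entry uses the quoted human annotator consensus (52.5%) as its threshold.

```python
import operator
from typing import Callable, NamedTuple


class Criterion(NamedTuple):
    label: str
    observed: float
    threshold: float
    compare: Callable[[float, float], bool]  # operator.ge or operator.le


criteria = [
    Criterion("7GH (a) Top-1 improvement, %", 18.5, 15.0, operator.ge),
    Criterion("7GH (b) rare-disease Top-1, %", 54.8, 54.0, operator.ge),
    Criterion("7GH (c) malignancy AUC", 0.97, 0.90, operator.ge),
    Criterion("5RB (a) IHS4 ICC", 0.727, 0.70, operator.ge),
    Criterion("5RB (b) PASI erythema accuracy vs annotators, %", 60.6, 52.5, operator.ge),
    Criterion("5RB (c) UAS Krippendorff alpha", 0.826, 0.60, operator.ge),
    Criterion("5RB (d) SCORAD RMAE, %", 13.0, 15.0, operator.le),
    Criterion("3KX (a) waiting-time reduction, %", 56.0, 50.0, operator.ge),
    Criterion("3KX (b) unnecessary-referral reduction, %", 38.0, 30.0, operator.ge),
]


def passes(c: Criterion) -> bool:
    """True when the observed value satisfies the criterion in its stated direction."""
    return c.compare(c.observed, c.threshold)


for c in criteria:
    direction = ">=" if c.compare is operator.ge else "<="
    status = "met" if passes(c) else "NOT met"
    print(f"{c.label}: observed {c.observed} (criterion {direction} {c.threshold}) -> {status}")
```

Keeping the comparison direction explicit per criterion avoids misreading error-type metrics such as RMAE, where lower values are better.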

6. Pillar 1 Valid Clinical Association: surrogate-endpoint validity anchoring

The device is a Class IIb medical device software under MDR Rule 11, and clinical benefit is demonstrated through performance-based endpoints (diagnostic accuracy, sensitivity, specificity, AUC) and workflow-related endpoints (referral appropriateness, waiting-time reduction, severity-score-driven treatment decisions) rather than through direct patient-outcome endpoints. To strengthen Pillar 1 Valid Clinical Association evidence for this indirect-benefit demonstration, the State of the Art review R-TF-015-011 has been extended with a targeted anchoring section titled "Surrogate endpoint validity". This section independently establishes, from peer-reviewed dermatology literature and regulator-accepted clinical-endpoint history, that the three surrogate-endpoint families underlying the declared clinical benefits are accepted proxies for patient-relevant outcomes: (i) diagnostic accuracy (benefit 7GH) is anchored to melanoma-specific survival via the AJCC 8th-edition staging evidence and to overall survival via the National Cancer Database time-to-surgery cohort analysis; (ii) severity scoring (benefit 5RB) is a regulator-accepted primary endpoint under the EMA CHMP/EWP/2454/02 guideline for psoriasis, the HOME international consensus for atopic eczema, and the NAAF Severity of Alopecia Tool for alopecia areata, with the severity-score-to-quality-of-life linkage quantified across thirteen biologic RCTs; (iii) referral optimisation and care-pathway metrics (benefit 3KX) are anchored by RCT-level outcome equivalence of teledermatology vs. in-person care and by large real-world waiting-time reduction with preserved diagnostic fidelity. Thirteen targeted anchoring references have been added to R-TF-015-011 with CRIT1-7 appraisal applied using the same methodology as the primary and first supplementary search corpora.
The Clinical Evaluation Report R-TF-015-003 has been updated with a new subsection titled "Causal pathway and clinical meaningfulness of the selected endpoints" that articulates, per benefit, the causal pathway from device output to patient-relevant outcome and positions each surrogate as clinically meaningful in the peer-reviewed literature and in regulator-accepted clinical-endpoint history. The CEP R-TF-015-001 evidence-hierarchy planned source for Pillar 1 VCA has been updated to reference the new anchoring section. Together, the primary SotA literature (performance benchmarks), the April 2026 thematic supplementary search (coverage and subgroup evidence) and this anchoring layer (surrogate-to-outcome validity) close MDCG 2020-1 Pillar 1 for each of the three declared clinical benefits.

7. Safety sufficiency

Safety is confirmed by three independent data sources. First, no serious adverse events or device-related complications were observed across all nine pre-market clinical investigations (800+ patients). At zero observed events among at least 800 patients, the upper 95% confidence bound for the true adverse event rate is at most 3/800 = 0.375% (rule of three per MEDDEV 2.7.1 Rev 4 Annex A7.4). Second, the equivalent legacy device's market experience over approximately four years and 21 contracts (approximately 250,000 clinical reports) is consolidated in the legacy umbrella PMS Report R-TF-007-003 prepared under MDR Article 85 (paired with the legacy umbrella PMS Plan R-TF-007-005), recording zero Article 87 serious incidents, zero Article 88 trend reports, zero FSCAs and zero product recalls; three customer-reported events (one clinical-output accuracy feedback and two API availability events) and two non-safety complaints, all closed, with no patient harm reported in any case. This independent real-world safety confirmation is appraised under MDCG 2020-6 § 6.2.2. Third, the post-market observational study R-TF-015-012 incorporates a proactive safety-surveillance section (items F1–F4) administered alongside the benefit questions, consistent with MDR Article 83(1). F3 (physician-reported overall perceived safety) returned a mean of 4.14 on a 5-point Likert scale, indicating strong physician confidence in the device's safety in routine use.
F1 (physician observation of any case where device output was misleading) sits at 26.8% (15 / 56) in the analysis set, below the pre-specified 30% follow-up threshold, so the protocol-specified F1 follow-up is not triggered; the 15 substantiated F1 = Yes responses were nonetheless reviewed thematically in R-TF-007-003 § 4.7.5 and are consistent with the device's intended-use architecture (the device is designed, labelled and deployed as a clinical decision-support tool whose outputs are interpreted by a supervising healthcare professional, with this use condition specified as a manufacturer-mandated integration requirement in the Instructions for Use). Prior to the application of the protocol's Section 10.7 evidence-quality substantiation principle — which excluded four responses as unsubstantiated safety flags (see the companion study report, "Data-quality exclusions") — the F1 proportion was 19 / 60 (31.7%), marginally above the threshold; the drop from 31.7% to 26.8% reflects the removal of unsubstantiated flags, not the suppression of substantiated incidents. The F4 responses (4 / 56 reporting formal adverse-event logging to their institution's vigilance system) have been cross-referenced against the R-006-002 registry, and no unreported serious incident was identified.
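The rule-of-three bound and the F1 threshold comparison above are simple arithmetic and can be reproduced as follows. This is a minimal sketch: the function names are assumptions introduced for illustration, n = 800 is taken as the conservative denominator for "800+ patients", and the exact bound is the standard zero-event solution of (1 - p)^n = 0.05, which the rule of three approximates through -ln(0.05) ≈ 3.

```python
def rule_of_three_upper(n: int) -> float:
    """Approximate one-sided 95% upper bound on the event rate with 0 events in n subjects."""
    return 3.0 / n


def exact_zero_event_upper(n: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound with 0 events: solve (1 - p)^n = 1 - confidence for p."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


n_patients = 800  # conservative denominator for the "800+ patients" quoted above
print(f"rule of three: {rule_of_three_upper(n_patients):.4%}")
print(f"exact bound:   {exact_zero_event_upper(n_patients):.4%}")

# F1 proportions before and after the Section 10.7 exclusions, against the 30% threshold.
F1_THRESHOLD = 0.30
f1_counts = {"all responses": (19, 60), "analysis set": (15, 56)}
for label, (yes, total) in f1_counts.items():
    proportion = yes / total
    triggered = proportion >= F1_THRESHOLD
    print(f"{label}: {yes}/{total} = {proportion:.1%}, follow-up triggered: {triggered}")
```

The exact bound is slightly tighter than 3/n, so quoting the rule-of-three value is the more conservative of the two.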

8. Lifetime sufficiency through PMCF

The sufficiency argument extends beyond pre-market data. The PMCF plan (R-TF-007-002) describes a continuous evaluation lifecycle with 11 targeted clinical activities (A.1 through D.2) and 4 general surveillance methods. These are explicitly mapped to all three clinical benefits, all identified risks with clinical impact, and both declared evidence gaps. The annual PMCF evaluation cycle feeds directly into CER updates per GP-015, ensuring continuous reassessment of the benefit-risk profile throughout the device's market life as required by MDR Article 61(11) and Annex XIV Part B.

For further details, refer to the updated R-TF-015-003 (sections: "Tiered evidence assessment strategy", "Evidence coverage by disease category", "Need for more clinical evidence", "Post-market clinical study of the equivalent legacy device (R-TF-015-012)", "Justification of sufficiency of clinical evidence"), to the companion documents of the post-market observational study of the equivalent legacy device (study Protocol R-TF-015-012 and its Study Report, both nested inside the legacy umbrella PMS Plan R-TF-007-005; results consolidated in the legacy umbrella PMS Report R-TF-007-003 prepared under MDR Article 85), all held in the technical file, and to R-TF-015-001 (section 11 evidence hierarchy, updated to mark Rank 4, Rank 7 and Rank 8 as used, with the specific evidence sources identified). Red-lined documentation is provided.