
Research and planning

Internal working document

This document is for internal use only. It contains analysis, gap identification, and response strategy for Item 3b of the BSI Clinical Review Round 1. It will not be included in the final response to BSI.

1. What BSI is asking

Item 3b says: "Please provide justification that sufficient data in quantity and quality has been analyzed in order to support the clinical benefit, safety, and performance of the device as compared to SotA in its intended use, including for all of the relevant patient populations and indications."

Where Item 3a asks us to identify and analyse all clinical data, Item 3b asks us to justify that the data is sufficient. This is a distinct regulatory obligation under:

  • Annex XIV (2): The clinical evaluation shall be "thorough and objective" and its "depth and extent shall be proportionate and appropriate to the nature, classification, intended purpose and risks of the device."
  • Article 61(1): The manufacturer shall "specify and justify the level of clinical evidence necessary."
  • MDCG 2020-6 Appendix III: Hierarchy of clinical evidence types and considerations for sufficiency.

The word "sufficient" has three dimensions BSI will evaluate:

  1. Quantity: Enough subjects, enough studies, enough statistical power
  2. Quality: Study designs, methodological rigour, appraisal scores
  3. Coverage: All clinical benefits, all indications, all relevant patient populations, all intended users, safety endpoints

2. Current state of the sufficiency argument in the CER

What the CER claims (R-TF-015-003)

Line 840: "The adequacy of the number of observations, gathered from over 800 patients across eight pivotal studies, is justified for both performance and safety. Regarding performance, the sample size was formally calculated to ensure sufficient statistical power to validate the primary performance endpoints, based on detecting an effect size exceeding the 80% performance goal..."

Line 873: "The current body of evidence is sufficient to demonstrate the conformity of Legit.Health Plus with the General Safety and Performance Requirements (GSPRs) of the MDR 2017/745."

Why BSI finds this insufficient

The CER makes a top-level sufficiency claim but does not provide the structured, granular justification BSI expects. Specifically:

  1. No mapping from each clinical benefit → supporting studies → evidence adequacy
  2. No mapping from each indication → coverage across studies → gaps identified
  3. No patient population breakdown showing demographic representativeness
  4. No explicit comparison of evidence strength vs SotA for each claim
  5. The MDCG 2020-6 evidence hierarchy table in the CEP (lines 692–710) marks "No" for Rank 5 (equivalence data), Rank 7 (complaints/vigilance), and Rank 8 (PMS data) — despite claiming equivalence and having PMS data available. This directly contradicts the CER's own narrative.

3. Inventory of clinical evidence

3.1. Study portfolio

| Study | Design | N | Population | Indications | Key domains | User group |
|---|---|---|---|---|---|---|
| AIHS4 2025 | Retrospective, longitudinal | 2 patients (16 assessments) | HS patients | Hidradenitis suppurativa | Severity assessment | Dermatologists |
| BI 2024 | Prospective, cross-sectional | 100 images, 15 practitioners | Mixed conditions | GPP, HS, multiple | Diagnostic accuracy, rare diseases | PCPs + dermatologists |
| COVIDX 2022 | Prospective, cross-sectional | 160 patients, 6 dermatologists | Chronic dermatological conditions | Multiple chronic conditions | Clinical utility, remote monitoring, severity assessment | Dermatologists |
| DAO_O 2022 | Prospective, longitudinal | 117 patients (127 enrolled, 10 excluded) | Primary care referrals | Multiple conditions | Referral adequacy, malignancy detection | PCPs |
| DAO_PH 2022 | Prospective, longitudinal | 131 patients | Primary care referrals | Multiple conditions | Diagnostic accuracy, referral adequacy | PCPs + dermatologists |
| IDEI 2023 | Prospective + retrospective | 202 patients | Pigmented lesions + alopecia | Melanoma suspicion, androgenetic alopecia | Diagnostic accuracy, malignancy detection, severity assessment | Dermatologists |
| MC_EVCDAO 2019 | Prospective, cross-sectional | 105 patients | Melanoma-suspected lesions | Melanoma | Malignancy detection | Dermatologists |
| PH 2024 | Prospective, cross-sectional | 30 images, 9 PCPs | Multiple conditions | Multiple conditions | Diagnostic accuracy, remote consultation | PCPs |
| SAN 2024 | Prospective, cross-sectional | 29 images, 16 practitioners | Multiple conditions | Multiple conditions | Diagnostic accuracy, remote consultation | PCPs + dermatologists |

Total: 9 studies (8 with frozen MDR version + 1 with legacy device), 800+ patients, 60+ practitioners.
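
Before the totals are quoted to BSI, it may help to reproduce them from the portfolio itself. The sketch below is a minimal tally, assuming the N column is transcribed as shown in the study portfolio and that image-based studies are counted separately from patient-based ones; the tuple layout and the `tally` helper are illustrative, not part of any existing QMS tooling.

```python
# Rows transcribed from the study portfolio (section 3.1).
# Fields: study, N, unit of N, practitioner count (None where the table gives none).
STUDIES = [
    ("AIHS4 2025", 2, "patients", None),
    ("BI 2024", 100, "images", 15),
    ("COVIDX 2022", 160, "patients", 6),
    ("DAO_O 2022", 117, "patients", None),
    ("DAO_PH 2022", 131, "patients", None),
    ("IDEI 2023", 202, "patients", None),
    ("MC_EVCDAO 2019", 105, "patients", None),
    ("PH 2024", 30, "images", 9),
    ("SAN 2024", 29, "images", 16),
]


def tally(studies):
    """Sum patients, images, and listed practitioners across the portfolio."""
    totals = {"patients": 0, "images": 0, "practitioners": 0}
    for _name, n, unit, practitioners in studies:
        totals[unit] += n
        if practitioners is not None:
            totals["practitioners"] += practitioners
    return totals


if __name__ == "__main__":
    print(tally(STUDIES))
```

On this reading the portfolio lists 717 patients and 159 images (876 combined), and practitioner counts for only four studies (46 in total), so the "800+ patients" and "60+ practitioners" figures should be traced to their source before they are reused in the response.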

3.2. Evidence hierarchy assessment (MDCG 2020-6)

The CEP's evidence hierarchy table (lines 692–710) needs correction. The table below compares what the CEP currently states with the evidence actually available:

| Rank | Evidence type | CEP says | Actual status |
|---|---|---|---|
| 1 | High quality CIs covering all variants | Yes | Yes: 8 pivotal studies |
| 5 | Equivalence data | No | Should be Yes: equivalence claimed with legacy device, full access to design data |
| 6 | SotA evaluation | Yes | Yes: 64 articles in R-TF-015-011 |
| 7 | Complaints and vigilance data | No | Should be Yes: 7 non-serious incidents documented in PSUR/PMS Report |
| 8 | Proactive PMS data (surveys) | No | Should be Yes: COVIDX included CUS/DUQ/SUS questionnaires; PMCF surveys conducted |

Critical inconsistency

The CEP explicitly marks equivalence data, vigilance data, and PMS survey data as "Not used" in the evidence hierarchy, while the CER simultaneously claims equivalence with the legacy device and references its market experience. This inconsistency must be corrected in both documents.

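3.3. Appraisal quality scores (CER lines 722–734)

| Study | Relevance (/6) | Quality (/4) | Weight (/10) | Level of evidence (/10) |
|---|---|---|---|---|
| MC_EVCDAO 2019 | 0.5 | 3.5 | 6.5 | 5 |
| AIHS4 2025 | 0.5 | 3.5 | 8.5 | 5 |
| BI 2024 | 0.5 | 3.5 | 8.5 | 6 |
| COVIDX 2022 | 0.5 | 2.5 | 6.5 | 5 |
| DAO_O 2022 | 0.5 | 3.5 | 9.5 | 5 |
| DAO_PH 2022 | 0.5 | 3.5 | 9.5 | 5 |
| IDEI 2023 | 0.5 | 3.5 | 8.5 | 5 |
| PH 2024 | 0.5 | 3.5 | 8.5 | 5 |
| SAN 2024 | 0.5 | 3.5 | 8.5 | 5 |
| Mean | n/a | n/a | 8.3 | 5.1 |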

Mean weight 8.3/10 is strong. Level of evidence 5/10 reflects primarily observational designs (no RCTs), which is standard for SaMD diagnostic aids.
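
The portfolio means can be re-derived from the per-study scores as a quick check. The sketch below assumes the Weight and Level of evidence columns read as transcribed in the dictionary (weights ending in .5, integer levels); `APPRAISAL` and `portfolio_means` are illustrative names.

```python
# (weight /10, level of evidence /10) per study, as read from the appraisal table.
APPRAISAL = {
    "MC_EVCDAO 2019": (6.5, 5),
    "AIHS4 2025": (8.5, 5),
    "BI 2024": (8.5, 6),
    "COVIDX 2022": (6.5, 5),
    "DAO_O 2022": (9.5, 5),
    "DAO_PH 2022": (9.5, 5),
    "IDEI 2023": (8.5, 5),
    "PH 2024": (8.5, 5),
    "SAN 2024": (8.5, 5),
}


def portfolio_means(appraisal):
    """Return (mean weight, mean level of evidence), each rounded to one decimal."""
    n = len(appraisal)
    mean_weight = sum(weight for weight, _ in appraisal.values()) / n
    mean_level = sum(level for _, level in appraisal.values()) / n
    return round(mean_weight, 1), round(mean_level, 1)


def lowest_weight(appraisal):
    """Return the smallest weight in the portfolio."""
    return min(weight for weight, _ in appraisal.values())


if __name__ == "__main__":
    print(portfolio_means(APPRAISAL), lowest_weight(APPRAISAL))
```

Under that reading the computed means match the stated 8.3 and 5.1, and the lowest weight in the portfolio is 6.5.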

4. Coverage analysis

4.1. Clinical benefit coverage

Mapping the 7 claimed clinical benefits to supporting studies:

| Benefit | Code | Supporting studies | Coverage assessment |
|---|---|---|---|
| Diagnostic accuracy for multiple conditions | 7GH | BI 2024, DAO_PH 2022, IDEI 2023, SAN 2024, PH 2024 | Strong: 5 studies, multiple user groups, roughly 490 subjects (333 patients plus 159 images) |
| Reduce waiting times | 3KX | DAO_O 2022, DAO_PH 2022, COVIDX 2022 | Moderate: 3 studies; operational impact (actual waiting time reduction) not directly measured, inferred from referral adequacy |
| Referral precision | 8PL | DAO_O 2022, DAO_PH 2022 | Moderate: 2 studies with 248 patients in primary care settings |
| Malignancy detection (skin cancer) | 1QF | MC_EVCDAO 2019, IDEI 2023, DAO_O 2022, DAO_PH 2022 | Strong: 4 studies, 555 patients, includes melanoma-specific cohort |
| Rare disease diagnosis | 9VW | BI 2024, SAN 2024, PH 2024 | Moderate: acceptance criteria defined as "improvement in rare conditions"; coverage depends on how "rare" is defined across study populations |
| Objective severity assessment | 5RB | AIHS4 2025, COVIDX 2022, IDEI 2023 | Weak-to-moderate: AIHS4 has only 2 patients (16 assessments); COVIDX uses CUS rather than direct severity measurement; IDEI covers androgenetic alopecia severity. Gap identified in CER (Gap 2) for atopic dermatitis, acne, and FFA |
| Remote care | 0ZC | COVIDX 2022, PH 2024, SAN 2024 | Moderate: COVIDX was conducted remotely; PH/SAN assessed remote consultation feasibility |

Key weakness: Benefit 5RB (severity assessment) has the thinnest evidence base. AIHS4 with 2 patients is extremely small, and the CER itself acknowledges this as Gap 2 for PMCF.
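
The per-benefit subject counts can be derived mechanically from the benefit mapping and the study portfolio. The sketch below is one way to do that, assuming each supporting study contributes its full N to every benefit it is listed under (an upper bound, since for example only part of IDEI concerns malignancy); the dictionaries and helper names are illustrative.

```python
# Per-study N (patients or images) from the study portfolio.
STUDY_N = {
    "AIHS4 2025": 2, "BI 2024": 100, "COVIDX 2022": 160,
    "DAO_O 2022": 117, "DAO_PH 2022": 131, "IDEI 2023": 202,
    "MC_EVCDAO 2019": 105, "PH 2024": 30, "SAN 2024": 29,
}

# Benefit code -> supporting studies, transcribed from the benefit mapping.
BENEFIT_STUDIES = {
    "7GH": ["BI 2024", "DAO_PH 2022", "IDEI 2023", "SAN 2024", "PH 2024"],
    "3KX": ["DAO_O 2022", "DAO_PH 2022", "COVIDX 2022"],
    "8PL": ["DAO_O 2022", "DAO_PH 2022"],
    "1QF": ["MC_EVCDAO 2019", "IDEI 2023", "DAO_O 2022", "DAO_PH 2022"],
    "9VW": ["BI 2024", "SAN 2024", "PH 2024"],
    "5RB": ["AIHS4 2025", "COVIDX 2022", "IDEI 2023"],
    "0ZC": ["COVIDX 2022", "PH 2024", "SAN 2024"],
}


def subjects_per_benefit(benefit_studies, study_n):
    """Upper-bound subject count per benefit (full study N counted each time)."""
    return {
        code: sum(study_n[study] for study in studies)
        for code, studies in benefit_studies.items()
    }


def smallest_study(code, benefit_studies, study_n):
    """Return (study, N) for the smallest study supporting a benefit."""
    name = min(benefit_studies[code], key=study_n.get)
    return name, study_n[name]


if __name__ == "__main__":
    print(subjects_per_benefit(BENEFIT_STUDIES, STUDY_N))
    print(smallest_study("5RB", BENEFIT_STUDIES, STUDY_N))
```

A table generated this way gives the CER an auditable starting point for the per-benefit sufficiency conclusions; the counts still need manual adjustment wherever only a subset of a study's subjects is relevant to the benefit.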

4.2. Indication coverage

The device covers ICD-11 Chapter 14 skin conditions. Key condition groups and their study coverage:

| Condition category | Studies providing evidence | N (approx.) | Assessment |
|---|---|---|---|
| Melanoma / malignant lesions | MC_EVCDAO, IDEI, DAO_O, DAO_PH | 400+ | Good |
| Pigmented lesions (benign) | MC_EVCDAO, IDEI | 200+ | Good |
| Psoriasis | COVIDX | Part of 160 | Limited: single study |
| Acne | COVIDX | Part of 160 | Limited: single study; Gap 2 |
| Atopic dermatitis | COVIDX | Part of 160 | Limited: single study; Gap 2 |
| Hidradenitis suppurativa | AIHS4, BI | 2 + images | Weak: AIHS4 has 2 patients |
| GPP (Generalised Pustular Psoriasis) | BI | Image-based | Limited: single study, image assessment only |
| Androgenetic alopecia | IDEI | 96 | Moderate: single study but adequate N |
| Urticaria | COVIDX (PMS data) | n/a | Minimal: mentioned in usage patterns only |
| Other rare conditions | BI, SAN, PH | Image sets | Variable: depends on condition |

Key weakness: The device claims coverage of all ICD-11 Chapter 14 conditions but most individual conditions (beyond melanoma and pigmented lesions) are covered by only 1–2 studies. The CER must either justify why limited per-condition coverage is acceptable (uniform algorithm architecture argument) or narrow the claims.

4.3. Patient population coverage

| Demographic factor | Available data | Gap |
|---|---|---|
| Age | Studies specify "adult patients (≥18)" but no age distribution breakdown provided | Need to compile available age ranges from study data |
| Sex | Not reported per study | GDPR data minimisation limits collection; must be justified |
| Fitzpatrick skin type | Some studies have data (confirmed by user); need to identify which and compile | Critical for AI dermatology; must present whatever data exists |
| Geographic diversity | Studies conducted in Spain (Basque Country, Madrid, other regions) | Limited geographic diversity; must justify representativeness |
| Comorbidities | Not systematically reported | Standard for SaMD observational studies; justify |

4.4. User group coverage

| User group | Studies | N practitioners | Assessment |
|---|---|---|---|
| Primary care physicians (PCPs) | DAO_O, DAO_PH, BI, PH, SAN | 30+ | Good |
| Dermatologists | MC_EVCDAO, IDEI, COVIDX, BI, SAN | 30+ | Good |
| IT professionals (deployment) | None | 0 | Not applicable: IT professionals deploy the device, they don't generate clinical data |

4.5. Safety coverage

| Safety aspect | Evidence | Assessment |
|---|---|---|
| Adverse events in CIs | 0 across all 9 studies | Strong: consistent "no adverse events" across 800+ patients |
| Device deficiencies in CIs | 0 reported | Strong |
| Legacy market experience | 7 non-serious incidents, 0 serious, 0 FSCAs (4+ years, 4,500+ reports) | Strong, but NOT included in CER (see Item 3a) |
| Vigilance database search | EUDAMED/MAUDE searches referenced | Need to confirm this is documented |
| Similar device safety | SotA identified no direct patient harm from similar devices | Adequate |
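
A zero-event safety record is easier to defend when it is stated as an upper confidence bound rather than as "no adverse events". The sketch below computes the exact one-sided upper bound on the per-subject event rate when zero events are observed, assuming independent subjects; `n=800` stands in for the "800+ patients" figure and is an assumption, as is the 95% confidence level.

```python
def zero_event_upper_bound(n, confidence=0.95):
    """Exact one-sided upper bound on the event rate after 0 events in n subjects.

    Solves (1 - p) ** n = 1 - confidence for p, which is the
    Clopper-Pearson upper limit in the zero-event case.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of subjects")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1 - (1 - confidence) ** (1 / n)


def rule_of_three(n):
    """Common approximation to the 95% bound: 3 / n."""
    return 3 / n


if __name__ == "__main__":
    n = 800  # assumed subject count, standing in for "800+ patients"
    print(f"exact bound: {zero_event_upper_bound(n):.4%}")
    print(f"rule of three: {rule_of_three(n):.4%}")
```

With 800 subjects the bound is a little under 0.4%, which is the kind of figure the response could set against the device's risk profile; the same function can be rerun with the patient count that is finally confirmed.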

5. Gap analysis specific to sufficiency

| # | Sufficiency dimension | What we have | What's missing for BSI | Priority |
|---|---|---|---|---|
| 1 | Benefit-to-study mapping | 7 benefits, 9 studies; mapping is implicit in filter criteria code | Explicit narrative in CER mapping each benefit to its supporting studies, with per-benefit sufficiency conclusion | High |
| 2 | Indication coverage justification | Studies cover melanoma, pigmented lesions, multiple chronic conditions, HS, GPP, alopecia | Explanation of how 9 studies covering ~15 conditions justify claims across all ICD-11 Ch.14 (~346 conditions). The uniform algorithm architecture argument needs to be made explicit | High |
| 3 | Population demographics | "Over 800 patients" with no demographic breakdown | Compile Fitzpatrick data from studies that have it; present available age/sex data; justify gaps via GDPR and study design | High |
| 4 | Per-study sample size justification | Formal calculations exist in CIPs (80% power, alpha 0.05 for IDEI; melanoma ratio for MC_EVCDAO; target sample for others) | CER must summarise the sample size rationale for each study, not just claim "over 800 patients" | Medium |
| 5 | Evidence hierarchy correction | CEP table marks equivalence, vigilance, and PMS as "Not used" | Correct the table to reflect actual data used; align with CER narrative | High |
| 6 | Quality methodology | Studies appraised with mean weight 8.3/10 | CER needs a brief discussion of why observational Level 5 evidence is appropriate for SaMD (no surgical intervention, no randomisation needed for diagnostic accuracy studies) | Medium |
| 7 | SotA comparison narrative | acceptanceCriteriaStateOfTheArtValue exists per claim | CER lacks an explicit "device vs SotA" comparison section with aggregate conclusions. Individual claim-level comparisons exist but no synthesis | High |
| 8 | Severity assessment evidence weakness | AIHS4 has 2 patients; acknowledged as Gap 2 | Must explicitly acknowledge this limitation and justify that PMCF activities will address it; argue that current evidence is sufficient for initial CE mark with planned post-market data collection | Medium |

6. Cross-NC connections

Item 3a: Clinical data analysis

Item 3a research covers the factual gaps (missing PMS data, CI regulatory details, acceptance criteria reconciliation, etc.). Item 3b builds on those findings to construct the sufficiency argument. The fixes are coordinated:

  • Item 3a Fix 1 (integrate PMS data) → feeds into Item 3b's safety sufficiency argument
  • Item 3a Fix 3 (acceptance criteria reconciliation) → feeds into Item 3b's performance sufficiency argument
  • Item 3a Fix 4 (data pooling methodology) → feeds into Item 3b's quantity justification

Item 2b: Clinical benefits, performance, safety vs SotA

Item 2b research addresses the SotA traceability chain. Item 3b's gap #7 (SotA comparison narrative) depends on the same fix: establishing provenance from SotA articles → baselines → acceptance criteria → achieved values.

Technical Review M1.Q1: IFU performance claims

M1.Q1 research shares the concern about whether all IFU claims are backed by sufficient evidence, and the 239 vs 346 ICD-11 category reconciliation.

7. Response strategy

Structure of the sufficiency justification

The response should present a structured sufficiency argument organised along the three dimensions BSI expects:

A. Quantity of data

  1. Total evidence base: 9 studies, 800+ patients, 60+ practitioners, 4+ years of market experience with legacy device
  2. Per-study sample size: Summarise each study's sample size calculation, target, and actual enrollment
  3. Per-benefit evidence: Map each of the 7 clinical benefits to supporting studies and total subjects contributing
  4. Statistical power: State the design power for each study where the CIP contains a formal calculation (80% power at alpha 0.05 is documented for IDEI; MC_EVCDAO uses a melanoma ratio and the remaining studies a target sample, per gap #4); AIHS4 relies on a repeated measures design rather than a power calculation
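
When summarising the per-study sample size rationale, a worked calculation makes the "80% power at alpha 0.05" statement concrete. The sketch below uses the normal-approximation formula for a one-sided test of a single proportion against a performance goal. The 80% goal is quoted from the CER; the expected true performance of 0.90, the one-sided test, and the formula itself are assumptions for illustration and are not taken from the CIPs.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_vs_goal(p0, p1, alpha=0.05, power=0.80):
    """Subjects needed to show performance p1 exceeds goal p0.

    One-sided, one-sample proportion test, normal approximation:
    n = ((z_alpha * sqrt(p0 * q0) + z_beta * sqrt(p1 * q1)) / (p1 - p0)) ** 2
    """
    if not 0 < p0 < p1 < 1:
        raise ValueError("require 0 < p0 < p1 < 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


if __name__ == "__main__":
    # Goal 0.80 (from the CER); expected performance 0.90 is hypothetical.
    print(sample_size_vs_goal(p0=0.80, p1=0.90))
```

Under these assumptions the requirement is 83 subjects, and it rises quickly as the expected margin over the goal shrinks, which is why each study's assumed effect size should be stated alongside its target N.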

B. Quality of data

  1. Study designs: All prospective or mixed prospective/retrospective except AIHS4 (retrospective, longitudinal); observational designs appropriate for SaMD diagnostic accuracy (no surgical intervention; reference standard available)
  2. Appraisal scores: Mean weight 8.3/10 across the portfolio; no study below 6.5/10
  3. Level of evidence: Level 5 (observational) is appropriate for SaMD — cite MDCG 2020-1 (clinical evaluation of MDSW) which acknowledges that RCTs may not be appropriate or feasible for SaMD
  4. Data quality controls: DIQA algorithm validates image quality in real-time; this mirrors real-world use because the device itself rejects poor quality images

C. Coverage

  1. Clinical benefits: Table mapping 7 benefits → studies → subjects → sufficiency conclusion
  2. Indications: Justify coverage through the uniform algorithm architecture argument — the device processes all skin images through the same pipeline; condition-specific performance is validated for the highest-risk conditions (melanoma, malignant lesions) and representative chronic conditions; full ICD-11 coverage is monitored through PMCF
  3. Patient populations: Present available demographic data (Fitzpatrick from studies that have it; age ranges; geographic distribution); justify gaps via GDPR data minimisation and argue that skin condition diagnosis is less demographically sensitive than pharmacological interventions
  4. User groups: PCPs and dermatologists both well-represented across multiple studies
  5. Safety: Zero adverse events across 800+ patients in CIs + zero serious incidents across 4,500+ reports in market use; discuss why this is sufficient given the device's risk profile (SaMD, human-in-the-loop, no direct patient contact)
  6. Comparison to SotA: Device performance meets or exceeds SotA baselines derived from 64 articles; present the comparison at the clinical benefit level, not just the individual claim level

Fixes required in the CER

Fix 1: Add a "Sufficiency of clinical evidence" section

New section in the CER containing:

  • The benefit-to-study mapping table
  • The indication coverage analysis with justification
  • Available demographic data and gap justification
  • Per-study sample size summary
  • Aggregate safety conclusion incorporating both CI data and legacy PMS data

Fix 2: Correct the MDCG 2020-6 evidence hierarchy table

In the CEP (R-TF-015-001, lines 692–710):

  • Change Rank 5 (equivalence) from "No" to "Yes"; reference the equivalence assessment
  • Change Rank 7 (complaints/vigilance) from "No" to "Yes"; reference PSUR/PMS Report
  • Change Rank 8 (proactive PMS/surveys) from "No" to "Yes"; reference COVIDX CUS/DUQ/SUS questionnaires and PMCF surveys

Fix 3: Add device vs SotA synthesis

The CER currently presents individual performance claims with SotA values but no synthesis. Add a section that:

  • Groups performance by clinical benefit
  • Compares aggregate device performance vs SotA baselines
  • Draws per-benefit conclusions on whether the device meets, exceeds, or falls below SotA
  • Acknowledges limitations and how PMCF addresses them
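
The grouping and per-benefit conclusion described in this fix could be sketched as follows. The `Claim` structure, the roll-up rule (a benefit is "below" if any claim is below SotA), the assumption that higher values are better, and every numeric value are hypothetical placeholders; only the idea of a per-claim SotA value (the `acceptanceCriteriaStateOfTheArtValue` field) comes from the existing material.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    benefit_code: str
    name: str
    sota_value: float      # stands in for acceptanceCriteriaStateOfTheArtValue
    achieved_value: float  # device result; higher is assumed to be better


def classify(claim, tolerance=0.0):
    """Compare one claim against its SotA baseline."""
    diff = claim.achieved_value - claim.sota_value
    if diff > tolerance:
        return "exceeds"
    if diff < -tolerance:
        return "below"
    return "meets"


def synthesise(claims):
    """Roll claim-level verdicts up to one conclusion per clinical benefit."""
    verdicts = defaultdict(list)
    for claim in claims:
        verdicts[claim.benefit_code].append(classify(claim))
    summary = {}
    for code, results in verdicts.items():
        if "below" in results:
            summary[code] = "below"
        elif all(result == "exceeds" for result in results):
            summary[code] = "exceeds"
        else:
            summary[code] = "meets"
    return summary


# Placeholder values for illustration only; not study results.
EXAMPLE_CLAIMS = [
    Claim("7GH", "placeholder claim A", 0.70, 0.75),
    Claim("7GH", "placeholder claim B", 0.80, 0.80),
    Claim("1QF", "placeholder claim C", 0.85, 0.90),
    Claim("5RB", "placeholder claim D", 0.60, 0.55),
]

if __name__ == "__main__":
    print(synthesise(EXAMPLE_CLAIMS))
```

A conservative roll-up like this keeps a single under-performing claim visible at the benefit level, which is where the limitations and PMCF commitments would then be discussed.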

Fix 4: Acknowledge and justify evidence limitations

Proactively address known weaknesses:

  • AIHS4 small sample (2 patients) — justified by repeated measures design; Gap 2 in PMCF
  • Limited per-condition coverage beyond melanoma — justified by uniform architecture; PMCF monitoring
  • Limited geographic diversity (Spain only) — justified by skin condition universality; planned international PMCF
  • Observational designs only — justified by MDCG 2020-1 guidance on MDSW evidence requirements

8. Risk assessment

| Risk | Impact | Mitigation |
|---|---|---|
| BSI concludes evidence is insufficient for all claimed indications | Could require narrowing claims to only validated conditions, which would impact IFU and intended purpose | Present the uniform architecture argument clearly; show that high-risk conditions (melanoma) have strongest coverage; acknowledge monitoring gaps addressed by PMCF |
| AIHS4's 2-patient study undermines severity assessment benefit | BSI may require additional pre-market data for severity claims | Frame as "initial validation with confirmatory PMCF" per MDCG 2020-7; emphasise that COVIDX provides additional severity data for chronic conditions |
| Demographic coverage gaps (no age/sex breakdown) undermine population claim | BSI may question whether results generalise across demographics | Compile Fitzpatrick data from studies that have it; present geographic diversity of study sites; cite GDPR data minimisation as legitimate constraint |
| Evidence hierarchy inconsistency triggers a secondary finding | Could generate a new NC about CEP quality | Fix the table proactively in both CEP and CER before responding |

9. Open items

Most open items for Item 3b are the same as Item 3a (see question-for-jordi.mdx). One additional item:

  1. Which studies have Fitzpatrick data? — User confirmed some studies have Fitzpatrick skin type data. Need to identify which ones and compile the data for the population coverage analysis. This may require reading each study's CIR in detail.
All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, neither by himself nor by third parties, through whichever means, without obtaining the prior written permission of Legit.Health (AI Labs Group S.L.)