Research and planning
This document is for internal use only. It contains analysis, gap identification, and response strategy for Item 3a of the BSI Clinical Review Round 1. It will not be included in the final response to BSI.
1. What BSI is asking
Item 3a says: "Please address all points above. Please ensure all relevant clinical data is identified and provide sufficient analysis (including traceability, details, discussion and justifications)."
The "points above" are the detailed observations in the Item 3 index, which span five areas. BSI is essentially saying: the CER does not provide a standalone, self-contained clinical evaluation — the reviewer could not follow the analysis, verify traceability, or confirm that all clinical data has been adequately assessed. This is a deficiency finding under Annex XIV and Article 61.
The core regulatory concern is Article 61(1): the manufacturer must specify and justify the level of clinical evidence necessary to demonstrate conformity with the relevant GSPRs, and that level must be appropriate to the device's characteristics and intended purpose.
BSI's six observation areas (mapped to regulatory requirements)
| # | Area | Key concern | Regulatory basis |
|---|---|---|---|
| 1 | Overall analysis | Clinical benefits/performance docs hard to follow; links broken; data pooling unclear; unmet acceptance criteria not discussed | Annex XIV 1(a), 2 |
| 2 | Clinical investigations (MC_DAO, IDEI) | Missing regulatory details: competent authority communication, registration, publication status, protocol deviations | Annex XIV (b), (c) |
| 3 | Evidence sufficiency | Population coverage, traceability to outcomes, methodology justifications, sample sizes | Annex XIV 2, Article 61(1) |
| 4 | Equivalence | High-level assessment; unclear what changed since MDD; contradictory statements about "improvements" | Annex XIV 3 |
| 5 | Clinical literature | No subject-device articles found in literature; unclear if SotA protocol applies to device literature search | Annex XIV (e), Article 2(48) |
| 6 | PMS data | Legacy device marketed since 2020 with 21 contracts and 4,500 reports but no market data in CER | Article 2(48), (51) |
2. What BSI reviewed
- R-TF-015-003 Clinical Evaluation Report (CER)
- R-TF-015-001 Clinical Evaluation Plan (CEP)
- R-TF-015-011 State of the Art
- Clinical investigation reports (R-TF-015-006 series)
- R-TF-007-003 PSUR
- R-TF-007-004 PMS Report
- The "Clinical Benefits" and "Performance Claims" interactive components (rendered in the QMS site)
3. Relevant QMS documents and findings
3.1. CER — Commercialisation status (lines 427–441)
Line 429: "This product has not been commercialized yet. It is undergoing initial CE mark." Line 441: "The legacy device has been commercialized since 2020."
These two statements are not contradictory — they refer to different regulatory entities:
- "This product" = Legit.Health Plus (MDR version, not yet CE-marked under MDR)
- "The legacy device" = Legit.Health (MDD version, on market since 2020)
However, BSI reads the CER as a single document about a single device, so the distinction is confusing. The CER must make this clearer and, critically, must integrate the legacy market experience data into the clinical evaluation rather than treating it as separate.
3.2. CER — PMS data gap (lines 651–662)
Lines 655–656: "Once on the market...the manufacturer will implement a proactive PMS process." (Future tense only.)
Line 660: "Since this clinical evaluation is performed for the initial CE-mark submission...there are currently no retrospective PMCF data."
This is the critical gap. The CER treats PMS as future-only, completely ignoring the legacy device's 4+ years of market experience. But the data exists:
- R-TF-007-003 PSUR (lines 62–98): Documents 21 contracts, 4,500+ reports, 500+ practitioners, 1,000+ patients. Reports 7 non-serious incidents in 2023:
- 4 customer complaints (API deserialization, timeout, algorithm performance mismatch × 2)
- 3 internal non-conformities (image zoom bias, benign pigmentation scoring, misclassification)
- Zero serious incidents, zero FSCAs
- R-TF-007-004 PMS Report (lines 69–98): Confirms zero serious incidents, zero FSCAs. Documents 6 customer complaints (4 classified as non-serious incidents), trend analysis, and CAPA actions.
Root cause: The CER was drafted as if the device had no market history, ignoring that equivalence is claimed with the legacy device. Under Article 2(48), clinical data includes "safety or performance information generated from PMS" — the legacy PMS data is clinical data that must be analysed.
3.3. Clinical investigations — Regulatory details
MC_EVCDAO_2019 (mc-evcdao-2019/r-tf-015-006.mdx)
| Detail | Status | Location |
|---|---|---|
| Ethics committee approval | Exists: CEIM approval February 10, 2020 | CIP r-tf-015-004.mdx, embedded approval PDFs |
| Competent authority (AEMPS) | Planned but not documented: CIP states CIR will be provided to AEMPS, but no record of actual communication | CIP line 191 |
| ClinicalTrials.gov registration | Not found anywhere in documentation | — |
| Publication status | Not found — no mention of whether the study was published in a journal | — |
| Protocol deviations | Partially documented: CIR states "no adverse events/deviations" but separately notes secondary objective (GP comparison) was abandoned due to recruitment difficulties | CIR line 417; CIP line 98 |
| Sample size discrepancy (200 → 105) | Documented: Originally 200 planned with 40 melanoma cases; study closed at 105 subjects with 36 melanoma (34.29%). Justification: exceeded melanoma ratio target; impact of low-quality data compensated by DIQA exclusion | CIR line 653 |
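For the CER narrative, the arithmetic behind that justification is worth stating explicitly — the 20% melanoma ratio target follows from the originally planned 40 cases out of 200 subjects:

$$
\text{planned ratio} = \frac{40}{200} = 20\%, \qquad \text{achieved ratio} = \frac{36}{105} \approx 34.29\% > 20\%
$$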
IDEI_2023 (idei-2023/r-tf-015-006.mdx)
| Detail | Status | Location |
|---|---|---|
| Ethics committee approval | Not explicitly documented in the CIR — compliance statement references regulatory adherence but no specific approval reference found | CIR line 55 |
| Competent authority | Not found | — |
| ClinicalTrials.gov registration | Partially found: CER line 740 mentions 2 records in ClinicalTrials.gov for IDEI_2023 and COVIDX_2022, but no NCT numbers provided | CER lines 685, 740 |
| Publication status | Not found | — |
| Protocol deviations | Not found | — |
| Sample size | 202 patients recruited (108 pigmented lesions + 96 androgenetic alopecia) — appears adequate for stated objectives | CIR lines 142–147 |
3.4. Acceptance criteria — Met vs. not met
BSI says: "Some acceptance criteria have not been met and further analysis/justification has not been performed."
From the CER and study reports:
- MC_EVCDAO_2019: All primary acceptance criteria met (AUC 0.8482 > 0.8 threshold; sensitivity 0.7379 and specificity 0.8054)
- IDEI_2023: AUC 0.7338 (95% CI 0.5971–0.8554) for malignancy detection from retrospective images — the point estimate falls below the 0.8 threshold applied in MC_EVCDAO_2019, though the upper bound of the confidence interval crosses it
- All 8 pivotal studies: CER lines 956–966 state "all safety objectives...have been met"
Gap: The CER does not contain a per-study acceptance criteria reconciliation table showing which criteria were met and which were not, with justifications for any shortfalls. BSI needs this analysis presented explicitly. The <AcceptanceCriteriaTable> component renders this data but BSI noted the "links to CIs do not work" and the presentation is "difficult to follow."
The CER was delivered to BSI as a PDF export. The interactive React components (Clinical Benefits, Performance Claims, AcceptanceCriteriaTable) did not render interactively — links were broken and dynamic tables may have been incomplete or illegible. This is a confirmed document delivery problem that likely accounts for a significant portion of BSI's "difficult to follow" and "links do not work" observations.
Action required: All component-rendered data must be presented as static tables in the CER text, not delegated to React components, so the PDF export is self-contained and readable.
3.5. Data pooling and the globalValueOfDevice
BSI says: "It is unclear how/why data has been pooled or what the categories represent."
The globalValueOfDevice computation (weighted average: Σ(achievedValue × sampleSize) / Σ(sampleSize)) exists in packages/ui/src/components/PerformanceClaimsAndClinicalBenefits/types.ts but is not documented anywhere in the CER or CEP. The CER must explain:
- Why data is pooled (to derive a global performance estimate across heterogeneous study populations)
- How pooling is done (weighted average by sample size, grouped by indication + user group + domain + metric + magnitude + performance subject)
- Limitations of this approach (heterogeneity across study designs, populations, and settings)
This data pooling gap is also flagged in Item 2b as root cause #2: "Data pooling formula undocumented in regulatory documents." The fix must be coordinated — one description of the pooling methodology in the CER, referenced from both the clinical benefits analysis and the performance claims analysis.
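To support Fix 4 below, a minimal TypeScript sketch of the pooling described above — the record shape and grouping key are illustrative assumptions for this note, not the actual definitions in types.ts:

```typescript
// Illustrative sketch of the globalValueOfDevice pooling described above.
// Field names are assumptions for this note; the authoritative definitions
// live in packages/ui/src/components/PerformanceClaimsAndClinicalBenefits/types.ts.
interface StudyResult {
  indication: string;
  userGroup: string;
  domain: string;
  metric: string;
  magnitude: string;
  performanceSubject: string;
  achievedValue: number; // e.g. AUC or sensitivity reported by one study
  sampleSize: number; // subjects contributing to that value
}

// Group by the six pooling dimensions, then take the sample-size-weighted
// average within each group: Σ(achievedValue × sampleSize) / Σ(sampleSize).
function globalValueOfDevice(results: StudyResult[]): Map<string, number> {
  const groups = new Map<string, { weighted: number; n: number }>();
  for (const r of results) {
    const key = [r.indication, r.userGroup, r.domain, r.metric, r.magnitude, r.performanceSubject].join('|');
    const g = groups.get(key) ?? { weighted: 0, n: 0 };
    g.weighted += r.achievedValue * r.sampleSize;
    g.n += r.sampleSize;
    groups.set(key, g);
  }
  const pooled = new Map<string, number>();
  for (const [key, g] of groups) pooled.set(key, g.weighted / g.n);
  return pooled;
}
```

Whatever the actual implementation looks like, the CER only needs to state the formula, the grouping dimensions, and the limitation that weighting by sample size does not correct for heterogeneity across study designs.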
3.6. Clinical literature search
BSI says: "§16.4.4 of the CER seems to state that there are no relevant articles identified on the subject device in the literature."
What the CER actually says (lines 740–744):
- 12 articles found from PubMed (10) and Google Scholar (2) about the device
- All 12 were excluded because they were "proprietary (internal) company articles describing preclinical (in-silico) and non-clinical results"
- ClinicalTrials.gov yielded 2 records (IDEI_2023 and COVIDX_2022) — these are already counted as pre-market CIs
BSI's concern: Excluding all 12 articles means the clinical literature search found zero usable articles about the device. BSI questions whether the SotA search protocol (with its PICO framework and appraisal methodology) was also applied to the device-specific literature search, or whether different methods were used.
Gap: The CER does not clearly distinguish:
- The SotA literature search (227 records → 64 included, about similar/alternative devices and clinical practice) — documented in R-TF-015-011
- The device-specific literature search (15 records → 0 included, about the subject device) — briefly mentioned in CER lines 678–744
The methodology, keywords, and appraisal criteria for the device-specific search should be explicitly described, and any differences from the SotA protocol should be justified.
3.7. Equivalence assessment
BSI says: "§16.24 of the CER seems to state that improvements have been made. It is unclear what these changes are."
CER line 563: "The improvements introduced in Legit.Health Plus — mainly related to software version stabilisation and the consolidation of features..."
CER line 433: "All differences between the two versions are solely documentary..."
These are contradictory: "solely documentary" vs "software version stabilisation and consolidation of features." BSI rightly flags this. The CER must either:
- Clarify that the changes are purely documentary (remove the "improvements" language), or
- List what specifically changed and justify that the changes do not impact clinical safety/performance
The equivalence tables (lines 498–557) use "Same" for almost every row, which BSI reads as superficial. More detail is needed on:
- What specific software changes were made (even if documentary)
- How these were assessed for impact on performance
- Reference to any design change records
3.8. Dermatoscopic camera concern
BSI says: "photos taken with dermatoscopic camera only (is this representative of how the device will be used)"
What the studies used:
- MC_EVCDAO_2019: DermLite Foto X dermatoscope with smartphones (Pixel 3, Galaxy S10, iPhone X)
- IDEI_2023: Mix of dermatoscopic (87.5% retrospective) and clinical (100% prospective) images
- Other studies (BI_2024, PH_2024, SAN_2024): Varied — some use clinical images only
Intended device use: The IFU states the device accepts clinical (non-dermoscopic) images. The device includes DIQA (Dermatology Image Quality Assessment) to validate image quality.
Gap: The CER does not explicitly address whether the validation studies are representative of real-world image acquisition. If most studies used dermatoscopic images but the device is intended for clinical images taken by non-specialist users, there is a representativeness concern. The CER needs a discussion of:
- Which studies used which image types
- How performance varies between dermatoscopic and clinical images
- Why the evidence is representative of intended use
3.9. Population coverage
BSI says: "How do the CIs sufficiently cover all/representative patient populations (age, pigment, sex, etc) and indications"
CER line 840: "Over 800 patients across eight pivotal studies."
Gap: The CER does not provide a demographic breakdown across studies. The PSUR notes GDPR-driven data minimisation (no demographic collection beyond clinical necessity), which is a legitimate regulatory constraint but creates a gap for BSI. The CER needs to:
- Describe what demographic data IS available from each study
- Justify any gaps in demographic coverage by reference to GDPR data minimisation principles
- Discuss Fitzpatrick skin type coverage (critical for AI dermatology devices)
- Address coverage of malignant/high-risk conditions specifically
4. Gap analysis
| # | BSI concern | What we have | What's missing | Severity |
|---|---|---|---|---|
| 1 | Clinical benefits hard to follow | React components with programmatic claim assignment | Narrative CER analysis explaining benefits, methodology, limitations; static-friendly tables | High |
| 2 | Performance claims hard to follow, broken links | 148 claims in performanceClaims.ts, dynamic tables | Same as above; PDF-exportable summary; broken link fix | High |
| 3 | Data pooling unclear | globalValueOfDevice computation in code | CER documentation of pooling methodology, justification, limitations | High |
| 4 | Unmet acceptance criteria not discussed | All criteria appear to be met (or borderline with CIs crossing thresholds) | Per-study reconciliation table with explicit met/not-met status and justifications | High |
| 5 | CI regulatory details (AEMPS, registration, publication, deviations) | Ethics approval exists for MC_DAO; IDEI has ClinicalTrials.gov entries; deviation documented in CIP | CER text explicitly stating: competent authority communications, NCT numbers, publication status, protocol deviations for each CI | High |
| 6 | CI methodology justifications | Sample size calculations in CIPs; DIQA quality validation | CER narrative on: photo quality removal rationale, dermatoscopic vs clinical images, MC_DAO 200→105 justification, sample size adequacy | High |
| 7 | Population coverage | 800+ patients across 8 studies; limited demographics due to GDPR | Demographic breakdown per study; Fitzpatrick coverage; malignant condition coverage analysis | Medium |
| 8 | Equivalence lacks detail | Equivalence tables exist; "Same" comparisons | Clarify "improvements" vs "documentary changes" contradiction; list specific changes; impact assessment | High |
| 9 | No subject-device literature | 12 articles found but excluded (preclinical/internal) | Explain why excluded articles are not clinical data; clarify protocol differences between SotA and device search | Medium |
| 10 | No PMS data in CER | PSUR and PMS Report document 7 non-serious incidents, 0 serious, 0 FSCAs | Integrate legacy PMS data into CER: complaints, incidents, trend analysis, safety conclusions | Critical |
5. Cross-NC connections
Clinical Review Item 2b — Clinical benefits, performance, safety vs SotA
Item 2b research identified overlapping gaps:
- SotA traceability: acceptanceCriteriaStateOfTheArtValue exists in data but the provenance chain to specific SotA articles is broken → same gap applies to Item 3a's "traceability to outcomes" concern
- Data pooling documentation: globalValueOfDevice undocumented → same as Item 3a gap #3
- Top-1 accuracy: Some Top-1 metrics appear below SotA baselines → may be what BSI means by "some acceptance criteria have not been met"
- Use environment: remote care sub-criterion within benefit 3KX vs intended use environment → tangentially relevant to IDEI study's teledermatology context
Clinical Review Item 3b — Data sufficiency justification
Item 3b asks for justification that "sufficient data in quantity and quality has been analyzed." The research here directly feeds into 3b:
- Sample size adequacy (800+ patients, formal calculations)
- Population representativeness
- Data quality methodology (DIQA)
Technical Review M1.Q1 — IFU performance claims
M1.Q1 research has a cross-NC connection section identifying shared issues:
- SotA baselines traceability
- Top-1 accuracy vs IFU claims
- Data pooling documentation
- 239 vs 346 ICD-11 category reconciliation
6. Response strategy
Approach: Fix-then-reference
Per the BSI NC CLAUDE.md, the workflow is: analyse → fix the documentation → write the response referencing what was fixed.
Fixes required in the CER (R-TF-015-003)
Fix 1: Integrate legacy PMS data (Critical)
Add a new subsection under "Clinical data generated from risk management and PMS activities" that:
- Summarises the legacy device's market experience (2020–2024: 21 contracts, 4,500+ reports, 500+ practitioners, 1,000+ patients)
- Lists all non-serious incidents from the PSUR/PMS Report (7 incidents in 2023)
- Confirms zero serious incidents and zero FSCAs
- Analyses trends and CAPA outcomes
- Draws safety conclusions from the market data
- References R-TF-007-003 PSUR and R-TF-007-004 PMS Report
Fix 2: Add per-CI regulatory detail table
For each clinical investigation, add explicit documentation of:
- Competent authority communication (or statement that none was required for observational studies under national law)
- Clinical trial registration status and numbers
- Publication status
- Protocol deviations (or explicit "none" statement)
- Ethics committee approval reference
Fix 3: Add acceptance criteria reconciliation
Add a per-study table showing:
- Each acceptance criterion
- The achieved value
- Met/not-met status
- Justification for any borderline or unmet criteria
- This should be presented as static content (not React components) to ensure readability in the PDF export (a data-shape sketch follows this list)
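As a drafting aid, a minimal sketch of the information each reconciliation row must carry, with a borderline rule matching the IDEI_2023 case above — all names are illustrative assumptions, not existing CER fields:

```typescript
// Illustrative shape for one row of the per-study acceptance criteria
// reconciliation table; field names are assumptions for this note.
interface ReconciliationRow {
  study: string; // e.g. "MC_EVCDAO_2019"
  criterion: string; // e.g. "AUC > 0.8 for malignancy detection"
  threshold: number;
  achievedValue: number;
  ci95?: [number, number]; // 95% confidence interval, where reported
  justification?: string; // mandatory for borderline or unmet criteria
}

// A criterion whose point estimate misses the threshold but whose CI upper
// bound crosses it (e.g. IDEI_2023: AUC 0.7338, CI 0.5971–0.8554 vs 0.8)
// is flagged "borderline" and must carry an explicit justification.
function status(row: ReconciliationRow): 'met' | 'borderline' | 'not met' {
  if (row.achievedValue >= row.threshold) return 'met';
  if (row.ci95 && row.ci95[1] >= row.threshold) return 'borderline';
  return 'not met';
}
```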
Fix 4: Document data pooling methodology
Add a subsection to the CER explaining the globalValueOfDevice computation:
- Formula and grouping criteria
- Justification for pooling across studies
- Limitations (heterogeneity, design differences)
- Cross-reference to the CEP's study design rationale
Fix 5: Clarify equivalence "improvements" language
The changes between legacy and Plus are confirmed to be minor technical changes (software version stabilisation, feature consolidation) — not purely documentary. This means the CER's line 433 ("solely documentary") is inaccurate and must be corrected.
Action required:
- Remove the "solely documentary" claim from line 433
- Create a formal change list comparing legacy to Plus, with each change explicitly assessed for impact on clinical safety and performance (no such document currently exists — it needs to be created)
- Update the equivalence section to reference the change list and conclude that no change impacts clinical safety/performance
Fix 6: Expand clinical literature discussion
- Explain why the 12 excluded articles are preclinical/non-clinical and therefore not "clinical data" per Article 2(48)
- Clarify whether the device-specific search used the same protocol as the SotA search
- If different, document the differences and justify
Fix 7: Add methodology justification narrative
For each study, add discussion of:
- Why image quality exclusion is appropriate (DIQA mirrors real-world use because the device itself rejects poor quality images)
- Dermatoscopic vs clinical image coverage across the study portfolio
- MC_EVCDAO_2019 sample size rationale (exceeded melanoma ratio target at 105 subjects; statistical power maintained)
- Overall sample size adequacy across the 800+ patient portfolio
Fix 8: Improve population coverage narrative
- Compile available demographic data from each study
- Address Fitzpatrick skin type representation
- Address malignant/high-risk condition coverage (melanoma, SCC, BCC representation across studies)
- Justify any demographic gaps with reference to GDPR data minimisation and study design constraints
Fixes required elsewhere
Clinical Benefits and Performance Claims components
The interactive components need to be made BSI-reviewer-friendly:
- Fix broken links to clinical investigations
- Consider generating static summary tables that can be included in the CER as fallback (see the sketch after this list)
- Ensure the PDF export (if provided) renders the data correctly
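One way to produce the static fallback, sketched under the assumption that the claims data is exportable as an array of plain records (the real shape in performanceClaims.ts may differ):

```typescript
// Sketch: serialise claims data into a static markdown table so the CER PDF
// export is self-contained. Field names are assumptions; adapt to the actual
// types in performanceClaims.ts.
interface ClaimRow {
  claim: string;
  study: string;
  metric: string;
  achievedValue: number;
  sampleSize: number;
}

function toMarkdownTable(rows: ClaimRow[]): string {
  const header = '| Claim | Study | Metric | Achieved value | N |';
  const rule = '|---|---|---|---|---|';
  const body = rows.map(
    (r) => `| ${r.claim} | ${r.study} | ${r.metric} | ${r.achievedValue} | ${r.sampleSize} |`,
  );
  return [header, rule, ...body].join('\n');
}
```

Generating the table at build time and committing the output into the CER source keeps the PDF and the interactive components on the same data.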
A significant portion of BSI's Item 3 observations may stem from the CER being reviewed as a rendered web page or PDF where React components did not render correctly. This is a cross-cutting issue that affects Items 2a, 2b, 3a, and 3b. We need a strategy for presenting dynamic data to BSI in a format they can review.
7. Risk assessment
| Risk | Impact | Mitigation |
|---|---|---|
| PMS data gap is the most visible deficiency — BSI explicitly states "no discussion of data from the market...is found" | If not fixed, BSI will escalate: Article 2(48) defines PMS-generated safety and performance information as clinical data, so its omission leaves the clinical evaluation incomplete | Priority 1: integrate legacy PMS data into CER before responding |
| Equivalence "improvements" contradiction could undermine the entire equivalence claim | If BSI concludes the devices are NOT equivalent, all legacy clinical data becomes inapplicable | Clarify language immediately; remove "improvements" or provide detailed impact assessment |
| Interactive component rendering issues could make our response unintelligible to BSI | If BSI cannot read the clinical benefits/performance data, they will escalate regardless of content quality | Provide static summary tables alongside or instead of component references |
| MC_EVCDAO_2019 200→105 could be read as underpowered study | BSI may question whether the reduced sample size maintains statistical validity | Document that the melanoma ratio (34.29%) exceeded the target (20%), maintaining statistical power for the primary endpoint |
8. Addressed weaknesses (BSI auditor perspective)
Resolved
| # | Item | Resolution |
|---|---|---|
| 4 | Document delivery format | Confirmed: PDF export. React components did not render interactively. This is a confirmed root cause for "links don't work" and "hard to follow" observations. |
| 5 | Fitzpatrick data | Resolved: We collected the data from the different studies and added a table to the CER. |
| — | Equivalence: "improvements" vs "documentary" | Resolved: minor technical changes (not purely documentary). A formal change list with impact assessment was created and added to the CER. |
Pending — questions for Jordi
See question-for-jordi.mdx for the full list. Summary:
- Ethics Committee communications: Resolved: Added table to CER.
- ClinicalTrials.gov NCT numbers: Resolved: Added table to CER.
- Publication status: Resolved: Added table to CER.
- Fitzpatrick details: Resolved: Added table to CER.
Regulatory framework: what the BSI meeting revealed
Nick stated that refusal is extremely likely. Item 3 is the centrepiece of the clinical review: it is where BSI assesses whether sufficient clinical evidence exists to support certification. The combination of gaps identified — MRMC studies incorrectly framed as primary clinical evidence, PMS data absent from the CER, MEDDEV stage narration missing, and the "Level 1 and 2" claim that directly contradicts MDCG 2020-6 Appendix III — collectively represents the most serious gap in the entire clinical review. Every one of these must be resolved.
The four applicable guidance documents
| Document | Role for Item 3a |
|---|---|
| MEDDEV 2.7.1 Rev 4, Stages 1–4 (Sections 8–10) | The four MEDDEV stages are the framework that makes a clinical evaluation work. Nick stated explicitly: "Unless you complete stages 0, 1, 2, 3, 4, 5 — they won't work and it will fall down." The CER must narrate each stage visibly: Stage 1 = identification of all pertinent data (literature search + manufacturer-held data); Stage 2 = appraisal of each dataset with named validated tools; Stage 3 = analysis covering every indication, every user group, every population, full duration; Stage 4 = continuous updating. Currently the CER presents data without narrating these stages. A BSI reviewer following the MEDDEV checklist cannot complete the assessment because the stages are not visible. |
| MEDDEV 2.7.1 Rev 4, Section 9 (Stage 2 appraisal) | Each dataset must be individually appraised for: methodological quality (study design, sample size, power calculation, endpoints, controls, GCP compliance); relevance (pivotal vs. supporting); and weighting (defined criteria). Per MDCG 2020-6 § 6.3, validated appraisal tools must be named: Cochrane RCT tool, MINORS, Newcastle-Ottawa Scale, or IMDRF MDCE WG/N56 Appendix F. The current CRIT1-7 appraisal framework used in R-TF-015-011 must be mapped to one of these recognised tools — if it cannot be shown to be equivalent, it must be supplemented or replaced. Complaint/incident ratios alone are not sufficient to prove safety. |
| MDCG 2020-6, Appendix III | 12-level evidence quality hierarchy. Rank 11 = "simulated use / animal / cadaveric testing with HCPs" = NOT clinical data under MDR. Nick confirmed this position explicitly during the BSI meeting. The majority of our pivotal studies (BI_2024, PH_2024, SAN_2024 as MRMC image reviews) map to Rank 11. The CER executive summary's current claim of "high-quality clinical data (Level 1 and 2 according to the hierarchy of clinical evidence)" directly contradicts MDCG 2020-6 Appendix III. This claim must be corrected before submission — it is the kind of verifiable factual error that undermines the entire document's credibility with BSI. The correct per-study evidence ranking per Appendix III must replace it. |
| MDCG 2020-6, § 6.4 | "Sufficient clinical evidence must exist PRIOR to MDR certification. PMCF cannot fill pre-market gaps." The legacy device's 4+ years of market experience (21 contracts, 4,500+ reports, 7 non-serious incidents, 0 serious incidents, 0 FSCAs) is clinical data per MDR Article 2(48). It must be integrated into the CER as Stage 3 analysis — not delegated to the PSUR. The CER's current statement that "there are currently no retrospective PMCF data" (line 660) is factually incorrect — 4,500+ reports over 4 years is extensive PMS data. BSI flagged this explicitly. |
| MDCG 2020-1 | Three-pillar framework for MDSW. Each study in the clinical evidence portfolio must be explicitly mapped to the pillar(s) it supports: VCA (scientific association, from literature review), Technical Performance (algorithm accuracy in controlled conditions, from MRMC studies), or Clinical Performance (validated accuracy in real intended-use context, from real-world studies). MRMC studies contribute to Technical Performance but not Clinical Performance. The CER must present this pillar mapping explicitly — not just list studies. |
| MDCG 2020-13, Section D | BSI checks the literature review against Section D: search protocol, databases used, PICO/PRISMA methods, inclusion/exclusion criteria, both favourable AND unfavourable data, full documentation set (protocol + reports + retrieved list + excluded list with reasons + full-text copies). The device-specific literature search result (12 articles excluded) must be documented in full against Section D — not just mentioned in passing. |
The narration analogy: what Nick identified as the core problem
Nick and Erin both described the same fundamental issue from different angles: the clinical evaluation data exists, but is not narrated according to the recognised regulatory framework. Nick: "We're not even sure what we're seeing at this point because of the way it's being presented." Erin: "It's not clear if you make a statement how that statement is supported — and that traceability element is usually what is lacking."
The analogy identified by Taig in the internal debrief (2026-03-26): the software team had done all the development work correctly but had not framed it according to IEC 62304 design phases. The non-conformity was not about missing work — it was about missing narration. The fix was not to redo the software — it was to restructure how it was described. The same pattern applies to the clinical evaluation.
The primary fix for Item 3a is therefore narrative restructuring, not new data collection. The CER must be re-narrated according to MEDDEV 2.7.1 Rev 4 Stages 1–4, using the exact terminology of the standards, so that a reviewer can follow the chain: identified data → appraised data → analysed data → conclusions.
Correcting the "Level 1 and 2" evidence quality claim
The CER executive summary claims "high-quality clinical data (Level 1 and 2 according to the hierarchy of clinical evidence)." Per MDCG 2020-6 Appendix III (the only applicable hierarchy under MDR), the correct ranking for each study is:
| Study | Design | MDCG 2020-6 Appendix III rank |
|---|---|---|
| MC_EVCDAO_2019 | Prospective analytical observational, 105 patients | Rank 2–4 (methodological quality dependent) |
| IDEI_2023 | Prospective + retrospective, 202 patients | Rank 4 |
| COVIDX_2022 | Prospective observational, 160 patients | Rank 4–7 |
| DAO_O_2022 | Prospective longitudinal, 117 patients | Rank 4 |
| DAO_PH_2022 | Prospective longitudinal, 131 patients | Rank 4 |
| BI_2024 | MRMC image review, 15 HCPs × 100 images | Rank 11 (simulated use) |
| PH_2024 | MRMC image review, 9 PCPs × 30 images | Rank 11 (simulated use) |
| SAN_2024 | MRMC image review, 16 practitioners | Rank 11 (simulated use) |
| AIHS4_2025 | Retrospective longitudinal, 2 patients | Rank 4–7 (very small N) |
| Legacy PMS data | Vigilance/PMS, 4,500+ reports | Rank 7 |
This ranking must replace the "Level 1 and 2" claim in the CER. The claim must be rewritten as: "The clinical evidence portfolio includes studies at MDCG 2020-6 Appendix III Rank 2–4 (high-quality observational studies in real-world clinical settings) as primary evidence, supported by Rank 7 legacy market data and Rank 11 simulated-use studies as corroborating technical performance evidence."
X-3 disease categorisation: the framework for Stage 3 analysis
The X-3 decision (2026-03-28) provides the structure for how MEDDEV Stage 3 "every indication" coverage must be demonstrated. The three-tier structure:
Tier 1 (malignant conditions): Individual analysis per condition. MEDDEV A7.3 requires sensitivity/specificity for major clinical indications individually. Evidence: MC_EVCDAO_2019 (Rank 2–4, 105 patients, 36 melanoma, AUC 0.8482) and IDEI_2023 (Rank 4, 202 patients). These are the only studies that individually satisfy the "major clinical indication" requirement per MEDDEV A7.3 for high-risk conditions.
Tier 2 (rare diseases): Grouped analysis with risk-based justification. Evidence: BI_2024 (Rank 11 as MRMC, but corroborated by PH_2024 and SAN_2024). The acceptance criterion (54% absolute accuracy) must be traced to SotA baseline PCP accuracy for rare skin diseases (~30–40% unaided).
Tier 3 (general conditions, 97% of epidemiological coverage): Pooled analysis with the four-point risk-based justification from X-3. Real-world studies (COVIDX, DAO-O, DAO-PH) provide Rank 4 evidence for this tier.
Declared acceptable gaps: Autoimmune (3% prevalence, Gap A) and genodermatoses (1% prevalence, Gap B) — justified per MDCG 2020-6 § 6.5(e) and addressed by PMCF Activities D.1 and D.2 (specified in issue-6-pmcf.mdx).
This structure — three tiers of evidence with declared gaps — is what satisfies MEDDEV Stage 3's requirement to cover every indication. It must be presented explicitly in the CER, in the order of the evidence hierarchy: Tier 1 (strongest, most scrutinised), Tier 2 (grouped, justified), Tier 3 (pooled, justified), then gaps (declared acceptable, addressed by PMCF).
PMS data: the regulatory obligation per Article 2(48)
Per MDR Article 2(48), clinical data explicitly includes "safety or performance information generated from PMS." The legacy device's market experience is therefore not background information — it is clinical data that must be analysed in the CER under MEDDEV Stage 3. The analysis must include:
- Summary of market experience (2020–2024: 21 contracts, 4,500+ reports, 500+ practitioners)
- All non-serious incidents (7 in 2023): complaint types, investigation outcomes, CAPA actions
- Confirmation of zero serious incidents and zero FSCAs
- Safety conclusion drawn from this data: the absence of serious incidents over 4+ years of market use demonstrates that the severity 4 risk ceiling in the AI risk assessment is justified in practice
This analysis must appear as a standalone section in the CER, not merely as a reference to the PSUR. BSI reviews the CER in isolation — they will not automatically cross-reference the PSUR unless the CER explicitly integrates and summarises the PSUR's findings.