Research and planning
This document is for internal use only. It contains analysis, gap identification, and response strategy for Item 3a of the BSI Clinical Review Round 1. It will not be included in the final response to BSI.
1. What BSI is asking
Item 3a says: "Please address all points above. Please ensure all relevant clinical data is identified and provide sufficient analysis (including traceability, details, discussion and justifications)."
The "points above" are the detailed observations in the Item 3 index, which span five areas. BSI is essentially saying: the CER does not provide a standalone, self-contained clinical evaluation — the reviewer could not follow the analysis, verify traceability, or confirm that all clinical data has been adequately assessed. This is a deficiency finding under Annex XIV and Article 61.
The core regulatory concern is Article 61(1): the manufacturer must specify and justify the level of clinical evidence necessary to demonstrate conformity with the relevant GSPRs, and that level must be appropriate to the device's characteristics and intended purpose.
BSI's six observation areas (mapped to regulatory requirements)
| # | Area | Key concern | Regulatory basis |
|---|---|---|---|
| 1 | Overall analysis | Clinical benefits/performance docs hard to follow; links broken; data pooling unclear; unmet acceptance criteria not discussed | Annex XIV 1(a), 2 |
| 2 | Clinical investigations (MC_DAO, IDEI) | Missing regulatory details: competent authority communication, registration, publication status, protocol deviations | Annex XIV (b), (c) |
| 3 | Evidence sufficiency | Population coverage, traceability to outcomes, methodology justifications, sample sizes | Annex XIV 2, Article 61(1) |
| 4 | Equivalence | High-level assessment; unclear what changed since MDD; contradictory statements about "improvements" | Annex XIV 3 |
| 5 | Clinical literature | No subject-device articles found in literature; unclear if SotA protocol applies to device literature search | Annex XIV (e), Article 2(48) |
| 6 | PMS data | Legacy device marketed since 2020 with 21 contracts and 4,500 reports but no market data in CER | Article 2(48), (51) |
2. What BSI reviewed
- R-TF-015-003 Clinical Evaluation Report (CER)
- R-TF-015-001 Clinical Evaluation Plan (CEP)
- R-TF-015-011 State of the Art
- Clinical investigation reports (R-TF-015-006 series)
- R-TF-007-003 PSUR
- R-TF-007-004 PMS Report
- The "Clinical Benefits" and "Performance Claims" interactive components (rendered in the QMS site)
3. Relevant QMS documents and findings
3.1. CER — Commercialisation status (lines 427–441)
Line 429: "This product has not been commercialized yet. It is undergoing initial CE mark." Line 441: "The legacy device has been commercialized since 2020."
These two statements are not contradictory — they refer to different regulatory entities:
- "This product" = Legit.Health Plus (MDR version, not yet CE-marked under MDR)
- "The legacy device" = Legit.Health (MDD version, on market since 2020)
However, BSI reads the CER as a single document about a single device, so the distinction is confusing. The CER must make this clearer and, critically, must integrate the legacy market experience data into the clinical evaluation rather than treating it as separate.
3.2. CER — PMS data gap (lines 651–662)
Lines 655–656: "Once on the market...the manufacturer will implement a proactive PMS process." (Future tense only.)
Line 660: "Since this clinical evaluation is performed for the initial CE-mark submission...there are currently no retrospective PMCF data."
This is the critical gap. The CER treats PMS as future-only, completely ignoring the legacy device's 4+ years of market experience. But the data exists:
- R-TF-007-003 PSUR (lines 62–98): Documents 21 contracts, 4,500+ reports, 500+ practitioners, 1,000+ patients. Reports 7 non-serious incidents in 2023:
- 4 customer complaints (API deserialization, timeout, algorithm performance mismatch × 2)
- 3 internal non-conformities (image zoom bias, benign pigmentation scoring, misclassification)
- Zero serious incidents, zero FSCAs
- R-TF-007-004 PMS Report (lines 69–98): Confirms zero serious incidents, zero FSCAs. Documents 6 customer complaints (4 classified as non-serious incidents), trend analysis, and CAPA actions.
Root cause: The CER was drafted as if the device had no market history, ignoring that equivalence is claimed with the legacy device. Under Article 2(48), clinical data includes "safety or performance information generated from PMS" — the legacy PMS data is clinical data that must be analysed.
3.3. Clinical investigations — Regulatory details
MC_EVCDAO_2019 (mc-evcdao-2019/r-tf-015-006.mdx)
| Detail | Status | Location |
|---|---|---|
| Ethics committee approval | Exists: CEIM approval February 10, 2020 | CIP r-tf-015-004.mdx, embedded approval PDFs |
| Competent authority (AEMPS) | Planned but not documented: CIP states CIR will be provided to AEMPS, but no record of actual communication | CIP line 191 |
| ClinicalTrials.gov registration | Not found anywhere in documentation | — |
| Publication status | Not found — no mention of whether the study was published in a journal | — |
| Protocol deviations | Partially documented: CIR states "no adverse events/deviations" but separately notes secondary objective (GP comparison) was abandoned due to recruitment difficulties | CIR line 417; CIP line 98 |
| Sample size discrepancy (200 → 105) | Documented: Originally 200 planned with 40 melanoma cases; study closed at 105 subjects with 36 melanoma (34.29%). Justification: exceeded melanoma ratio target; impact of low-quality data compensated by DIQA exclusion | CIR line 653 |
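For the CER narrative, the arithmetic behind that justification is worth stating explicitly — the 20% melanoma ratio target follows from the originally planned 40 cases out of 200 subjects:

$$
\text{planned ratio} = \frac{40}{200} = 20\%, \qquad \text{achieved ratio} = \frac{36}{105} \approx 34.29\% > 20\%
$$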
IDEI_2023 (idei-2023/r-tf-015-006.mdx)
| Detail | Status | Location |
|---|---|---|
| Ethics committee approval | Not explicitly documented in the CIR — compliance statement references regulatory adherence but no specific approval reference found | CIR line 55 |
| Competent authority | Not found | — |
| ClinicalTrials.gov registration | Partially found: CER line 740 mentions 2 records in ClinicalTrials.gov for IDEI_2023 and COVIDX_2022, but no NCT numbers provided | CER lines 685, 740 |
| Publication status | Not found | — |
| Protocol deviations | Not found | — |
| Sample size | 202 patients recruited (108 pigmented lesions + 96 androgenetic alopecia) — appears adequate for stated objectives | CIR lines 142–147 |
3.4. Acceptance criteria — Met vs. not met
BSI says: "Some acceptance criteria have not been met and further analysis/justification has not been performed."
From the CER and study reports:
- MC_EVCDAO_2019: All primary acceptance criteria met (AUC 0.8482 > 0.8 threshold; sensitivity 0.7379 and specificity 0.8054)
- IDEI_2023: AUC 0.7338 (95% CI 0.5971–0.8554) for malignancy detection from retrospective images — the point estimate falls below the 0.8 threshold applied in MC_EVCDAO_2019, though the upper bound of the confidence interval crosses it
- All 8 pivotal studies: CER lines 956–966 state "all safety objectives...have been met"
Gap: The CER does not contain a per-study acceptance criteria reconciliation table showing which criteria were met and which were not, with justifications for any shortfalls. BSI needs this analysis presented explicitly. The <AcceptanceCriteriaTable> component renders this data but BSI noted the "links to CIs do not work" and the presentation is "difficult to follow."
The CER was delivered to BSI as a PDF export. The interactive React components (Clinical Benefits, Performance Claims, AcceptanceCriteriaTable) did not render interactively — links were broken and dynamic tables may have been incomplete or illegible. This is a confirmed document delivery problem that likely accounts for a significant portion of BSI's "difficult to follow" and "links do not work" observations.
Action required: All component-rendered data must be presented as static tables in the CER text, not delegated to React components, so the PDF export is self-contained and readable.
3.5. Data pooling and the globalValueOfDevice
BSI says: "It is unclear how/why data has been pooled or what the categories represent."
The globalValueOfDevice computation (weighted average: Σ(achievedValue × sampleSize) / Σ(sampleSize)) exists in packages/ui/src/components/PerformanceClaimsAndClinicalBenefits/types.ts but is not documented anywhere in the CER or CEP. The CER must explain:
- Why data is pooled (to derive a global performance estimate across heterogeneous study populations)
- How pooling is done (weighted average by sample size, grouped by indication + user group + domain + metric + magnitude + performance subject)
- Limitations of this approach (heterogeneity across study designs, populations, and settings)
This data pooling gap is also flagged in Item 2b as root cause #2: "Data pooling formula undocumented in regulatory documents." The fix must be coordinated — one description of the pooling methodology in the CER, referenced from both the clinical benefits analysis and the performance claims analysis.
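To support Fix 4 below, a minimal TypeScript sketch of the pooling described above — the record shape and grouping key are illustrative assumptions for this note, not the actual definitions in types.ts:

```typescript
// Illustrative sketch of the globalValueOfDevice pooling described above.
// Field names are assumptions for this note; the authoritative definitions
// live in packages/ui/src/components/PerformanceClaimsAndClinicalBenefits/types.ts.
interface StudyResult {
  indication: string;
  userGroup: string;
  domain: string;
  metric: string;
  magnitude: string;
  performanceSubject: string;
  achievedValue: number; // e.g. AUC or sensitivity reported by one study
  sampleSize: number; // subjects contributing to that value
}

// Group by the six pooling dimensions, then take the sample-size-weighted
// average within each group: Σ(achievedValue × sampleSize) / Σ(sampleSize).
function globalValueOfDevice(results: StudyResult[]): Map<string, number> {
  const groups = new Map<string, { weighted: number; n: number }>();
  for (const r of results) {
    const key = [r.indication, r.userGroup, r.domain, r.metric, r.magnitude, r.performanceSubject].join('|');
    const g = groups.get(key) ?? { weighted: 0, n: 0 };
    g.weighted += r.achievedValue * r.sampleSize;
    g.n += r.sampleSize;
    groups.set(key, g);
  }
  const pooled = new Map<string, number>();
  for (const [key, g] of groups) pooled.set(key, g.weighted / g.n);
  return pooled;
}
```

Whatever the actual implementation looks like, the CER only needs to state the formula, the grouping dimensions, and the limitation that weighting by sample size does not correct for heterogeneity across study designs.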
3.6. Clinical literature search
BSI says: "§16.4.4 of the CER seems to state that there are no relevant articles identified on the subject device in the literature."
What the CER actually says (lines 740–744):
- 12 articles found from PubMed (10) and Google Scholar (2) about the device
- All 12 were excluded because they were "proprietary (internal) company articles describing preclinical (in-silico) and non-clinical results"
- ClinicalTrials.gov yielded 2 records (IDEI_2023 and COVIDX_2022) — these are already counted as pre-market CIs
BSI's concern: Excluding all 12 articles means the clinical literature search found zero usable articles about the device. BSI questions whether the SotA search protocol (with its PICO framework and appraisal methodology) was also applied to the device-specific literature search, or whether different methods were used.
Gap: The CER does not clearly distinguish:
- The SotA literature search (227 records → 64 included, about similar/alternative devices and clinical practice) — documented in R-TF-015-011
- The device-specific literature search (15 records → 0 included, about the subject device) — briefly mentioned in CER lines 678–744
The methodology, keywords, and appraisal criteria for the device-specific search should be explicitly described, and any differences from the SotA protocol should be justified.
3.7. Equivalence assessment
BSI says: "§16.24 of the CER seems to state that improvements have been made. It is unclear what these changes are."
CER line 563: "The improvements introduced in Legit.Health Plus — mainly related to software version stabilisation and the consolidation of features..."
CER line 433: "All differences between the two versions are solely documentary..."
These are contradictory: "solely documentary" vs "software version stabilisation and consolidation of features." BSI rightly flags this. The CER must either:
- Clarify that the changes are purely documentary (remove the "improvements" language), or
- List what specifically changed and justify that the changes do not impact clinical safety/performance
The equivalence tables (lines 498–557) use "Same" for almost every row, which BSI reads as superficial. More detail is needed on:
- What specific software changes were made (even if documentary)
- How these were assessed for impact on performance
- Reference to any design change records
3.8. Dermatoscopic camera concern
BSI says: "photos taken with dermatoscopic camera only (is this representative of how the device will be used)"
What the studies used:
- MC_EVCDAO_2019: DermLite Foto X dermatoscope with smartphones (Pixel 3, Galaxy S10, iPhone X)
- IDEI_2023: Mix of dermatoscopic (87.5% retrospective) and clinical (100% prospective) images
- Other studies (BI_2024, PH_2024, SAN_2024): Varied — some use clinical images only
Intended device use: The IFU states the device accepts clinical (non-dermoscopic) images. The device includes DIQA (Dermatology Image Quality Assessment) to validate image quality.
Gap: The CER does not explicitly address whether the validation studies are representative of real-world image acquisition. If most studies used dermatoscopic images but the device is intended for clinical images taken by non-specialist users, there is a representativeness concern. The CER needs a discussion of:
- Which studies used which image types
- How performance varies between dermatoscopic and clinical images
- Why the evidence is representative of intended use
3.9. Population coverage
BSI says: "How do the CIs sufficiently cover all/representative patient populations (age, pigment, sex, etc) and indications"
CER line 840: "Over 800 patients across eight pivotal studies."
Gap: The CER does not provide a demographic breakdown across studies. The PSUR notes GDPR-driven data minimisation (no demographic collection beyond clinical necessity), which is a legitimate regulatory constraint but creates a gap for BSI. The CER needs to:
- Describe what demographic data IS available from each study
- Justify any gaps in demographic coverage by reference to GDPR data minimisation principles
- Discuss Fitzpatrick skin type coverage (critical for AI dermatology devices)
- Address coverage of malignant/high-risk conditions specifically
4. Gap analysis
| # | BSI concern | What we have | What's missing | Severity |
|---|---|---|---|---|
| 1 | Clinical benefits hard to follow | React components with programmatic claim assignment | Narrative CER analysis explaining benefits, methodology, limitations; static-friendly tables | High |
| 2 | Performance claims hard to follow, broken links | 148 claims in performanceClaims.ts, dynamic tables | Same as above; PDF-exportable summary; broken link fix | High |
| 3 | Data pooling unclear | globalValueOfDevice computation in code | CER documentation of pooling methodology, justification, limitations | High |
| 4 | Unmet acceptance criteria not discussed | All criteria appear to be met (or borderline with CIs crossing thresholds) | Per-study reconciliation table with explicit met/not-met status and justifications | High |
| 5 | CI regulatory details (AEMPS, registration, publication, deviations) | Ethics approval exists for MC_DAO; IDEI has ClinicalTrials.gov entries; deviation documented in CIP | CER text explicitly stating: competent authority communications, NCT numbers, publication status, protocol deviations for each CI | High |
| 6 | CI methodology justifications | Sample size calculations in CIPs; DIQA quality validation | CER narrative on: photo quality removal rationale, dermatoscopic vs clinical images, MC_DAO 200→105 justification, sample size adequacy | High |
| 7 | Population coverage | 800+ patients across 8 studies; limited demographics due to GDPR | Demographic breakdown per study; Fitzpatrick coverage; malignant condition coverage analysis | Medium |
| 8 | Equivalence lacks detail | Equivalence tables exist; "Same" comparisons | Clarify "improvements" vs "documentary changes" contradiction; list specific changes; impact assessment | High |
| 9 | No subject-device literature | 12 articles found but excluded (preclinical/internal) | Explain why excluded articles are not clinical data; clarify protocol differences between SotA and device search | Medium |
| 10 | No PMS data in CER | PSUR and PMS Report document 7 non-serious incidents, 0 serious, 0 FSCAs | Integrate legacy PMS data into CER: complaints, incidents, trend analysis, safety conclusions | Critical |
5. Cross-NC connections
Clinical Review Item 2b — Clinical benefits, performance, safety vs SotA
Item 2b research identified overlapping gaps:
- SotA traceability: acceptanceCriteriaStateOfTheArtValue exists in data but the provenance chain to specific SotA articles is broken → same gap applies to Item 3a's "traceability to outcomes" concern
- Data pooling documentation: globalValueOfDevice undocumented → same as Item 3a gap #3
- Top-1 accuracy: Some Top-1 metrics appear below SotA baselines → may be what BSI means by "some acceptance criteria have not been met"
- Use environment: remote care sub-criterion within benefit 3KX vs intended use environment → tangentially relevant to IDEI study's teledermatology context
Clinical Review Item 3b — Data sufficiency justification
Item 3b asks for justification that "sufficient data in quantity and quality has been analyzed." The research here directly feeds into 3b:
- Sample size adequacy (800+ patients, formal calculations)
- Population representativeness
- Data quality methodology (DIQA)
Technical Review M1.Q1 — IFU performance claims
M1.Q1 research has a cross-NC connection section identifying shared issues:
- SotA baselines traceability
- Top-1 accuracy vs IFU claims
- Data pooling documentation
- 239 vs 346 ICD-11 category reconciliation
6. Response strategy
Approach: Fix-then-reference
Per the BSI NC CLAUDE.md, the workflow is: analyse → fix the documentation → write the response referencing what was fixed.
Fixes required in the CER (R-TF-015-003)
Fix 1: Integrate legacy PMS data (Critical)
Add a new subsection under "Clinical data generated from risk management and PMS activities" that:
- Summarises the legacy device's market experience (2020–2024: 21 contracts, 4,500+ reports, 500+ practitioners, 1,000+ patients)
- Lists all non-serious incidents from the PSUR/PMS Report (7 incidents in 2023)
- Confirms zero serious incidents and zero FSCAs
- Analyses trends and CAPA outcomes
- Draws safety conclusions from the market data
- References R-TF-007-003 PSUR and R-TF-007-004 PMS Report
Fix 2: Add per-CI regulatory detail table
For each clinical investigation, add explicit documentation of:
- Competent authority communication (or statement that none was required for observational studies under national law)
- Clinical trial registration status and numbers
- Publication status
- Protocol deviations (or explicit "none" statement)
- Ethics committee approval reference
Fix 3: Add acceptance criteria reconciliation
Add a per-study table showing:
- Each acceptance criterion
- The achieved value
- Met/not-met status
- Justification for any borderline or unmet criteria
- This should be presented as static content (not React components) to ensure readability in the PDF export (a data-shape sketch follows this list)
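As a drafting aid, a minimal sketch of the information each reconciliation row must carry, with a borderline rule matching the IDEI_2023 case above — all names are illustrative assumptions, not existing CER fields:

```typescript
// Illustrative shape for one row of the per-study acceptance criteria
// reconciliation table; field names are assumptions for this note.
interface ReconciliationRow {
  study: string; // e.g. "MC_EVCDAO_2019"
  criterion: string; // e.g. "AUC > 0.8 for malignancy detection"
  threshold: number;
  achievedValue: number;
  ci95?: [number, number]; // 95% confidence interval, where reported
  justification?: string; // mandatory for borderline or unmet criteria
}

// A criterion whose point estimate misses the threshold but whose CI upper
// bound crosses it (e.g. IDEI_2023: AUC 0.7338, CI 0.5971–0.8554 vs 0.8)
// is flagged "borderline" and must carry an explicit justification.
function status(row: ReconciliationRow): 'met' | 'borderline' | 'not met' {
  if (row.achievedValue >= row.threshold) return 'met';
  if (row.ci95 && row.ci95[1] >= row.threshold) return 'borderline';
  return 'not met';
}
```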
Fix 4: Document data pooling methodology
Add a subsection to the CER explaining the globalValueOfDevice computation:
- Formula and grouping criteria
- Justification for pooling across studies
- Limitations (heterogeneity, design differences)
- Cross-reference to the CEP's study design rationale
Fix 5: Clarify equivalence "improvements" language
The changes between legacy and Plus are confirmed to be minor technical changes (software version stabilisation, feature consolidation) — not purely documentary. This means the CER's line 433 ("solely documentary") is inaccurate and must be corrected.
Action required:
- Remove the "solely documentary" claim from line 433
- Create a formal change list comparing legacy to Plus, with each change explicitly assessed for impact on clinical safety and performance (no such document currently exists — it needs to be created)
- Update the equivalence section to reference the change list and conclude that no change impacts clinical safety/performance
Fix 6: Expand clinical literature discussion
- Explain why the 12 excluded articles are preclinical/non-clinical and therefore not "clinical data" per Article 2(48)
- Clarify whether the device-specific search used the same protocol as the SotA search
- If different, document the differences and justify
Fix 7: Add methodology justification narrative
For each study, add discussion of:
- Why image quality exclusion is appropriate (DIQA mirrors real-world use because the device itself rejects poor quality images)
- Dermatoscopic vs clinical image coverage across the study portfolio
- MC_EVCDAO_2019 sample size rationale (exceeded melanoma ratio target at 105 subjects; statistical power maintained)
- Overall sample size adequacy across the 800+ patient portfolio
Fix 8: Improve population coverage narrative
- Compile available demographic data from each study
- Address Fitzpatrick skin type representation
- Address malignant/high-risk condition coverage (melanoma, SCC, BCC representation across studies)
- Justify any demographic gaps with reference to GDPR data minimisation and study design constraints
Fixes required elsewhere
Clinical Benefits and Performance Claims components
The interactive components need to be made BSI-reviewer-friendly:
- Fix broken links to clinical investigations
- Consider generating static summary tables that can be included in the CER as fallback (see the sketch after this list)
- Ensure the PDF export (if provided) renders the data correctly
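One way to produce the static fallback, sketched under the assumption that the claims data is exportable as an array of plain records (the real shape in performanceClaims.ts may differ):

```typescript
// Sketch: serialise claims data into a static markdown table so the CER PDF
// export is self-contained. Field names are assumptions; adapt to the actual
// types in performanceClaims.ts.
interface ClaimRow {
  claim: string;
  study: string;
  metric: string;
  achievedValue: number;
  sampleSize: number;
}

function toMarkdownTable(rows: ClaimRow[]): string {
  const header = '| Claim | Study | Metric | Achieved value | N |';
  const rule = '|---|---|---|---|---|';
  const body = rows.map(
    (r) => `| ${r.claim} | ${r.study} | ${r.metric} | ${r.achievedValue} | ${r.sampleSize} |`,
  );
  return [header, rule, ...body].join('\n');
}
```

Generating the table at build time and committing the output into the CER source keeps the PDF and the interactive components on the same data.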
A significant portion of BSI's Item 3 observations may stem from the CER being reviewed as a rendered web page or PDF where React components did not render correctly. This is a cross-cutting issue that affects Items 2a, 2b, 3a, and 3b. We need a strategy for presenting dynamic data to BSI in a format they can review.
7. Risk assessment
| Risk | Impact | Mitigation |
|---|---|---|
| PMS data gap is the most visible deficiency — BSI explicitly states "no discussion of data from the market...is found" | If not fixed, BSI will escalate: Article 2(48) defines PMS-generated safety and performance information as clinical data, so its omission leaves the clinical evaluation incomplete | Priority 1: integrate legacy PMS data into CER before responding |
| Equivalence "improvements" contradiction could undermine the entire equivalence claim | If BSI concludes the devices are NOT equivalent, all legacy clinical data becomes inapplicable | Clarify language immediately; remove "improvements" or provide detailed impact assessment |
| Interactive component rendering issues could make our response unintelligible to BSI | If BSI cannot read the clinical benefits/performance data, they will escalate regardless of content quality | Provide static summary tables alongside or instead of component references |
| MC_EVCDAO_2019 200→105 could be read as underpowered study | BSI may question whether the reduced sample size maintains statistical validity | Document that the melanoma ratio (34.29%) exceeded the target (20%), maintaining statistical power for the primary endpoint |
8. Addressed weaknesses (BSI auditor perspective)
Resolved
| # | Item | Resolution |
|---|---|---|
| 4 | Document delivery format | Confirmed: PDF export. React components did not render interactively. This is a confirmed root cause for "links don't work" and "hard to follow" observations. |
| 5 | Fitzpatrick data | Resolved: We collected the data from the different studies and added a table to the CER. |
| — | Equivalence: "improvements" vs "documentary" | Resolved: minor technical changes (not purely documentary). A formal change list with impact assessment was created and added to the CER. |
Pending — questions for Jordi
See question-for-jordi.mdx for the full list. Summary:
- Ethics Committee communications: Resolved: Added table to CER.
- ClinicalTrials.gov NCT numbers: Resolved: Added table to CER.
- Publication status: Resolved: Added table to CER.
- Fitzpatrick details: Resolved: Added table to CER.
Regulatory framework: what the BSI meeting revealed
Nick stated that refusal is extremely likely. Item 3 is the centrepiece of the clinical review: it is where BSI assesses whether sufficient clinical evidence exists to support certification. The combination of gaps identified — MRMC studies incorrectly framed as primary clinical evidence, PMS data absent from the CER, MEDDEV stage narration missing, and the "Level 1 and 2" claim that directly contradicts MDCG 2020-6 Appendix III — collectively represents the most serious gap in the entire clinical review. Every one of these must be resolved.
The four applicable guidance documents
| Document | Role for Item 3a |
|---|---|
| MEDDEV 2.7.1 Rev 4, Stages 1–4 (Sections 8–10) | The four MEDDEV stages are the framework that makes a clinical evaluation work. Nick stated explicitly: "Unless you complete stages 0, 1, 2, 3, 4, 5 — they won't work and it will fall down." The CER must narrate each stage visibly: Stage 1 = identification of all pertinent data (literature search + manufacturer-held data); Stage 2 = appraisal of each dataset with named validated tools; Stage 3 = analysis covering every indication, every user group, every population, full duration; Stage 4 = continuous updating. Currently the CER presents data without narrating these stages. A BSI reviewer following the MEDDEV checklist cannot complete the assessment because the stages are not visible. |
| MEDDEV 2.7.1 Rev 4, Section 9 (Stage 2 appraisal) | Each dataset must be individually appraised for: methodological quality (study design, sample size, power calculation, endpoints, controls, GCP compliance); relevance (pivotal vs. supporting); and weighting (defined criteria). Per MDCG 2020-6 § 6.3, validated appraisal tools must be named: Cochrane RCT tool, MINORS, Newcastle-Ottawa Scale, or IMDRF MDCE WG/N56 Appendix F. The current CRIT1-7 appraisal framework used in R-TF-015-011 must be mapped to one of these recognised tools — if it cannot be shown to be equivalent, it must be supplemented or replaced. Complaint/incident ratios alone are not sufficient to prove safety. |
| MDCG 2020-6, Appendix III | 12-level evidence quality hierarchy. Rank 11 = "simulated use / animal / cadaveric testing with HCPs" = NOT clinical data under MDR. Nick confirmed this position explicitly during the BSI meeting. The majority of our pivotal studies (BI_2024, PH_2024, SAN_2024 as MRMC image reviews) map to Rank 11. The CER executive summary's current claim of "high-quality clinical data (Level 1 and 2 according to the hierarchy of clinical evidence)" directly contradicts MDCG 2020-6 Appendix III. This claim must be corrected before submission — it is the kind of verifiable factual error that undermines the entire document's credibility with BSI. The correct per-study evidence ranking per Appendix III must replace it. |
| MDCG 2020-6, § 6.4 | "Sufficient clinical evidence must exist PRIOR to MDR certification. PMCF cannot fill pre-market gaps." The legacy device's 4+ years of market experience (21 contracts, 4,500+ reports, 7 non-serious incidents, 0 serious incidents, 0 FSCAs) is clinical data per MDR Article 2(48). It must be integrated into the CER as Stage 3 analysis — not delegated to the PSUR. The CER's current statement that "there are currently no retrospective PMCF data" (line 660) is factually incorrect — 4,500+ reports over 4 years is extensive PMS data. BSI flagged this explicitly. |
| MDCG 2020-1 | Three-pillar framework for MDSW. Each study in the clinical evidence portfolio must be explicitly mapped to the pillar(s) it supports: VCA (scientific association, from literature review), Technical Performance (algorithm accuracy in controlled conditions, from MRMC studies), or Clinical Performance (validated accuracy in real intended-use context, from real-world studies). MRMC studies contribute to Technical Performance but not Clinical Performance. The CER must present this pillar mapping explicitly — not just list studies. |
| MDCG 2020-13, Section D | BSI checks the literature review against Section D: search protocol, databases used, PICO/PRISMA methods, inclusion/exclusion criteria, both favourable AND unfavourable data, full documentation set (protocol + reports + retrieved list + excluded list with reasons + full-text copies). The device-specific literature search result (12 articles excluded) must be documented in full against Section D — not just mentioned in passing. |
The narration analogy: what Nick identified as the core problem
Nick and Erin both described the same fundamental issue from different angles: the clinical evaluation data exists, but is not narrated according to the recognised regulatory framework. Nick: "We're not even sure what we're seeing at this point because of the way it's being presented." Erin: "It's not clear if you make a statement how that statement is supported — and that traceability element is usually what is lacking."
The analogy identified by Taig in the internal debrief (2026-03-26): the software team had done all the development work correctly but had not framed it according to IEC 62304 design phases. The non-conformity was not about missing work — it was about missing narration. The fix was not to redo the software — it was to restructure how it was described. The same pattern applies to the clinical evaluation.
The primary fix for Item 3a is therefore narrative restructuring, not new data collection. The CER must be re-narrated according to MEDDEV 2.7.1 Rev 4 Stages 1–4, using the exact terminology of the standards, so that a reviewer can follow the chain: identified data → appraised data → analysed data → conclusions.
Correcting the "Level 1 and 2" evidence quality claim
The CER executive summary claims "high-quality clinical data (Level 1 and 2 according to the hierarchy of clinical evidence)." Per MDCG 2020-6 Appendix III (the only applicable hierarchy under MDR), the correct ranking for each study is:
| Study | Design | MDCG 2020-6 Appendix III rank |
|---|---|---|
| MC_EVCDAO_2019 | Prospective analytical observational, 105 patients | Rank 2–4 (methodological quality dependent) |
| IDEI_2023 | Prospective + retrospective, 202 patients | Rank 4 |
| COVIDX_2022 | Prospective observational, 160 patients | Rank 4–7 |
| DAO_O_2022 | Prospective longitudinal, 117 patients | Rank 4 |
| DAO_PH_2022 | Prospective longitudinal, 131 patients | Rank 4 |
| BI_2024 | MRMC image review, 15 HCPs × 100 images | Rank 11 (simulated use) |
| PH_2024 | MRMC image review, 9 PCPs × 30 images | Rank 11 (simulated use) |
| SAN_2024 | MRMC image review, 16 practitioners | Rank 11 (simulated use) |
| AIHS4_2025 | Retrospective longitudinal, 2 patients | Rank 4–7 (very small N) |
| Legacy PMS data | Vigilance/PMS, 4,500+ reports | Rank 7 |
This ranking must replace the "Level 1 and 2" claim in the CER. The claim must be rewritten as: "The clinical evidence portfolio includes studies at MDCG 2020-6 Appendix III Rank 2–4 (high-quality observational studies in real-world clinical settings) as primary evidence, supported by Rank 7 legacy market data and Rank 11 simulated-use studies as corroborating technical performance evidence."
X-3 disease categorisation: the framework for Stage 3 analysis
The X-3 decision (2026-03-28) provides the structure for how MEDDEV Stage 3 "every indication" coverage must be demonstrated. The three-tier structure:
Tier 1 (malignant conditions): Individual analysis per condition. MEDDEV A7.3 requires sensitivity/specificity for major clinical indications individually. Evidence: MC_EVCDAO_2019 (Rank 2–4, 105 patients, 36 melanoma, AUC 0.8482) and IDEI_2023 (Rank 4, 202 patients). These are the only studies that individually satisfy the "major clinical indication" requirement per MEDDEV A7.3 for high-risk conditions.
Tier 2 (rare diseases): Grouped analysis with risk-based justification. Evidence: BI_2024 (Rank 11 as MRMC, but corroborated by PH_2024 and SAN_2024). The acceptance criterion (54% absolute accuracy) must be traced to SotA baseline PCP accuracy for rare skin diseases (~30–40% unaided).
Tier 3 (general conditions, 97% of epidemiological coverage): Pooled analysis with the four-point risk-based justification from X-3. Real-world studies (COVIDX, DAO-O, DAO-PH) provide Rank 4 evidence for this tier.
Declared acceptable gaps: Autoimmune (3% prevalence, Gap A) and genodermatoses (1% prevalence, Gap B) — justified per MDCG 2020-6 § 6.5(e) and addressed by PMCF Activities D.1 and D.2 (specified in issue-6-pmcf.mdx).
This structure — three tiers of evidence with declared gaps — is what satisfies MEDDEV Stage 3's requirement to cover every indication. It must be presented explicitly in the CER, in the order of the evidence hierarchy: Tier 1 (strongest, most scrutinised), Tier 2 (grouped, justified), Tier 3 (pooled, justified), then gaps (declared acceptable, addressed by PMCF).
PMS data: the regulatory obligation per Article 2(48)
Per MDR Article 2(48), clinical data explicitly includes "safety or performance information generated from PMS." The legacy device's market experience is therefore not background information — it is clinical data that must be analysed in the CER under MEDDEV Stage 3. The analysis must include:
- Summary of market experience (2020–2024: 21 contracts, 4,500+ reports, 500+ practitioners)
- All non-serious incidents (7 in 2023): complaint types, investigation outcomes, CAPA actions
- Confirmation of zero serious incidents and zero FSCAs
- Safety conclusion drawn from this data: the absence of serious incidents over 4+ years of market use demonstrates that the severity 4 risk ceiling in the AI risk assessment is justified in practice
This analysis must appear as a standalone section in the CER, not merely as a reference to the PSUR. BSI reviews the CER in isolation — they will not automatically cross-reference the PSUR unless the CER explicitly integrates and summarises the PSUR's findings.