Pre-submission review of R-TF-015-001 CEP and R-TF-015-003 CER
Review date: 2026-04-19
Reviewers: audit-deliverable-reviewer + bsi-clinical-auditor (emulating Erin Preiss / Nick)
Files reviewed:
- apps/qms/docs/legit-health-plus-version-1-1-0-0/product-verification-and-validation/clinical/Evaluation/R-TF-015-001-Clinical-Evaluation-Plan.mdx (985 lines)
- apps/qms/docs/legit-health-plus-version-1-1-0-0/product-verification-and-validation/clinical/Evaluation/R-TF-015-003-Clinical-Evaluation-Report.mdx (2,638 lines)
Verdict: Neither document is audit-ready. The CEP has CRITICAL MDX formatting corruption, a commercialisation-status contradiction, and a Plan-written-as-Report framing defect. The CER has filesystem-path leaks, Pillar 3 evidence under-powering for the claimed indication scope, and an over-weighted Rank 4 classification of the legacy RWE study.
Part A — R-TF-015-001 Clinical Evaluation Plan (CEP)
A.1 — Audit-deliverable leaks (audit-deliverable-reviewer)
CRITICAL leaks (block submission)
A.1.C1 — Line 808, 849, 853: corrupted MDX escape/italic rendering
Three lines contain the mixed \_..._* pattern that will render as a literal backslash-underscore before "Analysis" and an unclosed asterisk in the exported PDF. Lines 849 and 853 also have APASI*2025 and NMSC*2025 (missing underscore).
- Line 808: "section \_Analysis of published severity validation studies*"
- Line 849: "APASI*2025, AUAS_2023, AIHS4_2023, and ASCORAD_2022 ... section \_Analysis of published severity validation studies*"
- Line 853: "NMSC*2025 ... section \_Per-study evidence appraisal*"
Fix: replace with section "Analysis of published severity validation studies" etc., and restore underscores in study codes (APASI_2025, NMSC_2025).
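The repair is mechanical enough to script. A minimal sketch, assuming the corruption is limited to the two patterns flagged above (the function name and regexes are illustrative, not part of the QMS tooling):

```python
import re

# Hypothetical repair pass for the corrupted spans found in the CEP.
# Assumption: only two corruption patterns exist, as flagged above:
#   1. \_Title*   -> "Title"    (escaped-underscore open, stray asterisk close)
#   2. CODE*2025  -> CODE_2025  (underscore in a study code replaced by *)

def repair_mdx_line(line: str) -> str:
    # Pattern 1: backslash-underscore ... stray asterisk around a section title.
    line = re.sub(r'\\_([^*]+?)\*', r'"\1"', line)
    # Pattern 2: study codes like APASI*2025 / NMSC*2025 -> underscore form.
    line = re.sub(r'\b([A-Z0-9]+)\*(\d{4})\b', r'\1_\2', line)
    return line

print(repair_mdx_line(r'section \_Analysis of published severity validation studies*'))
print(repair_mdx_line('APASI*2025 and NMSC*2025'))
```

Any such script should be run in report-only mode first and the three flagged lines diffed by hand, since a stray `*` elsewhere in the MDX could be legitimate emphasis.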
A.1.C2 — Line 286 vs. lines 31, 401, 676, 888: commercialisation-status contradiction
Line 286 says "The product is a legacy device which has been marketed since 2020"; lines 31/401/676/888 say the opposite (first CE-marking, not yet commercialised). The legacy device (Legit.Health) has been marketed since 2020; Legit.Health Plus is first-time MDR submission. Line 286 conflates the two devices and undermines the equivalence argument.
A.1.C3 — Line 875: "AI Labs Group S.L." inside study objective violates QMS wording rule
Line 875: "Primary objective: To ascertain the validity of the device, leveraging artificial intelligence developed by AI Labs Group S.L., in objectively and reliably tracking...".
Context: AI Labs Group S.L. is the current legal entity of the manufacturer; Legit.Health is the product brand. The legal entity is correctly disclosed once via <ManufacturerDetails /> at line 294. Repeating it mid-sentence in a study objective violates the QMS wording rule in apps/qms/CLAUDE.md: "NEVER use the company name. Use 'we', 'us', or 'the manufacturer' instead. Enables rebranding and avoids third-person self-reference."
Fix: strip the phrase to "Primary objective: To ascertain the validity of the device in objectively and reliably tracking the progression of chronic dermatological conditions."
Line 665 (literature search "Legit.Health" and "AI Labs Group"): no action required. Searching both the product brand and the legal entity name is standard literature-search practice because authors may affiliate with either — this is a sound query, not a leak. (Note: this finding was initially mis-flagged by the audit-deliverable-reviewer as a "former legal entity" issue; the agent was wrong — AI Labs Group S.L. is the current legal entity, and the dual-term search is the correct way to capture all manufacturer-associated publications.)
HIGH
- Line 316 — Brand name "Legit.Health" used in IFU-style sentence instead of "the manufacturer"; voice shift to second person.
- Line 137 — Subject-verb disagreement: "a deep equivalence analysis have been carried out" → "has been carried out".
- Line 251 — Martorell experience: 17 years in column vs. "over 10 years" in justification; unify.
- Line 660 — Informal honorific "Mr. Jordi Barrachina" — drop the honorific.
- Lines 654–655 — Bullet split mid-sentence ("...searches the body and text of articles" / "for search terms...").
- Line 608 — "The current Clinical Evaluation aims to confirm..." — a CEP should define, not report.
- Line 594 — Literal placeholder "(x occurrence per patient)".
- Line 371 — Engineering changelog language ("migration to microservices, HL7 FHIR implementation, database encryption upgrades") belongs in the Software Architecture Description, not the CEP.
MEDIUM
- Line 401 — "is not yet CE marked and has not been commercialized yet" — double "yet".
- Line 676 — Non-native "has not still been commercialized".
- Line 248 — "a long experience" is non-idiomatic; 4 years for a Class IIb MDR CAM is thin.
- Lines 873–882 — Study-code inconsistency: "BI 2024" in the table vs. "BI_2024" in prose; unify on the underscore form.
- Lines 660, 880 — Date-format inconsistency ("July 15, 2025", "November 13rd, 2024", "September 13rd" — "13rd" is a typo for "13th").
- Line 427 — Word salad: "medical devices used in dermatology".
- Line 156 — R-TF-015-004 referenced as a single ID for all 9 CIPs; precision needed.
- Line 495 — "clinical equivalence and continuity" is not a standard MDCG term.
LOW
- Line 371 — "feature consolidation" imprecise
- Line 389 — NLP not in acronym table
- Line 427 — Table cell runs ~300 words of continuous prose
- Line 985 — Trailing U+3164 Hangul filler after <Signature />
- Line 564 — "62 distinct risk IDs" stated without RMF cross-citation
A.2 — BSI clinical auditor findings (Erin / Nick)
CRITICAL non-conformities
A.2.C1 — Plan written retrospectively
Lines 27, 147, 152, 775, 873–881 — the CEP describes "8 pivotal clinical investigations were designed and conducted" as past tense, with all studies marked "State of process: Completed" and dates from 2024–2025. A Plan must be written in advance (MDR Article 61(3); Annex XIV Part A §1(a); MEDDEV 2.7.1 Rev 4 Annex A3).
Fix: add a version-history table with a Rev 1 date; rewrite in future/planned tense; move genuinely completed studies to a distinct "Already-completed pivotal investigations contributing to this evaluation" section with Article 120(3) framing.
A.2.C2 — Classification rationale: three prose-level weaknesses in an otherwise correct IIb defense
Resolution of the earlier concern. The device's Class IIb classification under MDR Rule 11 is correct and well-framed. The authoritative defense in r-tf-description-and-specification.mdx §"Guideline MDCG 2019-11" (lines 116–187) applies the IMDRF SaMD matrix from MDCG 2019-11 Annex III:
- Significance of information: Drives Clinical Management (IMDRF 5.1.2) in all cases. The device outputs probability distributions and severity scores that aid HCP decisions; biopsy + histopathology remains the gold standard for diagnostic confirmation; the IFU positions the device as decision-support, not standalone diagnosis. This is correct.
- State of healthcare situation: Serious (IMDRF 5.2.2) for the large majority of the 346 ICD-11 categories in the indication; Critical (IMDRF 5.2.1) for melanoma (ICD-11 2C30).
- Applicable cell: Critical × Drives = Category III.i = Class IIb. Under Rule 11, classification follows the highest applicable cell across the intended use; the inclusion of melanoma in the indication is the direct regulatory basis for Class IIb.
The earlier framing in this review ("Rule 11 pushes melanoma-impact decisions toward Class III"; "Class IIa framing elsewhere — inter-document inconsistency") is retracted. The IIa appearance within the matrix analysis is the would-be cell for the 345 non-melanoma categories, correctly superseded by the melanoma Critical cell; it is not a parallel IIa claim. The classification itself is not a non-conformity. The prose carrying the defense has three specific weaknesses that must be fixed before submission.
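The "highest applicable cell" logic can be made concrete with a small sketch. A minimal illustration, assuming the two-axis cell-to-class mapping summarised above for the "Drives clinical management" column (the mapping and indication list here are illustrative, not an authoritative restatement of MDCG 2019-11):

```python
# Sketch of MDR Rule 11 classification via the MDCG 2019-11 / IMDRF matrix.
# Keys: (state of healthcare situation, significance of information).
# Values: MDR class for the "drives clinical management" column, per the
# matrix analysis discussed above (illustrative subset only).
MATRIX = {
    ("critical", "drives"): "IIb",   # Category III.i
    ("serious", "drives"): "IIa",    # Category III.ii
    ("non-serious", "drives"): "IIa",
}

CLASS_ORDER = ["I", "IIa", "IIb", "III"]  # ascending MDR class severity

def classify(indication_cells):
    """Rule 11: the device takes the highest class across all applicable cells."""
    classes = [MATRIX[cell] for cell in indication_cells]
    return max(classes, key=CLASS_ORDER.index)

# 345 non-melanoma categories map to Serious x Drives; melanoma to Critical x Drives.
cells = [("serious", "drives")] * 345 + [("critical", "drives")]
print(classify(cells))  # the single melanoma cell drives the result to IIb
```

This is exactly why the IIa cell appearing in the matrix analysis is not a parallel claim: it is an input to the max, not an output.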
Weakness 1 — CEP lines 350–359: generic "potential for harm" rationale does not work through Rule 11
Current CEP classifies the device as IIb on the basis that "significant potential for patient harm from a misdiagnosis justifies classifying the software as Class IIb." This phrasing does not walk Rule 11's decision tree, does not invoke the IMDRF 5.1.2 / 5.2.1 / 5.2.2 mapping, and does not cross-reference the authoritative MDCG 2019-11 matrix analysis. A BSI reviewer reading the CEP in isolation will not see the reasoning that actually supports IIb, and may probe the generic rationale alone.
Fix: replace the body of "Device classification" (lines 350–359) with a pointer to the single authoritative defense:
"The device is classified as Class IIb under MDR Rule 11 (Annex VIII, Regulation (EU) 2017/745). The full classification analysis — including IMDRF 5.1.2 / 5.2.1 / 5.2.2 mapping per MDCG 2019-11 Annex III — is documented in R-TF-DESCRIPTION-AND-SPECIFICATION §'Guideline MDCG 2019-11'."
Weakness 2 — r-tf-description-and-specification.mdx lines 133 and 185: "err on the side of caution" framing makes IIb sound voluntary
Current text (line 185): "we have decided to classify the device as Class IIb, even though the guidance suggests Class IIa, because we prefer to err on the side of caution." (Line 133 carries the same framing.)
This reads as a strategic manufacturer choice rather than a regulatory duty. Two BSI exposures: (a) it implies the manufacturer wanted IIa and conceded to IIb, which invites probing of whether the classification is a discretionary position that could be walked back; (b) it suggests the clinical-evidence bar may have been set for IIa, creating a follow-on concern about evidence sufficiency for a IIb device. The regulatory truth is that IIb is not voluntary — it is the direct application of Rule 11's "highest applicable cell" logic to an indication that includes a Critical condition (melanoma).
Fix: replace lines 133 and 185 with regulatory-consequence prose:
"The device's intended use spans a broad indication set of 346 ICD-11 categories. For the large majority of these categories, the state of healthcare situation meets the IMDRF 5.2.2 Serious criteria; the applicable cell for those outputs is Serious × Drives (Category III.ii, Class IIa). The indication also includes melanoma (ICD-11 2C30), which meets the IMDRF 5.2.1 Critical definition because delayed diagnosis can progress to metastatic disease with irreversible consequences. For this subset of the intended use, the applicable cell is Critical × Drives (Category III.i, Class IIb).
Under MDR Rule 11, device classification is determined by the highest applicable cell across the intended use. The device is therefore classified as Class IIb."
Weakness 3 — r-tf-description-and-specification.mdx line 179: "0.03% of categories" argument is risky and unnecessary
Current argument: "only 0,03% of categories are time-critical, but most importantly: the device's purpose is not to determine whether or not that one time-critical condition is present (positive / negative), but to provide an distributions of all ICD categories."
This effectively argues that the melanoma output is less "critical" because it is one probability among 346. A rigorous BSI reviewer (Nick) will read it as an attempt to dilute the melanoma output's criticality and will push back: a probability value for melanoma inside an output distribution is still a melanoma detection output; its statistical weight within the distribution does not change the criticality of the condition. The argument is also unnecessary — IIb is already correctly obtained via the melanoma carve-out in the matrix analysis above.
Fix: retain the IMDRF 5.2.2 requirement-by-requirement table (lines 175–181) — it is thorough and genuinely correct for the 345 non-melanoma categories. But scope the 5.2.2 analysis explicitly and remove the "0.03%" framing. Insert a scope statement above the table:
"The following IMDRF 5.2.2 Serious analysis applies to the 345 ICD-11 categories in the indication that meet the Serious criteria. Melanoma is evaluated separately under IMDRF 5.2.1 Critical and is the basis for the Class IIb classification (see matrix above)."
Then rewrite the "Intervention is normally not expected to be time-critical..." row's clarification as: "Across the 345 categories to which this analysis applies, none is time-critical. Melanoma is evaluated under IMDRF 5.2.1 and is outside the scope of this row."
Cross-document consistency check (must run before submission)
The authoritative classification defense lives in r-tf-description-and-specification.mdx §"Guideline MDCG 2019-11". All other documents must say Class IIb, Rule 11 consistently and cross-reference that defense rather than re-deriving the classification with a different rationale:
- CEP (R-TF-015-001) — line 12 summary + lines 350–359 rationale (see Weakness 1)
- CER (R-TF-015-003) — every classification reference
- Declaration of Conformity (r-tf-001-007-eu-doc.mdx)
- STED (r-tf-sted.mdx)
- IFU (apps/eu-ifu-mdr/)
- Product labeling
- Risk Management File (R-TF-013-002)
- GSPR checklist
- Software classification and cybersecurity documentation
Any residue of "Class IIa" outside the matrix-analysis context (where IIa is correctly identified as the would-be cell for 345/346 categories), or any generic "significant potential for harm" rationale outside a cross-reference to the MDCG 2019-11 matrix, is an inter-document inconsistency that BSI will probe.
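The residue check can be partially automated. A hedged sketch, assuming a per-line scan with an allow-list of matrix-analysis context markers (the marker strings and scan logic are hypothetical, not existing QMS tooling):

```python
# Hypothetical scan for "Class IIa" residue outside the matrix-analysis context.
# ALLOWED_CONTEXT marks lines where IIa legitimately appears as the would-be
# cell for the 345 non-melanoma categories (marker strings are assumptions).
ALLOWED_CONTEXT = ("Category III.ii", "would-be cell", "MDCG 2019-11")

def find_residue(text: str, name: str = "<doc>"):
    """Return (doc, line number, line) for every unexplained 'Class IIa' hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if "Class IIa" in line and not any(m in line for m in ALLOWED_CONTEXT):
            hits.append((name, lineno, line.strip()))
    return hits

sample = (
    "The device is classified as Class IIa.\n"
    "Serious x Drives is Category III.ii (Class IIa), the would-be cell.\n"
)
print(find_residue(sample))  # flags line 1 only; line 2 is allow-listed
```

Every hit still needs human review — the scan narrows the search, it does not replace the consistency judgment.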
A.2.C3 — MDCG 2020-5 equivalence demonstration is load-bearing under Route A and is currently too thin
Why this is critical. Under Route A (see A.2.C4 — Plus is first-CE-marked under MDR as a new device; legacy stays on the market under Article 120(3)), legacy's clinical data — the pivotal investigations conducted with legacy, the Rank 7 passive PMS corpus, and the Rank 4 legacy RWE study R-TF-015-012 — enters Plus's clinical evaluation only through the Article 61(5)–(6) equivalence route governed by MDCG 2020-5. The equivalence demonstration is therefore not a formality; it is the load-bearing wall of Plus's clinical evidence strategy:
- If BSI accepts equivalence → legacy data counts as equivalent-device evidence for Plus → the evidence portfolio holds together as planned.
- If BSI rejects or partially accepts equivalence → only Plus-specific pre-market data supports Plus → the Pillar 3 base effectively collapses to zero, because the pivotal investigations and PMS corpus were all conducted with legacy. Plus's application fails.
The equivalence demonstration must therefore be rigorous, structured, and visibly defended in the CEP itself — not deferred entirely to the CER. Lines 137–139 currently say: "a deep equivalence analysis have been carried out between both devices and the results and conclusions of it are documented in R-TF-015-003 Clinical Evaluation Report, section 16.6. ... Biological equivalence is not applicable as the device is a software-only medical device with no contact with the human body." This is insufficient as a plan-level treatment of an argument this load-bearing.
What MDCG 2020-5 requires (§A2.1 and its table templates). Three characteristic families, each with a characteristic-by-characteristic comparison:
- Technical characteristics — same design; same conditions of use; same specifications and properties (including, for software, AI model architecture and weights, training data, output space, deployment method, operating environment, integration interfaces); same principles of operation; same critical performance requirements. Each Plus-vs-legacy difference must be tabulated with a per-difference clinical-relevance assessment, not simply asserted as "no clinical impact" at the aggregate level.
- Biological characteristics — same materials / substances in contact with the body. N/A for software-only devices; one-line justification is sufficient (the current CEP handles this correctly).
- Clinical characteristics — same clinical condition (incl. severity and stage); same site of use; same intended population (age, anatomy, physiology); same kind of user; similar critical clinical performance for the intended purpose, with any residual differences assessed for clinical significance.
MDR Article 61(5)(b) also requires the manufacturer to have "sufficient levels of access to the data" of the equivalent device. Trivially satisfied here (same manufacturer for legacy and Plus), but must still be stated.
Confirmed context (per the manufacturer, 2026-04-19): all Plus-vs-legacy differences are Plus-only — legacy's production architecture has not been updated (no microservices, no FHIR, no encryption upgrades pushed to legacy). This is the ideal setup for a clean MDCG 2020-5 demonstration: legacy is a fixed reference point, and Plus diverges from it by a specifiable set of additions. Importantly, this also means legacy's Article 120(3) protection is clean — no changes to legacy, no MDCG 2020-3 significant-change assessment to defend.
Specific deficiencies in the current CEP
- No inline equivalence table. The CEP commits to the existence of a detailed analysis in the CER but does not summarise it. A CEP evaluated in isolation cannot verify that the methodology is sound.
- The Plus-vs-legacy difference list is underspecified. Line 371 enumerates only three deployment-architecture differences: "migration to microservices, HL7 FHIR implementation, database encryption upgrades." Several other known differences are unaddressed:
  - AI model weights, versions, and training-data lineage — Plus and legacy must either share identical deployed weights (trivial equivalence at the algorithmic layer) or have documented divergence with a per-model clinical-relevance assessment.
  - Novel features claimed elsewhere as Plus-only — specifically the six binary malignancy-surfacing safety indicators and the P₂=1 architectural severity-prioritisation constraint, cited as novel features in r-tf-description-and-specification.mdx. Novel features are by definition not equivalent to anything in legacy; they require Plus-specific pre-market evidence and cannot be carried by the equivalence argument.
  - Severity scales and algorithms — APASI, AUAS, AIHS4, SCORAD: for each scale, is the Plus implementation identical to legacy's, or added / retrained / recalibrated? Each is a separate clinical output.
  - ICD-11 category coverage — Plus claims 346 categories (line 683). Legacy's claimed set must be stated; any expansion in Plus needs independent evidence.
  - DIQA thresholds and image-acquisition-pipeline calibration — any difference has direct implications for whether legacy's performance evidence transfers to Plus.
- No per-difference clinical-relevance assessment. Equivalence is asserted at the aggregate level; per-row clinical judgments are not visible.
- "Same Critical Performance Requirements" is asserted in the CER (lines 1259–1262 per the prior CER review) without citation to legacy's performance figures. Pointing to the MDD Technical File "held by the manufacturer" is not sufficient audit evidence — the figures must be summarised in Plus's CER.
- Article 61(5)(b) access condition is not explicitly stated.
- The distinction between "equivalence with legacy" and "novel Plus features" is blurred. The current CEP frames Plus as essentially-equivalent-to-legacy with minor architectural updates; elsewhere the technical file describes Plus as having novel features legacy does not. These framings must be reconciled: Plus = (legacy core, equivalence-covered) + (novel Plus features, requiring Plus-specific evidence).
What to do
Step 1 — Enumerate the complete legacy→Plus change set, categorised into three buckets
- Bucket A — architectural / deployment refactor with no clinical pathway (microservices migration, HL7 FHIR interoperability, database encryption at rest, logging/monitoring). No direct clinical impact; document as equivalent with brief justification.
- Bucket B — algorithmically equivalent features deployed differently (same AI model weights served via a different runtime, same output schema rendered differently, same thresholds re-implemented). Requires engineering verification (bit-for-bit output parity testing against legacy) and a clinical-relevance assessment confirming no functional divergence.
- Bucket C — novel Plus features, NOT covered by equivalence (the six binary malignancy-surfacing safety indicators, the P₂=1 severity-prioritisation constraint, any severity scale added in Plus that legacy did not have, any ICD-11 categories in Plus's indication set that legacy did not claim). Each requires Plus-specific pre-market evidence.
The current line-371 framing collapses all differences into Bucket A, which is both inaccurate (Bucket C items exist per the technical file) and strategically fragile — if BSI catches a claimed-novel feature elsewhere that was silently absorbed into the equivalence claim, the entire equivalence argument is weakened.
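The bucket categorisation can be captured as a simple record structure so that a missing per-difference assessment, or a Bucket C item silently riding on equivalence, is machine-detectable. A sketch under stated assumptions (record fields, bucket codes, and the example rows are illustrative):

```python
from dataclasses import dataclass

# Illustrative record for one legacy->Plus difference in the MDCG 2020-5 table.
# Bucket A: architectural, no clinical pathway; B: algorithmically equivalent,
# needs parity evidence; C: novel, needs Plus-specific pre-market evidence.
@dataclass
class Difference:
    characteristic: str
    bucket: str             # "A", "B", or "C"
    clinical_relevance: str # per-difference assessment; must not be empty
    evidence_source: str

def validate(diffs):
    """Flag rows missing a clinical-relevance assessment, and Bucket C rows
    whose evidence source is not explicitly Plus-specific."""
    problems = []
    for d in diffs:
        if not d.clinical_relevance:
            problems.append(f"{d.characteristic}: missing clinical-relevance assessment")
        if d.bucket == "C" and "Plus-specific" not in d.evidence_source:
            problems.append(f"{d.characteristic}: Bucket C without Plus-specific evidence")
    return problems

diffs = [
    Difference("Microservices migration", "A", "No clinical output change",
               "Software Architecture Description"),
    Difference("Malignancy-surfacing indicators", "C", "Novel safety feature",
               "Plus-specific V&V"),
    Difference("DIQA thresholds", "B", "", "DIQA V&V"),
]
print(validate(diffs))  # flags the missing DIQA assessment only
```

The point of the structure is the invariant, not the tooling: every row carries its own bucket and its own clinical judgment, so aggregate-level "no clinical impact" assertions become impossible to write.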
Step 2 — Inline a summary equivalence table in the CEP
Place a table of this shape near line 137 (summary in the CEP; detailed per-row justification remains in the CER):
| Characteristic family | Characteristic | Legacy value | Plus value | Bucket | Clinical relevance | Evidence source |
|---|---|---|---|---|---|---|
| Technical | AI model — ICD-11 classifier | ViT weights v1.x | ViT weights v1.x (frozen) | B | Not relevant (output parity verified) | V&V report |
| Technical | Software architecture | Monolithic | Microservices | A | Not relevant (no clinical output change) | Software Architecture Description |
| Technical | Data-exchange format | Proprietary JSON | HL7 FHIR | A | Not relevant (format only; same payload) | Interoperability V&V |
| Technical | Encryption at rest | — | AES-256 | A | Not relevant (security, not clinical) | Cybersecurity record |
| Technical | Malignancy-surfacing safety indicators | Not present | Six binary indicators + P₂=1 | C | Novel; requires Plus-specific evidence | Plus-specific V&V + risk controls |
| Technical | Severity scales | [legacy set] | [Plus set] | Mixed A/B/C per-scale | Per-scale assessment | Per-scale V&V |
| Technical | DIQA thresholds | v.x | v.x unchanged / v.y recalibrated | B or C | Per-threshold assessment | DIQA V&V |
| Biological | All | N/A (software only) | N/A | — | — | Justification statement |
| Clinical | Intended clinical condition | Visible dermatological conditions | Visible dermatological conditions | Same | Equivalent | Intended-Purpose record |
| Clinical | Intended patient population | [legacy scope] | [Plus scope] | Assess | Per-subgroup relevance | Intended-Use record |
| Clinical | Intended user | HCPs (PCPs + dermatologists) | HCPs (PCPs + dermatologists) | Same | Equivalent | — |
| Clinical | Site of use | Primary care + dermatology | Primary care + dermatology | Same | Equivalent | — |
| Clinical | Critical clinical performance | [legacy figures with study citation] | [Plus figures where distinct] | Assess | Non-inferiority / equivalence per metric | CER §[X.Y] |
Step 3 — Explicitly carve out novel Plus features
Insert a dedicated section in both CEP and CER titled "Plus features not covered by equivalence — Plus-specific evidence requirement", listing every Bucket C item with its evidence strategy:
- Six binary malignancy-surfacing safety indicators → Plus-specific V&V + risk controls per RMF
- P₂=1 architectural severity-prioritisation constraint → Plus-specific architectural testing
- Any Plus-only severity scales → per-scale validation evidence
- Any Plus-only ICD-11 categories → per-category clinical-performance evidence
This prevents BSI from catching a claimed-novel feature elsewhere in the technical file and treating it as an undisclosed gap in the equivalence analysis.
Step 4 — State the Article 61(5)(b) access condition
One sentence in the CEP and CER: "The manufacturer of Plus is also the manufacturer of the legacy device and therefore has full access to legacy's technical documentation, clinical investigation records, and post-market surveillance data, satisfying MDR Article 61(5)(b)."
Step 5 — Summarise legacy's Critical Performance Requirements with citations
Replace the CER's "Same" assertion for Critical Performance Requirements with legacy's actual performance figures (AUC, sensitivity, specificity for melanoma detection, etc.) citing the specific legacy study reports. Pointing to the MDD Technical File as a container is not audit evidence.
Cross-document consistency
Once the CEP summary table and bucket categorisation are published, verify:
- CER §16.6 detailed equivalence analysis matches the CEP's bucket categorisation.
- Novel-features section in r-tf-description-and-specification.mdx lists the same Bucket C items.
- Software Architecture Description, Risk Management File, and V&V records support the Bucket assignments (especially engineering parity testing for Bucket B items).
- Legacy's performance figures cited in the CER are sourced from specific legacy study reports, not from unspecified "MDD Technical File" pointers.
A.2.C4 — Regulatory pathway resolved (Route A): Plus is first-CE-marked under MDR as a new device; legacy stays under Article 120(3); MDCG 2020-6 is not primary guidance
The original inconsistency. CEP lines 31, 376, 676, 888 describe Plus as "first CE-marking under MDR" / "1st commercialisation" (i.e., a new MDR device). Line 789 describes MDCG 2020-6 as "the primary MDR-era guidance for legacy device clinical evaluation" (i.e., the legacy-device transition route). These framings are mutually exclusive — Plus cannot be both a first-CE-marking new device and a legacy device transitioning to MDR.
Resolved framing (confirmed by the manufacturer, 2026-04-19). Plus is a separate NEW device being first-CE-marked under MDR. Legacy Legit.Health remains a distinct product in the portfolio, kept on the market under MDR Article 120(3) transition provisions. All Plus-vs-legacy differences are Plus-only deployments; legacy's production architecture has not been updated. This is "Route A" — Plus as a new MDR device with an equivalence claim to legacy under Article 61(5)–(6).
Implications of Route A for the regulatory pathway
- Plus is first-CE-marked as Class IIb under MDR. Article 120 does not apply to Plus — Plus has never held an MDD certificate.
- Legacy (MDD Class I, CE-marked 2020) remains on market under Article 120(3). Because legacy has undergone no significant changes since MDD CE marking, Article 120(3)(c) protection is clean. A standing significant-change assessment per MDCG 2020-3 should be filed as a QMS record to document this.
- Primary clinical-evaluation guidance for Plus: MDCG 2020-1 (clinical evaluation of MDSW — Pillars 1/2/3). Not MDCG 2020-6.
- Equivalence guidance for Plus's use of legacy data: MDCG 2020-5. Legacy's clinical data and PMS corpus enter Plus's CER via Article 61(5)–(6) equivalence (see A.2.C3 for the load-bearing structural requirements).
- Process guidance: MEDDEV 2.7.1 Rev 4 (Stages 0–4), updated per MDCG 2020-6 Appendix I for MDR GSPR substitution.
- MDCG 2020-6 is referenced only as (i) the source of the Appendix III evidence-rank hierarchy used to tier the evidence portfolio, and (ii) the guidance under which legacy's Article 120 PMS data is appraised in the equivalence context. It is not the primary guidance for Plus's clinical evaluation.
- The Class I → Class IIb "jump" is not a significant change to defend for Plus, because Plus is a new device — not a reclassified legacy. The MDCG 2020-3 significant-change analysis applies only to legacy's Article 120 status (where it is trivially satisfied: legacy has not changed).
What this removes from the CEP
Three regulatory hurdles the CEP was unnecessarily carrying under the confused dual framing drop away under Route A:
- No need to defend an "up-classification significant change" for Plus per MDCG 2020-3.
- No need to invoke MDR 2023/607 transition-extension provisions for Plus.
- No need to treat Plus as a legacy device subject to MDCG 2020-6's narrower scope and evidence requirements.
Plus is simply a new Class IIb MDR device whose clinical evaluation follows MDCG 2020-1 and leverages legacy data through a rigorous MDCG 2020-5 equivalence demonstration.
Specific CEP edits required
- Lines 787–791 — rewrite the guidance-framework paragraph. Replace the current framing (which places MDCG 2020-6 as primary guidance for legacy devices) with:
  - MDCG 2020-1 as primary guidance for the clinical evaluation of Plus (MDSW three-pillar framework).
  - MEDDEV 2.7.1 Rev 4 as process template, with MDCG 2020-6 Appendix I used to map MDD Essential Requirements to MDR GSPRs.
  - MDCG 2020-5 as the framework for the equivalence claim to legacy.
  - MDCG 2020-6 referenced only for the Appendix III evidence-rank hierarchy and for the appraisal of legacy's PMS data within the equivalence context.
  - MDCG 2020-3 referenced only in the context of legacy's Article 120 status (not Plus's classification).
- Add a new early subsection in the CEP (placement: just before §"Clinical evaluation strategy and regulatory methodology" or equivalent) titled "Regulatory pathway for Plus and Article 120 status of the legacy device." State explicitly:
  - Plus is first-CE-marked under MDR as a new Class IIb device succeeding the legacy Legit.Health.
  - Legacy remains on market under MDR Article 120(3) transition provisions.
  - Legacy has undergone no significant changes since MDD CE marking; a significant-change assessment per MDCG 2020-3 is held in the QMS.
  - Legacy's clinical data and PMS corpus enter Plus's CER via Article 61(5)–(6) equivalence per MDCG 2020-5 (see the equivalence section, which replaces the deferred-to-CER treatment per A.2.C3).
  - All Plus-vs-legacy differences are Plus-only deployments; no changes have been pushed to legacy.
- Lines 31, 676, 888 — retain the "first CE-marking" language but add a one-phrase clarifier: "under MDR, as a new device succeeding the legacy Legit.Health, which remains on market under Article 120(3) transition provisions."
- Line 789 — delete the phrase "This is the primary MDR-era guidance for legacy device clinical evaluation." Replace with: "MDCG 2020-6 provides the evidence-rank hierarchy (Appendix III) used throughout this evaluation, and supports the appraisal of legacy's post-market surveillance data within the equivalence framework."
Cross-document consistency
The Route A framing must be consistent across:
- CEP §"Regulatory pathway" (new subsection per edit 2 above)
- CER §Scope and §Regulatory framework
- Declaration of Conformity (r-tf-001-007-eu-doc.mdx)
- STED (r-tf-sted.mdx)
- IFU (apps/eu-ifu-mdr/)
- Legacy significant-change assessment record (MDCG 2020-3) — standing QMS record
- PMS Plan/Report for Plus (distinct from legacy's continuing PMS under Article 120)
A.2.C5 — Autoimmune and genodermatoses §6.5(e) gap coverage: triangulated-evidence strategy (keeps indications, closes the §6.4 vulnerability)
The problem. CEP lines 900–908 declare autoimmune diseases (Gap 4) and genodermatoses (Gap 5) as "acceptable gaps per MDCG 2020-6 §6.5(e)" and address them through "passive surveillance". Two defects:
- §6.4 violation. Passive surveillance cannot fill pre-certification evidence gaps. MDCG 2020-6 §6.4 is explicit that clinical evidence must be sufficient prior to CE marking.
- §6.5(e) over-use pattern. Five §6.5(e) declarations already in the CEP (autoimmune, genodermatoses, Fitzpatrick V–VI, Pillar 3 severity assessment, paediatric) weaken every individual declaration by association — reviewers read a five-gap pattern as indication over-ambition.
Strategic decision (2026-04-19): the manufacturer will NOT narrow the indication to exclude autoimmune or genodermatoses. Instead, a triangulated-evidence package will be generated pre-certification that re-frames the argument from "§6.5(e) acceptable gap" to "sufficient clinical evidence per MDCG 2020-6 §6.3 via multi-source triangulation, with PMCF post-market confirmation."
Why fast MRMC alone would not close §6.4
Three layered reasons MRMC-alone is insufficient (and why the triangulated package is needed):
- Rank hierarchy. MRMC is Rank 11 per MDCG 2020-6 Appendix III. The CEP (line 801) and CER (§890) already acknowledge that MRMC "is not clinical data under the strict MDR Article 2(48) definition." Closing a Clinical Performance gap with evidence the document says is not clinical data is self-defeating.
- Erin/Nick Round 1 precedent. BSI clinical reviewers Erin Preiss and Nick pushed back on MRMC-as-primary for dark phototypes (A.2.C6). The same logic applies uniformly — Rank 11 cannot substitute for Pillar 3 real-patient evidence on any named sub-indication.
- §6.5(e) over-use. Adding an MRMC-backed sixth §6.5(e) declaration would reinforce the over-use pattern rather than resolving it.
Triangulated-evidence strategy
Re-frame the argument for autoimmune and genodermatoses from "Pillar 3 gap → MRMC fills it" (weak) to "sufficient clinical evidence under MDCG 2020-6 §6.3 via multiple rank-appropriate sources" (defensible). The package has five ingredients:
Pillar 1 — Valid Clinical Association (literature). Targeted SotA literature review (≥10 references per category group) establishing that image-based clinical recognition of autoimmune dermatoses and genodermatoses is an accepted standard in dermatology. Appended to R-TF-015-011.
Pillar 2 — Technical Performance (algorithm V&V). Per-category algorithm performance metrics (sensitivity, specificity, AUC) extracted from the existing curated image-labelling dataset, filtered for autoimmune and genodermatoses examples. Surfaced in the CER as a named Technical Performance analysis for these sub-indications.
Rank 7 — Legacy passive PMS data (Article 61(5)–(6) equivalence). Filter the legacy 4,500+ report corpus for autoimmune / genodermatoses presentations. Summarise case volumes, complaint rates, HCP feedback. Rank 7 real-world clinical evidence via the MDCG 2020-5 equivalence route (see A.2.C3, A.2.C4).
Rank 11 — Fast focused MRMC. A dedicated MRMC study on autoimmune and genodermatoses images, with a pre-specified protocol. Contributes supporting Pillar 3 §4.4 evidence: intended users (HCPs) achieve clinically relevant outputs on images representative of these sub-populations. Additive, not load-bearing.
PMCF with pre-specified thresholds. A pre-specified PMCF activity committing to post-market prospective collection of autoimmune and genodermatoses cases with pre-specified diagnostic-accuracy and user-concordance thresholds. This is PMCF confirming an adequately-evidenced base (permitted under §6.3), not filling a pre-cert gap (forbidden under §6.4). Wording matters: "confirms", "strengthens", never "fills" or "closes".
Narrowed CLAIM language (not narrowed INDICATION). Keep autoimmune and genodermatoses in the intended use. Frame the device's output for these specific sub-categories at a level the triangulated evidence supports: the device provides probability rankings within its broader ICD-11 output distribution, to be interpreted as supporting information within the HCP's differential diagnosis workup, with final diagnosis based on clinical and histopathological criteria per SotA. This is honest about what the evidence shows and does not remove anything from the indication.
§6.5(e) four-test reframing
Replace the current CEP/CER §6.5(e) declaration for these two gaps with an explicit four-test analysis:
- Is the gap narrow and bounded? Yes — ~4% of presentations combined (autoimmune 3% + genodermatoses 1%).
- Does the core benefit-risk conclusion depend on this gap evidence? No — the three declared benefits (7GH, 5RB, 3KX) are independently evidenced for the remaining 96% of presentations.
- Is there adequate residual evidence for the claim at these sub-categories? Yes — Pillar 1 literature + Pillar 2 algorithm V&V + Rank 7 legacy PMS + Rank 11 MRMC.
- Is PMCF planned to address the remaining uncertainty? Yes — PMCF Activity [A.X] with pre-specified enrolment targets and pre-specified diagnostic-accuracy / user-concordance thresholds.
The visible four-test structure distinguishes a defensible §6.5(e) declaration from a hand-wave.
Concrete execution plan
Five parallel work-streams, all executed in the new task folder `apps/qms/docs/bsi-non-conformities/clinical-review/round-1/task-3b5-autoimmune-genodermatoses-triangulation/`:
- Pillar 2 extraction. Filter the existing curated labelling dataset for autoimmune and genodermatoses examples. Compute per-category sensitivity, specificity, AUC. Produce a table for the CER.
- Legacy PMS filter. Query the legacy 4,500+ report corpus for autoimmune / genodermatoses presentations. Summarise volumes, complaints, HCP feedback. Produce a Rank 7 evidence summary.
- Literature review. Targeted SotA search for AI / image-based diagnostic recognition of autoimmune dermatoses and genodermatoses. Append to R-TF-015-011. Produce a Pillar 1 VCA summary.
- Fast focused MRMC. Pre-specify the protocol (CIP-style). Curate the image set (≥50–100 images per category group). Recruit reader HCPs (≥5–8). Execute the reader study. Report Rank 11 evidence with methodological rigour — a rushed MRMC with < 30 images per category or < 5 readers is worse than no MRMC and will be picked apart.
- CEP/CER edits. Rewrite lines 900–908 as the four-test §6.5(e) analysis backed by the four evidence sources. Pre-specify the PMCF activity with enrolment targets and thresholds. Add the narrowed-claim language to the IFU and to the Intended Purpose reusable (`packages/reusable/snippets/intendedPurpose.*`) for these sub-categories. Cross-reference the PMS/PMCF plan (R-TF-007-001, R-TF-007-002).
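The Pillar 2 extraction work-stream reduces to standard per-category confusion-matrix arithmetic plus a rank-based AUC. A minimal pure-Python sketch, assuming hypothetical binary label / prediction / score vectors already filtered to one category group — function names and data shapes are illustrative only, not the manufacturer's actual labelling-dataset schema or pipeline:

```python
def binary_metrics(labels, preds):
    """Sensitivity and specificity from binary ground truth (labels) and
    binary predictions (preds), both 0/1 sequences of equal length."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def auc(labels, scores):
    """AUC via the Mann-Whitney U statistic: fraction of positive/negative
    score pairs ranked correctly, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined with only one class present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Whatever tooling actually performs the extraction, the CER table should carry exactly these three quantities per category group, alongside the n per group, so the reviewer can reproduce them.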
Deliverables (produced in the new task folder):
- `evidence-package/pillar-1-literature.md` — literature review summary
- `evidence-package/pillar-2-algorithm-vv.md` — per-category V&V metrics table
- `evidence-package/rank-7-legacy-pms-filter.md` — legacy PMS filter summary
- `evidence-package/mrmc-protocol.md` — MRMC pre-specified protocol (CIP-style)
- `evidence-package/mrmc-results.md` — MRMC execution results
- `four-test-rewrite.md` — draft prose for the §6.5(e) four-test replacement in CEP and CER
- `pmcf-activity-spec.md` — pre-specified PMCF activity with thresholds
- `narrowed-claim-language.md` — proposed IFU/reusables wording
Work location: `apps/qms/docs/bsi-non-conformities/clinical-review/round-1/task-3b5-autoimmune-genodermatoses-triangulation/`. See that folder's CLAUDE.md for the detailed methodology and phase breakdown.
What NOT to do
- Do not run a fast MRMC and declare the gap closed by itself — BSI will reject.
- Do not add a sixth §6.5(e) declaration without the four-test structure.
- Do not frame PMCF as "filling" or "closing" the gap — PMCF "confirms" / "strengthens" an adequately-evidenced base.
- Do not under-design the MRMC because it's "just supporting" — methodological flaws in supporting evidence are worse than no supporting evidence.
- Do not narrow the indication. The strategy keeps autoimmune and genodermatoses in scope.
A.2.C6 — MRMC as primary evidence for dark-phototype and rare-disease claims
Line 818 (Tier 2 rare diseases primary evidence = BI_2024 MRMC + PH_2024 MRMC); line 839 (MAN_2025 MRMC as primary for Fitzpatrick V–VI Tier 3). The document correctly acknowledges that MRMC is Rank 11 and not clinical data under Article 2(48) — then contradicts itself by nominating MRMC as primary. Real-patient prospective evidence is required for both tiers, or the indications must be narrowed.
MAJOR
- M1 — MDCG 2020-1 Pillar 1 VCA: "per output" claim (line 127) but single SotA source for ~346 ICD-11 outputs; need VCA matrix or grouping methodology.
- M2 — CRIT1-7 appraisal threshold `>4` (line 729) is less than half the 0–10 scale, no rationale; needs sensitivity analysis.
- M3 — Literature search (lines 662–666) only 3 queries, all on the manufacturer's own name; no SotA search, no comparator-device search, no Embase/Cochrane.
- M4 — Acceptance criteria (lines 336–340, 873–882) defer derivation to two external documents; need at least one worked example per benefit group in-document.
- M5 — Legacy PMS data (line 922) presented as counts only ("21 contracts, 4,500+ reports, 7 non-serious, 0 serious, 0 FSCAs") — MDCG 2020-6 §6.3 explicitly says ratios alone are insufficient; appraisal methodology needed (denominators, hazard distribution, trend analysis).
- M6 — R-TF-015-012 Rank 4 classification (line 843) for a physician-recall cross-sectional survey is a stretch; likely Rank 8.
- M7 — Safety endpoints (lines 584–588) circular: "probability ≤ residual probability in RMF" — externally anchor to SotA baselines.
- M8 — Paediatric Clinical Performance planned as "exploratory" only (lines 823–825); GSPR obligation for "all age groups" claim not discharged.
- M9 — Evaluator team (lines 246–257): MEDDEV 2.7.1 §6.4 four-competence coverage underspecified; Martorell 17 vs. 10 years inconsistency.
- M10 — GSPR coverage limited to 1, 8, 17 (lines 22, 263, 274, 488) — missing GSPRs 3, 4, 9 (measuring function — severity scoring!), 14.1, 18, 22.
MINOR
- Terminology: "individual and residual risks" ambiguity (line 574)
- Rendered-component verification: `<WhatIs/>`, `<DeviceCharacterisation/>`, `<ManufacturerDetails/>`, `<IntendedPurpose/>`, `<NotUse/>` (lines 290, 294, 296, 304, 381) — verify PDF export against the MEDDEV 2.7.1 Annex A3 checklist before submission
- Count inconsistency: "8 pivotal" headline vs. 10 rows in confirmatory table
- Risk register count mismatch: 62 claimed vs. 63 rows; no category column
- MDCG 2020-5 used in two contexts inconsistently (lines 139, 495)
- MDCG 2023-3 in references but unused (line 174)
- AI Act (Regulation (EU) 2024/1689) overreach at line 757 — state as voluntary alignment not conformity basis
- Mermaid Stage 4 → Stage 0 loop has no narrative (lines 73–91)
- Single literature-search date (15 July 2025, line 660) with no refresh cadence
OBSERVATIONS (positive)
- Executive Summary well-structured (lines 5–31)
- Explicit MDCG 2020-1 three-pillar framing (lines 803–811)
- MRMC-as-Rank-11 acknowledgement (lines 801, 811, 864, 926)
- CRIT1-7 appraisal framework documented (lines 709–728)
- Tiered evidence structure Tier 1/2/3 (lines 815–821)
- PMCF gap linkage to benefit IDs (lines 900–908)
Part B — R-TF-015-003 Clinical Evaluation Report (CER)
B.1 — Audit-deliverable leaks (audit-deliverable-reviewer)
CRITICAL leaks (block submission)
B.1.C1 — Lines 37, 241: filesystem path `Investigation/man-2025/`
Two occurrences of a lowercase kebab-case repository directory reference inside MAN_2025 prose. The auditor has no such folder. Line 241 also includes "filed in the same Investigation folder as the source MRMC studies". Replace with R-TF-015-004 / R-TF-015-006 record IDs and prose description.
B.1.C2 — Lines 981, 1156, 1461: MDX filename `pms-study-report.mdx`
Three occurrences of a Docusaurus .mdx filename in prose. Replace with "companion study report annexed to R-TF-015-012".
B.1.C3 — Line 258: internal authoring-pipeline description
"rendered from the <Signature /> component on the CER page header (auto-populated from the QMS responsibility matrix and version control)" — discloses JSX/React/git machinery. Replace with regulatory-level description ("recorded in the signature block... consistent with the QMS responsibility matrix per GP-001 Annex 1").
HIGH
- Line 257 — `<ManufacturerDetails />` JSX tag in prose.
- Line 481 — "single-source data structure... rendered... propagated via the single-source component" — engineering single-source-of-truth pattern; reword to "consistency is ensured across CER, IFU, and RMF under change control."
- Line 259 — "(external — notified-body side)" internal parenthetical aside.
- Line 509 — `Legit.Health Plus_IFU` underscore filename + marketing phrase "help our client integrate".
MEDIUM
- Lines 1222–1223 — `- [x]` task-list checkboxes render as filled checkboxes in PDF; convert to definitive prose.
- Line 276 — Self-referential and grammatically confused: "the clinical evaluation plan and this document is made available in the CER".
- Line 98 — Stage 4 described only as "This document." — expand to a full sentence.
LOW
- Lines 493, 515 — Brand name "Legit.Health" where the convention is "the manufacturer"; "our webpage" is informal.
- Line 1843 — Access date "October 20, 2025" stale relative to 2026-04 CER issue date.
B.1 — Positive observations
- No BSI-workflow leaks (no ticket IDs, round references, task codenames)
- No Imagen/Gemini/AI-image-conversion disclosure for MAN_2025
- No internal team names in prose (Andy, Gerardo, Nick, Erin, Horiana, Celine)
- Filesystem boundary otherwise clean (no `apps/qms/`, `packages/`, `scripts/`, `/tmp/`)
- Record ID conventions followed (GP-xxx, R-TF-xxx-xxx)
- `<Signature />` present at line 2638
- Study codename path leak isolated to MAN_2025 (fixable surgically)
B.2 — BSI clinical auditor findings (Erin / Nick)
CRITICAL non-conformities
B.2.C1 — Pillar 3 Clinical Performance under-powered for 346 ICD-11 category scope
Lines 36, 218, 888, 1531, 2593–2595. Real-patient Pillar 3 base: MC_EVCDAO_2019 (n=105, melanoma-enriched, specialist, QUADAS-2 HIGH x2), COVIDX (n=160, primary endpoint CUS 7.66 vs. target 8 — not met), DAO_O (n=117 of 127 planned, single site), DAO_PH (n=131, primary comparative endpoint compromised by major protocol deviation), IDEI_2023 (mixed prospective/retrospective, 33% exclusion), AIHS4_2025 (n=2), NMSC_2025 (n=135, 80% malignancy prevalence, specialist H&N clinic).
Lines 2077 and 909 use MC_EVCDAO_2019's 0.85 AUC as "the global device AUC for melanoma" — one cannot generalise from a single specialist-enriched n=105 study. Either narrow the indications (autoimmune, vascular, paediatric, Fitzpatrick V–VI out of the primary claim; move to "investigated under PMCF") or generate pre-cert Pillar 3 evidence. PMCF cannot close this per MDCG 2020-6 §6.4.
B.2.C2 — MDCG 2020-6 §6.5(e) "acceptable gap" over-used
Lines 38, 172–173, 239, 533, 1206, 2186–2188, 2322, 2326–2328. Five gaps declared acceptable: autoimmune (3%), genodermatoses (1%), Fitzpatrick V–VI, severity assessment Pillar 3 across PASI/UAS/SCORAD (Gap 2), paediatric. Gap 2 specifically is not valid under §6.5(e) — Benefit 5RB cannot be claimed achieved pre-market while the Pillar 3 evidence for three of four severity scales is absent. Each retained gap needs the structured §6.5(e) test: (i) feasibility of pre-market evidence, (ii) independence from core benefit-risk conclusion, (iii) acceptable-risk bound.
B.2.C3 — R-TF-015-012 legacy RWE over-weighted as Rank 4
Lines 27, 875, 967–981, 1147, 1459–1513, 1531, 2565, 2571, 2577. The study is a cross-sectional retrospective-recall physician-survey (n=60, Google Form, 64.1% physician estimate without record consultation). The Appendix III "high quality surveys may also fall into this category" note does not elevate a 60-respondent physician-recall survey to Rank 4. Reclassify quantitative endpoints to Rank 8 (proactive PMS data / professional opinion) and remove the "confirmed in routine clinical practice" framing at lines 2565, 2571, 2577.
B.2.C4 — "Eight pivotal clinical investigations" miscategorises MRMC studies
Lines 27, 35, 1382–1383, 2151, 2155. MRMC studies (BI_2024, PH_2024, SAN_2024) are not clinical investigations per MDR Article 2(45) — the document acknowledges this correctly at line 890. Executive summary must say: "six prospective pivotal clinical investigations, three multi-reader multi-case simulated-use reader studies (Rank 11), one retrospective analysis of a third-party trial (AIHS4_2025, n=2)." Arithmetic: 719 real patients from prospective, not "over 800."
B.2.C5 — Class I → Class IIb equivalence justification inadequate
Lines 1226–1322. Technical equivalence asserted without enumerating: AI model weights per version, ICD-11 category coverage (~1000 vs. 346), novel six binary safety indicators (line 429), novel P₂=1 severity-prioritisation constraint (line 393), severity-scale additions (APASI/AUAS/AIHS4/SCORAD), DIQA threshold calibration. Line 1316 references legacy performance evidence "held in the legacy MDD Technical File" — must be summarised in the CER per Erin: "If the data is in another document but not referenced or summarised in the CER, it does not exist for my assessment."
MAJOR
- M1 — Acceptance criteria margins (lines 2258, 2267, 287): +12.2 percentage points from SotA 0.778 to target 0.90 for Multiple Malignant (pooled); 3–23 pp range across domains undocumented; potential ex-post calibration risk.
- M2 — COVIDX endpoint reconciliation: line 920 says "Met", line 925 says "did not achieve... mainly due to an outlier", line 2035 marks "❌ Not met" — three statements about the same endpoint.
- M3 — DAO_PH inconsistency: line 1634 regulatory table says "No significant deviations"; line 962 says "Major protocol deviation"; benefit 3KX(b) reduced to single-site evidence.
- M4 — Fitzpatrick V–VI: four incoherent positions (§6.5(e) gap; MAN_2025 Rank 11 strengthening; field-wide limitation per Tjiu/Lu 2025; mixed external evidence). Collapse to single position.
- M5 — PMCF per-activity methodology (lines 2498, 2510–2527) pointed to R-TF-007-002; must be inline in CER.
- M6 — Safety objectives table (lines 2413–2419): "0 cases" treated as safety achievement; apply rule-of-three upper 95% bound per row.
- M7 — PMS zero-incidents over-cited (lines 29, 1152, 1440–1451); MDCG 2020-6 §6.3 says ratios alone insufficient.
- M8 — Annex A7.3 PPV/NPV (lines 2288–2296): melanoma only from MC_EVCDAO_2019; extend to BCC/cSCC setting × condition matrix.
- M9 — Performance Claims external doc (line 2302): 148 claims pointed externally; inline as CER annex.
- M10 — Evaluator qualifications (lines 2627–2634): only Dr. Martorell has medical knowledge; three others "Limited"; 346 indications need broader clinical coverage (dermatopathology, paediatric derm).
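The rule-of-three bound requested in M6 is simple enough to compute per table row. A hedged sketch (function names are illustrative; the denominators used in the test are examples, not the actual per-row counts): with 0 events observed in n independent opportunities, the approximate one-sided 95% upper confidence bound on the event rate is 3/n, the well-known approximation of the exact bound 1 − 0.05^(1/n):

```python
def rule_of_three_upper(n: int) -> float:
    """Approximate one-sided 95% upper bound on the event probability
    after observing 0 events in n opportunities."""
    return 3.0 / n


def exact_upper(n: int, alpha: float = 0.05) -> float:
    """Exact one-sided upper bound: solve (1 - p)**n = alpha for p."""
    return 1.0 - alpha ** (1.0 / n)
```

For illustration, zero serious incidents across 4,500 reports bounds the serious-incident rate at roughly 3/4500 ≈ 0.067% at the 95% level — a non-trivial residual bound that a bare "0 cases" row does not convey, which is the point of M6.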
MINOR
- Novel P₂=1 constraint (lines 391–393) needs dedicated evidence subsection
- Acronym table (lines 61–84) missing CCO, MRMC, DIQA, CUS, ICC, MCID, PCP, ViT, FHIR, EMR, HCP, ITP
- Line 14 typo: "MEDDEV guideline 2. 7/1 revision 4"
- Lines 129, 1631: unicode `\u00f3` literal rendering
- Lines 1867–1882: section concludes "no residual risks" while the document identifies 8 + 3 acceptable gaps + paediatric gap + F1 safety signal (31.7% misleading output rate at line 1509)
- Line 2157: "24 independent expert dermatologists" likely reader-assignments not unique persons
- Lines 1825, 1914, 2537: SUS 82.5 human-factors data blurred into clinical performance
- Line 2247: CUS ≥ 8 threshold cross-scale heuristic (SUS 70.1 → 6.8/10), not formal derivation
- Line 29: "500 practitioners / 1,000 patients" lacks as-of date and counting methodology
- Lines 2599–2607: annual update cadence unjustified for AI/ML drift
OBSERVATIONS (positive)
- "How to read this CER" traceability chain (lines 86–176)
- API-output vs. physician-interface framing (lines 417–429) with Top-1/Top-3/Top-5 rationale
- MDCG 2020-1 three-pillar + Rank 11 MRMC framing correct (line 890)
- Six comparator pathways enumerated (lines 823–834)
- QUADAS-2 / MINORS design-appropriate application (lines 1047–1143); HIGH bias ratings disclosed
- MDCG 2020-13 I/J/K not-applicable statements clean (lines 184–188)
Consolidated priority remediation order
Must fix before any export to BSI
- CEP A.1.C1 — Fix corrupted MDX `\_..._*` patterns (lines 808, 849, 853) and restore `APASI_2025` / `NMSC_2025` underscores
- CEP A.1.C2 — Reconcile commercialisation-status contradiction at line 286
- CER B.1.C1 — Remove `Investigation/man-2025/` path leak (lines 37, 241)
- CER B.1.C2 — Remove `pms-study-report.mdx` filename leak (lines 981, 1156, 1461)
- CER B.1.C3, H1 — Remove `<Signature />` and `<ManufacturerDetails />` JSX disclosures (lines 257, 258)
- CEP A.1.C3 — Strip "AI Labs Group S.L." from the line 875 study objective (QMS style rule — legal entity already in `<ManufacturerDetails />`). The line 665 literature search query is fine as-is.
Must fix for Round 2 defensibility (Erin/Nick would not close)
- CEP A.2.C1 — Re-cast retrospective Plan as either versioned advance Plan or explicit legacy-transition framing
- CEP A.2.C2 — Classification is correctly IIb; fix three prose weaknesses: (i) replace the generic CEP rationale (lines 350–359) with a pointer to the authoritative MDCG 2019-11 matrix defense; (ii) remove the "err on the side of caution" voluntary-choice framing (lines 133, 185 of `r-tf-description-and-specification.mdx`); (iii) drop the "0.03% of categories" argument (line 179) and scope the IMDRF 5.2.2 analysis to the 345 non-melanoma categories. Then run a cross-document consistency check across CEP / CER / DoC / STED / IFU / RMF / GSPR / labeling.
- CEP A.2.C3 — Equivalence is load-bearing under Route A (legacy data enters Plus's CER only via Article 61(5)–(6) equivalence). Enumerate every Plus-vs-legacy difference into Bucket A (architectural/deployment, clinically irrelevant), Bucket B (same algorithm, verified output parity), or Bucket C (novel Plus features requiring Plus-specific evidence — at minimum: six malignancy-surfacing safety indicators, P₂=1 constraint, any Plus-only severity scales, any Plus-only ICD-11 categories). Inline a summary MDCG 2020-5 equivalence table in the CEP with a per-difference clinical-relevance assessment. Add a dedicated "Plus features not covered by equivalence" section. State the Article 61(5)(b) access condition explicitly. Replace the CER's "Same" Critical Performance Requirements assertion with citations to legacy study reports.
- CEP A.2.C4 — Route A is the confirmed regulatory pathway: Plus is first-CE-marked under MDR as a new device; legacy stays under Article 120(3) with clean MDCG 2020-3 significant-change status (no changes pushed to legacy). Rewrite CEP lines 787–791 so MDCG 2020-1 is primary guidance, MEDDEV 2.7.1 Rev 4 is process template, MDCG 2020-5 is equivalence framework, and MDCG 2020-6 is only referenced for Appendix III evidence-rank hierarchy. Add a new "Regulatory pathway" subsection making Route A explicit. Delete line 789's "primary MDR-era guidance for legacy device clinical evaluation" phrase. Add clarifiers to lines 31, 676, 888. File a standing MDCG 2020-3 significant-change assessment for legacy as a QMS record.
- CEP A.2.C5 / CER B.2.C2 — Autoimmune / genodermatoses: keep indications; build the triangulated-evidence package (Pillar 1 literature + Pillar 2 algorithm V&V + Rank 7 legacy PMS + Rank 11 focused MRMC + pre-specified PMCF with thresholds + narrowed claim language). Rewrite the §6.5(e) declaration as an explicit four-test analysis. Work location: `task-3b5-autoimmune-genodermatoses-triangulation/`. Fitzpatrick V–VI addressed separately per A.2.C6 (MAN_2025 MRMC plus real-patient evidence or narrowed scope).
- CEP A.2.C6 / CER B.2.C1 — Resolve the MRMC-as-primary contradiction; generate real-patient evidence or narrow scope
- CER B.2.C3 — Reclassify R-TF-015-012 from Rank 4 to Rank 8; remove "confirmed in routine clinical practice" framing
- CER B.2.C4 — Replace "eight pivotal clinical investigations" with accurate six + three MRMC + one retrospective compound description
- CER B.2.C5 — Produce legacy → Plus per-change table (model weights, ICD coverage, novel safety indicators, P₂=1 constraint, severity scales, DIQA calibration)
Should fix to reduce follow-up questions
- CEP Major findings M1–M10
- CER Major findings M1–M10
Polish
- CEP Medium/Low findings (date formats, grammar, unit consistency)
- CER Medium/Low findings (checkbox rendering, brand-name usage, stale dates)
Cross-document traceability observations
- CEP and CER both treat MAN_2025 as closing the Fitzpatrick V–VI gap but inconsistently (CEP: primary evidence Tier 3; CER: four conflicting positions).
- CEP claims "8 pivotal investigations" — CER inherits the same miscount.
- CEP acceptance-criteria derivation pointed to R-TF-015-011; CER pointed to R-TF-015-011 + R-TF-015-003 — circular.
- Gap numbering in CEP (Gap 1–5) does not match Gap numbering in CER (Gaps 1–5 but different content mapping at lines 2322/2326/2328).
- Legacy device framing: CEP says MDD Class I; CER claims equivalence at Critical Performance Requirements level without per-version evidence.
Agent run metadata
| Agent | Target | Runtime |
|---|---|---|
| audit-deliverable-reviewer | CEP | 230 s, 16 tool uses |
| audit-deliverable-reviewer | CER | 233 s, 54 tool uses |
| bsi-clinical-auditor | CEP | 336 s, 13 tool uses |
| bsi-clinical-auditor | CER | 379 s, 14 tool uses |