Research and planning
This page is for internal planning only. It will not be included in the final response to BSI.
Analysis
What BSI is asking: The CER's device description and intended purpose sections are not sufficiently clear or comprehensive to serve as a standalone document. BSI's clinical reviewer (Erin Preiss) highlights three specific gaps:
- What is the list of possible ICD categories relevant to the device? — BSI wants to see the complete list (or at least a structured summary) of the ICD-11 categories the device covers.
- What are the specific indications, including the comprehensive list of malignant/high-risk diseases? — BSI wants explicit enumeration of which malignant and high-risk conditions fall within the device's scope, not a generic statement about "diseases of the skin."
- What are the specific outputs of the device? — BSI wants a clear, unambiguous description of what the device actually produces (probability distributions, severity scores, clinical sign measurements, etc.).
Critical context from BSI (Item 2 index.mdx): "The Clinical Evaluation and the Clinical Evaluation Report should be considered standalone, and may be used in future reviews without supporting documents, therefore, there is a necessity for clear description of the following components within the CER."
What regulations are at stake
- MDR Annex II, Section 1.1: Requires "a general description of the device including its intended purpose and intended users; the intended patient population and medical conditions to be diagnosed, treated and/or monitored and other considerations such as patient selection criteria, indications, contra-indications, warnings; principles of operation of the device and its mode of action, scientifically demonstrated if necessary", all presented "in a clear and unambiguous manner."
- MDR Annex XIV, Part A, Section 1(a), sub-bullets 2 and 3: The Clinical Evaluation Plan must include "a clear specification of the intended purpose of the device, including the intended medical indication or indications and intended patient population or populations" and "a clear specification of the intended target groups and the intended clinical benefits to be achieved."
Underlying concern: BSI cannot understand the scope of the device from the CER alone. The CER uses shared React components (<IntendedPurpose />, <MedicalConditions />, <DeviceCharacterisation />) which render correctly in the built site but whose actual text content is not visible in the CER source document. More fundamentally, even the rendered content uses broad language ("diseases of the skin", "ICD-11 code 14") without specifying which specific conditions — particularly malignant ones — the device covers.
Relevant QMS documents and sections
| Document | Path | Relevance |
|---|---|---|
| R-TF-015-003 CER, "Device description" section | apps/qms/docs/.../R-TF-015-003-Clinical-Evaluation-Report.mdx (lines 170-189) | Uses <DeviceCharacterisation />, <IntendedPurpose />, <MedicalConditions />, <NotUse /> components. Renders correctly but does not contain explicit text in the CER source. BSI likely reviewed the rendered PDF/HTML and found it insufficiently specific. |
| R-TF-015-001 CEP, "Description" section | apps/qms/docs/.../R-TF-015-001-Clinical-Evaluation-Plan.mdx (line 228+) | Same shared components. Also uses generic language. |
| R-TF Device Description and Specification | apps/qms/docs/.../r-tf-description-and-specification.mdx | The most detailed device description document. States 239 ICD-11 categories (line 281), output specifications (lines 279-283), classification details. But it also relies on shared components for intended purpose and medical conditions. |
| R-TF STED | apps/qms/docs/.../r-tf-sted.mdx | Summary technical documentation. References same components. |
| R-TF-028-004 ICD-11 Mapping | apps/qms/docs/.../r-tf-028-004-data-annotation-instructions-icd-11-mapping.mdx | Documents the 239 ICD-11 category mapping methodology. Contains the formal process for mapping clinical labels to ICD-11 codes. References a master spreadsheet with all mappings. Does not contain the full list of 239 categories inline. |
| en.json (translations) | packages/reusable/translations/en.json | Contains the actual text rendered by shared components. Key content: intended purpose, medical conditions (ICD-11 code 14), indication for use, device outputs (probabilistic distribution, severity measurement). |
| IntendedPurpose.mdx (component) | packages/reusable/snippets/IntendedPurpose.mdx | Renders: intended use, device description, quantification capabilities (36 clinical signs listed), ICD recognition, medical indication, patient population, intended users, use environment, operating principle, body structures, explainability. |
| MedicalConditions.mdx (component) | packages/reusable/snippets/MedicalConditions.mdx | Renders: ICD-11 code 14 reference, probabilistic distribution explanation, indications for use (severity measurement + diagnostic support). |
| Clinical benefits table in CEP | apps/qms/docs/.../R-TF-015-001-Clinical-Evaluation-Plan.mdx (lines 281-289) | 7 clinical benefits with IDs (7GH, 3KX, 8PL, 1QF, 9VW, 5RB, 0ZC), means of measure, and magnitude of benefit claimed. Benefit 1QF specifically addresses malignancy detection (melanoma, malignant conditions). |
| Clinical benefits section in CER | apps/qms/docs/.../R-TF-015-003-Clinical-Evaluation-Report.mdx (line 309) | References "Performance Claims & Clinical Benefits" document. Lists 7 benefits by name. |
| M1.Q1 fyi: ICD distribution vs diagnosis | apps/qms/docs/bsi-non-conformities/technical-review/round-1/m1-diagnostic-function/q1-ifu-performance-claims/fyi/icd-distribution-vs-diagnosis.md | Essential reading. Catalogues 51 locations across 20+ documents where the "distribution not diagnosis" argument is made. Any CER update about device outputs must be consistent with this established position. |
| M1.Q1 fyi: Clinical evidence ICD distribution rationale | apps/qms/docs/bsi-non-conformities/technical-review/round-1/m1-diagnostic-function/q1-ifu-performance-claims/fyi/clinical-evidence-icd-distribution-rationale.md | Essential reading. Explains why performance claims are not condition-specific: the device always outputs a full distribution across all categories; clinical benefits measure HCP improvement, not device diagnosis. Uses "346" categories throughout (vs "239" in Device Description — reconciliation needed). Contains the device-vs-diagnostic-test comparison table that must be mirrored in any CER device output description. |
| M1.Q1 fyi: Question for Jordi | apps/qms/docs/bsi-non-conformities/technical-review/round-1/m1-diagnostic-function/q1-ifu-performance-claims/fyi/question-for-jordi.mdx | Flags two studies likely to surface in the clinical review: (1) MC_EVCDAO_2019 conducted with legacy device — needs bridging argument; (2) AIHS4 2025 with n=2 — undermines evidence credibility. Not directly relevant to Item 2a but will affect Items 2b, 3a, and 3b. |
| EU IFU MDR, clinical benefits | apps/eu-ifu-mdr/versioned_docs/version-1.1.0.0/clinical-user-manual/clinical-benefits-and-performance-claims.mdx | States: "The device always outputs a probability distribution across all validated ICD-11 categories for every image processed." |
| r-tf-description-and-specification.mdx output specs | apps/qms/docs/.../r-tf-description-and-specification.mdx (lines 279-283) | "ICD Categories: Probabilistic distribution across 239 ICD-11 categories; Clinical Signs: Quantitative measurements (intensity 0-10 scale, count, extent in cm² or %); Response Time: Typically < 5 seconds" |
Gap analysis
What we have
The information BSI is requesting exists in our documentation, but is scattered across multiple documents and shared components:
- ICD categories: The device covers 239 ICD-11 categories under code 14 (Diseases of the skin). This number is stated in R-TF Device Description (line 281) and the mapping methodology is documented in R-TF-028-004. The actual list of 239 categories lives in a master spreadsheet referenced by R-TF-028-004, and also in data files used by the AI development records.
- Device outputs are well-documented across several sources:
  - Probabilistic distribution across all 239 ICD-11 categories (en.json: "The device outputs a probabilistic distribution across all ICD-11 categories related to skin diseases")
  - Severity measurement: intensity (0-10 scale), count, and extent (cm² or %) of 36 clinical signs (IntendedPurpose.mdx lists all 36)
  - Explainability data: bounding boxes for count-based signs, masks for extent-based signs
  - Response time: typically < 5 seconds per image
  - The IFU clearly states: "The device always outputs a probability distribution across all validated ICD-11 categories for every image processed. The device does not diagnose specific conditions."
- Indications: The formal indication statement (en.json, intendedMedicalIndication.content) covers "all diseases of the skin incorporating conditions affecting the epidermis, its appendages and associated mucous membranes, the dermis, the cutaneous vasculature and the subcutaneous tissue."
- Malignancy-specific information: Benefit 1QF in the CEP specifically addresses "lesions suspicious for skin cancer" with detailed metrics for malignancy detection (AUC, sensitivity, specificity) and melanoma detection. The device description mentions classification as Class IIb specifically because of "its application in melanoma detection."
What's missing (BSI's concern is valid)
BSI's core complaint is that the CER is not standalone. Specifically:
- No list of ICD-11 categories in the CER. The CER says the device addresses "ICD-11 code 14 - Diseases of the skin" but never mentions the 239 ICD-11 categories. Even the rendered shared components only say "ICD-11 code 14" with a link to the WHO browser. An auditor reviewing the CER cannot see which specific conditions are covered without navigating to R-TF-028-004 or the data files.
- No specific enumeration of malignant/high-risk diseases. The CER mentions "melanoma" in passing (equivalence table, clinical studies) but never provides a structured list of which malignant conditions the device can recognize. BSI wants to know: which cancers? which precancerous conditions? The clinical studies reference "malignancy detection" and "AUC detecting malignancy" but don't define what "malignancy" means in this context (which ICD-11 categories are classified as malignant).
- Device outputs not structured clearly. The CER's device description section renders the shared <IntendedPurpose /> component, which includes output information. But the rendered text describes outputs narratively ("interpretative distribution representation of possible ICD categories") rather than with a structured, unambiguous specification (e.g., "Output 1: probability array across N categories; Output 2: severity score on 0-10 scale for M clinical signs").
- Generic indication statement. "All diseases of the skin" is technically accurate but unhelpfully broad for a clinical reviewer. BSI wants to understand the device's diagnostic scope: which diseases can it actually distinguish? This connects to the 239-category list.
Root cause: The CER was designed as part of a documentation ecosystem where shared components ensure consistency across documents. This is good engineering but bad for CER standalone readability. Additionally, the CER inherits the generic language of the shared components ("ICD-11 code 14") without adding CER-specific detail about the device's actual diagnostic scope.
Response strategy
Approach: Enhance the CER device description section with explicit, structured content
The CER needs to contain within its own text (not just via shared components) a clear, comprehensive device description that answers BSI's three questions. This means adding content to the CER while keeping the shared components for consistency.
Fixes required in R-TF-015-003 (CER)
The CER device description section (around lines 170-189) needs enhancement with the following additions:
- Structured device output specification: Add a clear subsection or table to the CER specifying:
  - Output 1: Probabilistic distribution across 239 validated ICD-11 categories (array of category-probability pairs)
  - Output 2: Quantitative measurements for 36 clinical signs: intensity (0-10 scale), count (integer with bounding boxes), extent (cm² or % with segmentation masks)
  - Note: the device does NOT output a diagnosis, recommendation, or binary result
  - Reference the IFU statement: "The device always outputs a probability distribution across all validated ICD-11 categories for every image processed."
- ICD-11 category summary: Add a subsection that:
  - States the total number of validated ICD-11 categories (239)
  - Provides a structured grouping of these categories (e.g., by ICD-11 chapter/block: inflammatory, neoplastic, infectious, etc.)
  - Explicitly lists the malignant/high-risk categories covered (e.g., melanoma, basal cell carcinoma, squamous cell carcinoma, actinic keratosis, etc.)
  - References R-TF-028-004 for the complete mapping methodology
- Indication clarification: Strengthen the indication statement by specifying that:
  - The device is indicated for use on images of visible skin abnormalities
  - The diagnostic scope covers 239 ICD-11 categories under code 14 (Diseases of the skin)
  - The device's two functions are: (a) probability distribution across ICD categories and (b) severity quantification of clinical signs
  - Specific mention of malignancy detection capability with reference to clinical benefit 1QF
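The structured output specification proposed above can be sketched as a small data model. This is only an illustration of the shape a CER table would describe: the class names, field names, the sum-to-one check, and the example values are assumptions made for this sketch, not the device's actual API or real output.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SignMeasurement:
    """One quantified clinical sign (hypothetical structure)."""
    sign: str
    intensity: Optional[float] = None   # 0-10 scale
    count: Optional[int] = None         # integer, explained by bounding boxes
    extent: Optional[float] = None      # explained by segmentation masks
    extent_unit: Optional[str] = None   # "cm2" or "%"

    def __post_init__(self) -> None:
        if self.intensity is not None and not 0 <= self.intensity <= 10:
            raise ValueError("intensity must be on the 0-10 scale")
        if self.count is not None and self.count < 0:
            raise ValueError("count must be non-negative")
        if self.extent is not None and self.extent_unit not in ("cm2", "%"):
            raise ValueError("extent needs a unit of 'cm2' or '%'")


@dataclass(frozen=True)
class DeviceOutput:
    """Output 1 and Output 2 only: there is deliberately no diagnosis field."""
    icd_distribution: Dict[str, float]          # category -> probability
    signs: Tuple[SignMeasurement, ...] = ()

    def __post_init__(self) -> None:
        probs = list(self.icd_distribution.values())
        if any(p < 0 or p > 1 for p in probs):
            raise ValueError("each probability must lie in [0, 1]")
        # Assumption: a probabilistic distribution sums to 1 across categories.
        if abs(sum(probs) - 1.0) > 1e-6:
            raise ValueError("probabilities must sum to 1")


# Illustrative values only; a real output would span every validated category.
example = DeviceOutput(
    icd_distribution={
        "Eczematous dermatitis": 0.7,
        "Melanoma": 0.2,
        "Placeholder category": 0.1,
    },
    signs=(
        SignMeasurement("erythema", intensity=4.0),
        SignMeasurement("pustule", count=3),
    ),
)
print(len(example.icd_distribution), "categories,", len(example.signs), "signs")
```

Leaving out any diagnosis or binary-result field mirrors the note that the device does not output a diagnosis, which keeps the sketch aligned with the "distribution not diagnosis" position.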
Consideration: Where does the 239-category list live?
The full list of 239 ICD-11 categories is maintained in the master spreadsheet referenced by R-TF-028-004. Options:
- Option A: Include the full 239-row table in the CER (makes it truly standalone but extremely long)
- Option B: Include a structured summary (grouped by ICD-11 block) with explicit listing of malignant/high-risk categories, and reference R-TF-028-004 for the complete list
- Option C: Include the full list as an annex to the CER
Recommendation: Option B. A 239-row table would overwhelm the CER. BSI's specific concern is about clarity of the malignant/high-risk categories and device scope — a structured summary with explicit malignancy listing addresses this directly. The full list can be referenced and provided as supplementary evidence.
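One way to produce the Option B summary is to derive it mechanically from an export of the master spreadsheet, so the grouped table and the malignant/high-risk list cannot drift from the source. The sketch below assumes a CSV export with hypothetical column names (`category`, `clinical_domain`, `high_risk`); the rows, domain assignments, and flags are placeholders that would need confirmation from JD-009 / JD-022.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extract: column names, domains and flags are placeholders,
# not real mapping data from the master spreadsheet.
SAMPLE_CSV = """category,clinical_domain,high_risk
Eczematous dermatitis,Inflammatory,no
Melanoma,Neoplastic,yes
Basal cell carcinoma,Neoplastic,yes
Squamous cell carcinoma,Neoplastic,yes
Actinic keratosis,Neoplastic,yes
"""


def summarise(csv_text):
    """Group categories by clinical domain and collect the flagged ones."""
    groups = defaultdict(list)
    high_risk = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["clinical_domain"]].append(row["category"])
        if row["high_risk"].strip().lower() == "yes":
            high_risk.append(row["category"])
    ordered = {domain: sorted(cats) for domain, cats in sorted(groups.items())}
    return ordered, sorted(high_risk)


groups, high_risk = summarise(SAMPLE_CSV)
for domain, cats in groups.items():
    print(f"{domain} ({len(cats)}): {', '.join(cats)}")
print("Malignant/high-risk:", ", ".join(high_risk))
```

Generating the summary from the spreadsheet rather than typing it by hand also makes it easy to check that the per-domain counts add up to the total stated in the CER.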
In the response to BSI
- ICD categories: State that the device covers 239 validated ICD-11 categories under code 14 (Diseases of the skin). Provide a structured summary grouped by clinical domain. Explicitly list the malignant/high-risk categories. Reference R-TF-028-004 for the complete mapping and R-TF Device Description for output specifications.
- Specific indications: Clarify that the indication is "all visible skin abnormalities" with the device covering 239 ICD-11 categories. Specifically address malignant conditions: the device outputs a probability distribution that includes malignant categories (melanoma, BCC, SCC, actinic keratosis, etc.). Clinical benefit 1QF specifically validates malignancy detection performance.
- Device outputs: Provide a structured description: (a) probability array across 239 ICD-11 categories, (b) quantitative severity measurements for 36 clinical signs (intensity 0-10, count with bounding boxes, extent in cm²/%), (c) explainability data. Emphasize: the device does NOT provide a binary diagnosis.
- State that the CER has been updated with explicit device description content to ensure standalone readability per BSI's guidance.
- Provide red-lined CER documentation.
Critical guardrails from M1.Q1 fyi documents: Before writing any CER updates, read the two fyi documents from the technical review:
- icd-distribution-vs-diagnosis.md: ensures we don't accidentally frame the device as a diagnostic tool in the CER
- clinical-evidence-icd-distribution-rationale.md: ensures we don't add condition-specific claims that contradict 51 other locations in the technical file
Any description of "specific indications" or "malignant conditions" must be framed as categories within the probability distribution that have been validated in clinical studies measuring HCP improvement — never as conditions the device "diagnoses" or "detects" independently.
Confidence level: Medium. BSI's concern is valid: the CER genuinely lacks specificity in its device description. The fix requires substantive content additions (not just rewording). The main risks are:
- We need to produce the actual list of malignant/high-risk ICD-11 categories from the master spreadsheet. This requires collaboration with JD-009 / JD-022 to confirm which of the 239 categories are considered malignant/high-risk.
- The 239-category scope is broad. BSI may question whether all 239 categories have been clinically validated. They have been, in the sense that the clinical studies cover the full distribution, but this needs to be articulated carefully to avoid opening new lines of questioning about per-category validation.
Key research findings
Finding 1: The device covers 239 ICD-11 categories
R-TF Device Description, line 281: "ICD Categories: Probabilistic distribution across 239 ICD-11 categories." These 239 categories are derived from the LegitHealth-DX dataset through the mapping process documented in R-TF-028-004. The mapping consolidates visually indistinguishable conditions into single "Visible ICD-11 category" targets (e.g., contact and atopic dermatitis merged into "Eczematous dermatitis").
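The consolidation step is a many-to-one mapping, which is why the number of source labels and the number of "Visible ICD-11 category" targets can differ. A minimal sketch follows, assuming a simple label-to-target dictionary; only the eczematous dermatitis merge comes from the text above, and the remaining labels are hypothetical placeholders.

```python
from collections import Counter

# Source label -> "Visible ICD-11 category" target.
# Only the first two rows reflect the example in the text; the rest are
# hypothetical placeholders, not entries from the master spreadsheet.
LABEL_TO_VISIBLE = {
    "Contact dermatitis": "Eczematous dermatitis",
    "Atopic dermatitis": "Eczematous dermatitis",
    "Placeholder label C": "Placeholder target X",
    "Placeholder label D": "Placeholder target X",
    "Placeholder label E": "Placeholder target Y",
}

labels_per_target = Counter(LABEL_TO_VISIBLE.values())
source_count = len(LABEL_TO_VISIBLE)
visible_count = len(labels_per_target)

# Targets that absorbed more than one source label.
merged = {target: n for target, n in labels_per_target.items() if n > 1}

print(f"{source_count} source labels -> {visible_count} visible categories")
print("Merged targets:", merged)
```

Running the same two counts over the real mapping would show directly whether consolidation accounts for the gap between the two category figures quoted in the technical file.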
Finding 2: Device outputs are well-defined but scattered
Three output types are documented:
- ICD probability distribution: Array of category-probability pairs across all 239 categories (en.json: "probabilistic distribution across all ICD-11 categories")
- Clinical sign measurements: Intensity (0-10), count (integer), extent (cm²/%) for 36 signs listed in IntendedPurpose.mdx (erythema, desquamation, induration, crusting, xerosis, swelling, oozing, excoriation, lichenification, exudation, wound depth, wound border, undermining, hair loss, necrotic tissue, granulation tissue, epithelialization, nodule, papule, pustule, cyst, comedone, abscess, hive, draining tunnel, non-draining tunnel, inflammatory lesion, exposed wound/bone, slough/biofilm, maceration, external material, hypopigmentation/depigmentation, hyperpigmentation, scar, scab, spot, blister)
- Explainability: Bounding boxes (count) and segmentation masks (extent)
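Because the sign count will be quoted in the CER, it is worth checking the list mechanically. The sketch below counts the entries exactly as transcribed in this finding, treating each comma-separated item as one sign (an assumption: slash-joined items such as "exposed wound/bone" are counted once). Read this way the list has 37 entries against the stated 36, so either the count or the transcription should be confirmed against IntendedPurpose.mdx.

```python
# Clinical signs exactly as transcribed in this finding.
SIGNS_AS_LISTED = """
erythema, desquamation, induration, crusting, xerosis, swelling, oozing,
excoriation, lichenification, exudation, wound depth, wound border,
undermining, hair loss, necrotic tissue, granulation tissue,
epithelialization, nodule, papule, pustule, cyst, comedone, abscess, hive,
draining tunnel, non-draining tunnel, inflammatory lesion,
exposed wound/bone, slough/biofilm, maceration, external material,
hypopigmentation/depigmentation, hyperpigmentation, scar, scab, spot, blister
"""

# Assumption: one comma-separated entry equals one clinical sign.
signs = [s.strip() for s in SIGNS_AS_LISTED.split(",") if s.strip()]
duplicates = sorted({s for s in signs if signs.count(s) > 1})

STATED_COUNT = 36
print(f"listed: {len(signs)}, stated: {STATED_COUNT}")
print("duplicates:", duplicates)
```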
Finding 3: Malignancy detection is a specific clinical benefit
Clinical benefit 1QF (CEP lines 286-294) specifically validates "lesions suspicious for skin cancer" with metrics for both general malignancy detection (AUC ≥ 90%, sensitivity 79%, specificity 87%) and melanoma specifically (AUC 85%, sensitivity 93%, specificity 80%, accuracy 81%). The CER also classifies the device as Class IIb specifically because of melanoma detection implications.
Finding 4: CER uses shared components — good for consistency, bad for standalone
The CER's device description section (lines 170-189) consists almost entirely of shared component calls: <ManufacturerDetails />, <DeviceCharacterisation />, <IntendedPurpose />, <NotUse />. These render identically across CER, CEP, IFU, and STED — ensuring consistency. However, they contain generic language suitable for all contexts, not the CER-specific detail BSI expects. The fix must add CER-specific content without duplicating or contradicting the shared components.
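A quick way to see how much of a section depends on shared components is to scan the MDX source for self-closing component tags and compare them with the lines of explicit prose. The sketch below is a rough heuristic under stated assumptions: the regular expression only matches simple tags without props, and the sample source is a hypothetical stand-in for the CER section, not its actual content.

```python
import re

# Matches simple self-closing component tags such as <IntendedPurpose />.
# Assumption: components are capitalised and take no props.
COMPONENT_TAG = re.compile(r"<([A-Z][A-Za-z0-9]*)\s*/>")


def standalone_report(mdx_source):
    """Count component-only lines versus lines carrying explicit prose."""
    lines = [ln.strip() for ln in mdx_source.splitlines() if ln.strip()]
    component_lines = [ln for ln in lines if COMPONENT_TAG.fullmatch(ln)]
    prose_lines = [
        ln for ln in lines
        if not COMPONENT_TAG.fullmatch(ln) and not ln.startswith("#")
    ]
    return {
        "components": COMPONENT_TAG.findall(mdx_source),
        "component_lines": len(component_lines),
        "prose_lines": len(prose_lines),
    }


# Hypothetical stand-in for the CER device description section.
SAMPLE_SECTION = """
## Device description

<ManufacturerDetails />
<DeviceCharacterisation />
<IntendedPurpose />
<NotUse />
"""

report = standalone_report(SAMPLE_SECTION)
print(report)
```

A section that reports zero prose lines contributes no text of its own, which is the situation the CER-specific additions are meant to correct.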
Cross-NC connections
Connection to Technical Review M1.Q1 (IFU Performance Claims) — CRITICAL
Item 2a (clinical review) and M1.Q1 (technical review) are asking the same underlying question from different angles: "What conditions does the device cover and what does it output?" M1.Q1 asks from the IFU user's perspective (GSPR 15.1/23.4); Item 2a asks from the CER standalone perspective (Annex II, Annex XIV). Both responses are going to the same notified body (BSI) and must be perfectly consistent. Any contradiction between the IFU and the CER will generate a new non-conformity.
Key alignment points:
- Device function vs clinical benefit distinction (M1.Q1 lines 76-93). M1.Q1 established a critical framing: the device has a uniform output mechanism (identical probability distribution across all validated ICD-11 categories for every image, no condition-specific mode) but context-dependent clinical benefits (HCPs derive more benefit in areas where their baseline accuracy is lower, e.g., rare diseases or malignancy detection). Item 2a's description of device outputs and indications must use the same framing. If we describe the device as having "specific indications" for malignant conditions, BSI will read that against the IFU's statement that there is no condition-specific device function and conclude we are contradicting ourselves.
- ICD-11 codes must NOT be tied to specific claims (M1.Q1 line 96). M1.Q1 explicitly decided against adding ICD-11 codes per performance claim because it would imply condition-specific performance, contradicting the device's regulatory position (distributional output per MDCG 2020-1). Item 2a's CER fix should describe the full set of ICD-11 categories but must NOT imply per-category diagnostic capability. The correct framing: the device produces a distribution across ALL categories; some categories include malignant conditions; clinical studies validate the HCP's improved accuracy when using this distributional output in specific clinical contexts.
- Number of ICD-11 categories: reconciliation needed. M1.Q1 research-and-planning line 82 refers to "346 validated ICD-11 categories." The Device Description (r-tf-description-and-specification.mdx, line 281) says "239 ICD-11 categories." These numbers must be reconciled before either response is finalized. Possible explanations: (a) 346 is the total ICD-11 codes mapped (including multiple codes per visible category), while 239 is the number of "Visible ICD-11 category" targets after consolidation (R-TF-028-004 merges visually indistinguishable conditions); (b) the number changed between documentation versions. Action required: confirm the correct number and ensure both the IFU and CER state the same figure.
- IFU fixes already completed in M1.Q1. M1.Q1 has already added to the IFU: (a) clinical studies reference section with full bibliographies, (b) clinical benefit glossary/legend table, (c) "How to Read the Performance Claims" section explaining the distributional output, (d) study-to-benefit cross-reference table, (e) clarifying text distinguishing device function from clinical benefit. The CER must be consistent with all of these. In particular, the CER's device output description should use the same language as the IFU's "How to Read the Performance Claims" section.
- Response tone alignment. M1.Q1's research notes that we should NOT cite MDR Rule 11 or MDCG 2020-1 proactively. Item 2a should follow the same principle: describe the device clearly without over-justifying the regulatory classification.
Impact on response strategy: When writing the CER updates for Item 2a, we must:
- Use the exact same ICD-11 category count as the IFU (after reconciliation)
- Describe device outputs consistently with the IFU's "How to Read the Performance Claims"
- Frame malignant conditions as categories within the distribution that have been specifically validated in clinical studies (benefit 1QF), not as separate device functions
- Cross-reference the IFU's updated Clinical Benefits section as evidence of GSPR 15.1 compliance
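The first requirement above, one category count everywhere, can be checked automatically before submission. The sketch below scans document text for phrases of the form "N ICD-11 categories" and reports every distinct figure with the documents that use it. The pattern and the in-memory snippets are assumptions for illustration; a real check would read the IFU and CER sources from disk.

```python
import re
from collections import defaultdict

# Matches "239 ICD-11 categories" and "346 validated ICD-11 categories".
# It deliberately ignores phrases such as "ICD-11 code 14".
COUNT_PATTERN = re.compile(r"(\d+)\s+(?:validated\s+)?ICD-11 categories")


def category_counts(docs):
    """Map each category count found to the set of documents stating it."""
    found = defaultdict(set)
    for name, text in docs.items():
        for match in COUNT_PATTERN.finditer(text):
            found[int(match.group(1))].add(name)
    return dict(found)


# Snippets quoted in this plan, standing in for the full documents.
docs = {
    "M1.Q1 research-and-planning": "346 validated ICD-11 categories",
    "r-tf-description-and-specification.mdx": (
        "Probabilistic distribution across 239 ICD-11 categories; "
        "medical conditions under ICD-11 code 14"
    ),
}

counts = category_counts(docs)
consistent = len(counts) <= 1
for number, names in sorted(counts.items()):
    print(number, "->", sorted(names))
print("consistent:", consistent)
```

With the two snippets quoted in this plan the check reports both figures and flags the inconsistency, which is the reconciliation that has to happen before either response is finalized.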
Connection to Item 2b (Clinical Benefits, Performance & Safety vs SotA)
Item 2a (this item) and Item 2b are two parts of the same deficiency. Item 2a asks about device description/intended purpose/indications; Item 2b asks about clinical benefits, performance outcomes, and SotA comparison. Both stem from BSI's inability to understand the device's scope from the CER alone. The fixes should be coordinated:
- Item 2a: defines WHAT the device does (outputs, ICD categories, indications)
- Item 2b: defines HOW WELL the device performs (clinical benefits, acceptance criteria, SotA comparison)
Connection to Item 3 (Clinical Data)
Item 3 asks about clinical data analysis and data sufficiency. BSI's ability to assess data sufficiency depends on understanding the device's scope (Item 2a). If Item 2a is not resolved first, Item 3 cannot be properly evaluated. The response to Item 2a should be written with awareness that Item 3 will build on it.
Potential weaknesses (BSI auditor perspective)
These concerns were identified through a critical review from the BSI auditor's perspective.
Medium risk: "239 categories" invites per-category validation questions
Stating that the device covers 239 ICD-11 categories may prompt BSI to ask: "Show me clinical validation data for each of these 239 categories." The clinical studies validate the device's overall diagnostic accuracy across the full distribution, not per-category. The response must frame this correctly: the device outputs a distribution across all categories for every image — it doesn't "diagnose" individual categories. Per-category performance is an AI development metric (documented in R-TF-028-005 AI Development Report), not a clinical validation metric. However, this distinction is subtle and BSI may push on it in Round 2.
Medium risk: Malignant category enumeration requires data extraction
We need the actual list of which ICD-11 categories within the 239 are classified as malignant/high-risk. This information exists in the master spreadsheet but may not be easily extractable without JD-009 input. The response cannot be vague — BSI specifically asked for "the comprehensive list of the specific malignant/high risk diseases."
Low risk: Shared components may render differently in PDF vs HTML
If BSI reviewed a PDF export of the CER, the shared components may have rendered with formatting issues. The fix adds explicit text content, reducing reliance on component rendering.