
Message for Alfonso: R-TF-028-011 updates needed

BLOCKER — Item 7 response describes changes that do not exist

The Item 7 response (response.mdx) says "We have updated R-TF-028-011 to document the explicit rationale..." and then describes detailed changes across all four of BSI's concerns. None of these changes exist in the actual files. If we submit the response as-is, BSI will check R-TF-028-011 and find nothing — which is worse than not claiming the changes.

Deadline: 2026-04-21

What BSI asked (the question)

BSI reviewed three specific risks from R-TF-028-011 (AI Risk Assessment) and raised four concerns:

The three risks:

| Line | ID | Risk | Residual severity | Residual RPN |
| --- | --- | --- | --- | --- |
| 1 | AI-RISK-001 | Dataset Not Representative of Intended Use Population | Critical (4) | 4 (Acceptable) |
| 16 | AI-RISK-016 | Model Robustness Failures: Sensitivity to Image Acquisition Variability | Critical (4) | 8 (Tolerable) |
| 21 | AI-RISK-021 | Usability Issues: Model Outputs Not Interpretable by Clinical Users | Moderate (3) | 6 (Acceptable) |

BSI's four concerns:

  1. Severity justification: Why severity 4 and not 5 (death)? These risks involve misclassification — could that not lead to a delayed cancer diagnosis and death?
  2. Design mitigations: For lines 1 and 16, are there design mitigations (e.g., DIQA image quality rejection)? How have these been verified?
  3. Occurrence rate estimation: What data are the likelihood estimates based on (PMS, literature)?
  4. Residual risk communication: If residual risks remain, where are they communicated to users in the IFU?

What the response promises

The response (already written in response.mdx) claims comprehensive updates to R-TF-028-011 covering all four concerns. Specifically, it describes:

  1. A severity 4 justification with three regulatory groundings (ISO 14971 Annex C, MDCG 2020-1 VCA, MEDDEV A7.3) and a sensitivity analysis showing that even if severity were 5, risk acceptability categories would not change.
  2. Mitigations explicitly categorised as design (runtime features), process (development procedures), and information (IFU content) per risk, with SRS requirement codes and test case IDs for design mitigations.
  3. Three-step occurrence rate chains per risk: data source → appraisal method → derived value.
  4. Specific IFU section mappings for each residual risk.

What the actual files contain (the problem)

The R-TF-028-011 system consists of two files:

  1. r-tf-028-011-aiml-risk-assessment.mdx: A thin MDX wrapper that renders <AIRiskAssessmentTable /> from the JSON data. It has general methodology text (scales, risk classes) and a summary section but NO individual risk justifications, NO per-risk prose, NO severity rationale.

  2. ai-risk-assessment.json (the data source for the React table): Contains all 29 risks. For the three risks BSI asked about, the current data is:

AI-RISK-001:

  • residual_severity: "Critical (4)" — no rationale for why not 5
  • mitigation_measures: flat list of 6 items — NOT categorised as design/process/information
  • No field for occurrence rate basis
  • No field for IFU mapping

AI-RISK-016:

  • residual_severity: "Critical (4)" — no rationale
  • mitigation_measures: flat list of 9 items — NOT categorised
  • No occurrence rate basis, no IFU mapping

AI-RISK-021:

  • residual_severity: "Moderate (3)" — no rationale
  • mitigation_measures: flat list of 10 items — NOT categorised
  • No occurrence rate basis, no IFU mapping

There is also an ai-risk-assessment-UPDATED.json file, but it is identical to the original for these three risks — no changes have been made to it either.
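For orientation, the current per-risk record appears to have roughly the following shape. This is a hypothetical TypeScript sketch inferred from the gap analysis above; the field names residual_severity, mitigation_measures, and safety_risk_ids are named in this message, everything else is an assumption for illustration.

```typescript
// Hypothetical sketch of a record in ai-risk-assessment.json.
// Only residual_severity, mitigation_measures and safety_risk_ids are
// confirmed field names; the rest is illustrative.
interface AIRiskRecord {
  id: string;                    // e.g. "AI-RISK-001"
  risk: string;                  // risk title
  residual_severity: string;     // e.g. "Critical (4)", with no rationale field
  residual_rpn: number;          // e.g. 4
  mitigation_measures: string[]; // flat, uncategorised list
  safety_risk_ids: string[];     // transfer to R-TF-013-002
}

const example: AIRiskRecord = {
  id: "AI-RISK-001",
  risk: "Dataset Not Representative of Intended Use Population",
  residual_severity: "Critical (4)",
  residual_rpn: 4,
  mitigation_measures: new Array(6).fill("(existing item)"), // 6 flat items today
  safety_risk_ids: ["R-SKK", "R-7US", "R-GY6"],
};
```

Note what is missing: there is no slot for a severity rationale, mitigation categories, occurrence basis, or IFU mapping, which is exactly the gap BSI will find.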

What you need to do

There are two possible implementation approaches. Either way, the content must exist before submission.

Option A: Add prose sections to the MDX file

Add dedicated sections below the <AIRiskAssessmentTable /> component in the MDX file. One section per audited risk, each containing:

  • Severity justification
  • Categorised mitigations with SRS/test traceability
  • Occurrence rate derivation
  • IFU mapping

This is the faster approach and doesn't require changing the React component.

Option B: Extend the JSON schema

Add new fields to the JSON data for each risk:

  • severity_justification (string)
  • mitigation_categories (object with design, process, information arrays)
  • occurrence_basis (object with data_source, appraisal_method, derived_value)
  • ifu_mapping (array of objects with risk_id, ifu_section, content_summary)

Then update the AIRiskAssessmentTable React component to render these fields. This is the cleaner approach but requires frontend work.
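The new fields could be typed roughly as follows. This is a sketch only: the four field names come from the list above, while the types and inner field names of each object follow the structure described there.

```typescript
// Sketch of the Option B schema extension. Top-level field names follow the
// list above; types and optionality are assumptions.
interface MitigationCategories {
  design: string[];      // runtime features, with SRS/test traceability
  process: string[];     // development procedures
  information: string[]; // IFU content
}

interface OccurrenceBasis {
  data_source: string;      // e.g. PMS data, literature
  appraisal_method: string; // how the data were appraised
  derived_value: string;    // e.g. "Very low (1)"
}

interface IFUMapping {
  risk_id: string;
  ifu_section: string;
  content_summary: string;
}

interface ExtendedAIRiskRecord {
  severity_justification: string;
  mitigation_categories: MitigationCategories;
  occurrence_basis: OccurrenceBasis;
  ifu_mapping: IFUMapping[];
}

// Minimal illustrative instance (content abbreviated, not submission text).
const sample: ExtendedAIRiskRecord = {
  severity_justification: "Multi-barrier argument; see detailed content below.",
  mitigation_categories: {
    design: ["DIQA runtime quality gate"],
    process: [],
    information: [],
  },
  occurrence_basis: {
    data_source: "PMS (R-TF-007-003 PSUR)",
    appraisal_method: "IMDRF MDCE WG/N56 Appendix F",
    derived_value: "Low (2)",
  },
  ifu_mapping: [
    {
      risk_id: "AI-RISK-016",
      ifu_section: "How to take pictures",
      content_summary: "Acquisition guidance",
    },
  ],
};
```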

Recommendation: Option A for the deadline. Option B can be done later as a structural improvement.

Detailed content needed per risk

AI-RISK-001 (Dataset representativity)

Severity justification (why 4, not 5):

The device is a CDSS. For a device error to reach severity 5 (death), all of these independent barriers must fail simultaneously: (a) the device misclassifies, (b) the six binary safety indicators (malignant, pre-malignant, urgent referral, etc.) fail to flag the presentation, (c) the clinician does not exercise independent judgment, and (d) the standard of care (biopsy for suspected malignancy) is not followed. Ground this in ISO 14971:2019 Annex C (CDSS guidance), MDCG 2020-1 VCA (device outputs are scientifically correlated, not random), and MEDDEV A7.3 performance data (Top-3 sensitivity 0.9032 for melanoma, AUC 0.97 for malignancy). Include the sensitivity analysis: if severity were 5, RPN would be 5 (still Acceptable).
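The sensitivity analysis is plain arithmetic. Assuming the RPN scheme is residual severity multiplied by residual likelihood (an assumption, but it reproduces every RPN figure quoted in this message), it can be sketched as:

```typescript
// RPN = residual severity x residual likelihood (assumed scheme; it matches
// the figures quoted for all three audited risks).
const rpn = (severity: number, likelihood: number): number => severity * likelihood;

// Current values
const rpn001 = rpn(4, 1); // AI-RISK-001: 4 (Acceptable)
const rpn016 = rpn(4, 2); // AI-RISK-016: 8 (Tolerable)
const rpn021 = rpn(3, 2); // AI-RISK-021: 6 (Acceptable)

// Sensitivity: worst case, severity raised to 5 (death)
const rpn001s5 = rpn(5, 1); // 5, still in the Acceptable band
const rpn016s5 = rpn(5, 2); // 10, still Tolerable, acceptability unchanged
```

The point for the response: even under the most conservative severity assumption, no risk crosses into an unacceptable category.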

Categorised mitigations:

  • Design: The six binary safety indicators (derived from dermatologist-defined mapping matrix R-TF-028-004) flag high-risk presentations independently of ICD classification. The probability distribution output format communicates uncertainty.
  • Process: Multi-source data collection (R-TF-028-003), stratified sampling across Fitzpatrick phototypes, bias analysis across demographic subgroups (R-TF-028-005).
  • Information: IFU Important Safety Information § "Population and performance variability" warns about phototype variation. IFU § "Understanding the device output" explains probabilistic nature.

Occurrence rate chain:

  • Data source: PMS data from legacy device (R-TF-007-003 PSUR): 4,500+ reports, 7 non-serious incidents, zero related to dataset representativity. Clinical validation studies (BI_2024, PH_2024, SAN_2024, IDEI_2023) across Fitzpatrick I-IV with no systematic degradation.
  • Appraisal method: PMS data appraised using IMDRF MDCE WG/N56 Appendix F as endorsed by MDCG 2020-6 Appendix I.
  • Derived value: PMS incident rate 0.16% over 4+ years with zero diagnostic-error incidents. Combined with clinical validation showing no subgroup degradation: residual likelihood "Very low" (1).
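If Option B is adopted later, the three-step chain above maps directly onto the proposed occurrence_basis object. A sketch populated with the values quoted above (the field names come from Option B; the text is abbreviated, not final submission wording):

```typescript
// AI-RISK-001 occurrence chain expressed in the Option B shape (sketch only).
const occurrenceBasis001 = {
  data_source:
    "PMS from legacy device (R-TF-007-003 PSUR): 4,500+ reports, 7 non-serious " +
    "incidents, zero related to dataset representativity; clinical validation " +
    "(BI_2024, PH_2024, SAN_2024, IDEI_2023) across Fitzpatrick I-IV",
  appraisal_method:
    "IMDRF MDCE WG/N56 Appendix F, as endorsed by MDCG 2020-6 Appendix I",
  derived_value:
    "Very low (1): 0.16% PMS incident rate over 4+ years, zero diagnostic-error incidents",
};
```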

IFU mapping:

  • IFU, Important Safety Information, § "Population and performance variability"
  • IFU, Important Safety Information, § "Understanding the device output"

AI-RISK-016 (Image acquisition variability)

Severity justification: Same multi-barrier argument as AI-RISK-001. Include sensitivity analysis: if severity were 5, RPN would be 10 (Tolerable — no change in acceptability category).

Categorised mitigations:

  • Design: DIQA model (Dermatology Image Quality Assessment) provides runtime quality gate. Implemented as SRS requirement SRS-Y5W (derived from PRS-7XK). Verified by test cases C50, C62, C68, C73, C77, C106, C329, C370, C371, C454, C455 in R-TF-012-034. All tests passed.
  • Process: Data augmentation during training, external validation on independent datasets.
  • Information: IFU "How to take pictures" (lighting, distance, angle, focus). IFU Precautions risk #9 (image artefacts). IFU Precautions risk #30 (inadequate lighting).

Occurrence rate chain:

  • Data source: PMS data (same as above, zero incidents attributable to image quality failure). DIQA verification (11 test cases, all passed). Clinical validation studies used real-world conditions.
  • Appraisal method: Same as AI-RISK-001.
  • Derived value: DIQA provides runtime barrier but borderline images may pass. Residual likelihood "Low" (2).

IFU mapping:

  • IFU, "How to take pictures" (lighting, distance, angle, focus guidance)
  • IFU, Precautions, risk #9 (image artefacts affect performance)
  • IFU, Precautions, risk #30 (inadequate lighting)

AI-RISK-021 (Usability — outputs not interpretable)

Severity justification: Severity is 3 (Moderate), not 4 or 5. Misinterpretation of output leads to suboptimal but recoverable clinical decisions, not permanent harm, because the device output is one input among many in the clinical workflow.

Categorised mitigations:

  • Design: Explainability media (bounding boxes, segmentation masks) via SRS-0AB and SRS-K7M, verified by test cases C256 and C265 in R-TF-012-034. Probability distribution format inherently communicates uncertainty.
  • Information: IFU Important Safety Information § "The device does not provide a clinical diagnosis" (prominent non-diagnostic disclaimer). IFU § "Understanding the device output" (probability distribution explanation). IFU Endpoint specification § "Binary Indicators" and § "Entropy" (confidence measure). IFU Troubleshooting (Clinical) (interpretation guidance).

Occurrence rate chain:

  • Data source: Summative usability evaluation R-TF-025-007 (October 2025, n=36). HCP Scenario 3 Q4: 72.2% success on "device output is not a diagnosis." 1 use error, 3 close calls. Simulated use: 100% success.
  • Appraisal method: IEC 62366-1:2015 §5.9 residual risk assessment methodology, documented in R-TF-025-007 §14.7.
  • Derived value: 72.2% Q4 rate and 1 use error support residual likelihood "Low" (2).

IFU mapping:

  • IFU, Important Safety Information, § "The device does not provide a clinical diagnosis"
  • IFU, Important Safety Information, § "Understanding the device output"
  • IFU, Endpoint specification, § "Binary Indicators"
  • IFU, Endpoint specification, § "Entropy"
  • IFU, Troubleshooting (Clinical)

Cross-reference between R-TF-028-011 and R-TF-013-002​

The response also describes a cross-reference section explaining that the AI risk assessment and the main safety risk register use different severity scales. This needs to be added. The three risks transfer as follows:

| AI Risk | Safety Risk IDs | Transfer documented? |
| --- | --- | --- |
| AI-RISK-001 | R-SKK, R-7US, R-GY6 | In JSON field safety_risk_ids — but NO prose explanation in the MDX |
| AI-RISK-016 | R-SKK, R-VL1 | Same |
| AI-RISK-021 | R-SKK | Same |

Add a section to the MDX explaining that the two documents use different severity scales appropriate to their domains, and that AI risks transfer to the main register where they are reassessed on the patient-harm scale.
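The transfer table above can be kept in one place and used to generate the cross-reference prose. A sketch with a hypothetical helper (the mapping values come from the table; the helper name and sentence template are illustrative):

```typescript
// AI risk -> main safety register (R-TF-013-002) transfer map, per the table above.
const transferMap: Record<string, string[]> = {
  "AI-RISK-001": ["R-SKK", "R-7US", "R-GY6"],
  "AI-RISK-016": ["R-SKK", "R-VL1"],
  "AI-RISK-021": ["R-SKK"],
};

// Hypothetical helper: one sentence per risk for the new MDX cross-reference section.
const crossReferenceLines = (): string[] =>
  Object.entries(transferMap).map(
    ([aiRisk, safetyIds]) =>
      `${aiRisk} transfers to ${safetyIds.join(", ")}, where it is reassessed ` +
      `on the patient-harm severity scale.`
  );
```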

Where the files live

  • MDX: apps/qms/docs/legit-health-plus-version-1-1-0-0/product-verification-and-validation/artificial-intelligence/r-tf-028-011-aiml-risk-assessment.mdx
  • JSON: apps/qms/docs/legit-health-plus-version-1-1-0-0/product-verification-and-validation/artificial-intelligence/ai-risk-assessment.json
  • UPDATED JSON: apps/qms/docs/legit-health-plus-version-1-1-0-0/product-verification-and-validation/artificial-intelligence/ai-risk-assessment-UPDATED.json

What NOT to change

  • Do NOT change the severity values (keep severity 4 for AI-RISK-001 and AI-RISK-016, severity 3 for AI-RISK-021). The response argues that severity 4 is correct, not that it should be changed to 5.
  • Do NOT change the RPN values or risk classes.
  • Do NOT modify the existing mitigation lists — only add the categorisation structure and the additional traceability information.

Questions before starting

If anything is unclear, ask Taig. The research-and-planning document (research-and-planning.mdx in this folder) has the full gap analysis and regulatory framework.

All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, neither by himself nor by third parties, through whichever means, without obtaining the prior written permission of Legit.Health (AI Labs Group S.L.)