Research and planning

Internal working document

This page is for internal planning only. It will not be included in the final response to BSI.

What BSI is asking

The response.json for T377 (ICD Category Distribution, case C537) does not match master.csv. Specifically:

- master.csv expects icd_distribution and top_5_predictions keys, but response.json does not have them
- The entropy and probability values are similar but not identical

Root cause analysis: INVESTIGATION REQUIRED

Blocker

The root cause has not been confirmed. The S3 evidence must be inspected before a response can be drafted. This is a blocker for writing the final response to BSI.

From master.csv (ai-models-integration-tests.csv, line 279), the expected result for T377 is:

{
  "icd_distribution": {
    "entropy": 0.39412604460588385,
    "top_5_predictions": [
      {"icd_code": "2C30", "name": "Cutaneous melanoma", "probability": 70.807356},
      {"icd_code": "2F20.1", "name": "Atypical melanocytic nevus", "probability": 0.308577},
      ...
    ],
    "full_distribution": [...]
  }
}

BSI says the actual response.json in the evidence folder doesn't match. Critically, BSI states that icd_distribution and top_5_predictions keys are missing from response.json. This is a structural mismatch, not a numerical precision issue. The floating-point differences BSI also mentions ("entropy and probability are similar, but not the same") are secondary to the structural problem.

There are two plausible explanations (ordered by likelihood):

1. Post-processing layer mismatch (most likely): The Software Architecture (R-TF-012-029) describes the condition classifier response in a study_aggregate.findings.hypotheses structure. The icd_distribution wrapper may be added at a different pipeline stage (e.g., the API gateway's report builder). If the response.json was captured at the model inference layer rather than the API response layer, the JSON structure would differ: it would contain raw model outputs without the icd_distribution / top_5_predictions wrapper keys. This would also explain the minor numerical differences, since the API layer may apply rounding or formatting that is not present in raw model output.
2. Evidence compilation error: The response.json provided to BSI came from a different test run or a different version of the model. The evidence in S3 may have been overwritten or incorrectly compiled into the submission package.

The investigation must determine:

- What JSON structure does the actual response.json at s3://legit-health-plus/integration-verification/condition-classifier/case-001/evidence/ contain?
- Does it have the icd_distribution wrapper? If not, what top-level keys does it have?
- Can the numerical values be reconciled with the expected values within the acceptance criteria (≤ 1e-5)?
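The first two questions can be answered mechanically once the file has been downloaded. The following is a minimal sketch, assuming the evidence is a single JSON document; the helper names (`find_key_paths`, `summarise_structure`) are hypothetical, and the sample payload is illustrative only, not the actual S3 content.

```python
import json
from typing import Any, Dict, List

# Keys that BSI reports as missing from response.json.
EXPECTED_KEYS = ("icd_distribution", "top_5_predictions")


def find_key_paths(node: Any, key: str, path: str = "$") -> List[str]:
    """Return every JSON path at which `key` occurs, at any nesting depth."""
    hits: List[str] = []
    if isinstance(node, dict):
        for name, child in node.items():
            child_path = f"{path}.{name}"
            if name == key:
                hits.append(child_path)
            hits.extend(find_key_paths(child, key, child_path))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            hits.extend(find_key_paths(child, key, f"{path}[{index}]"))
    return hits


def summarise_structure(response: Any) -> Dict[str, Any]:
    """Report the top-level keys and where (if anywhere) the expected keys sit."""
    top_level = sorted(response) if isinstance(response, dict) else []
    return {
        "top_level_keys": top_level,
        "key_paths": {key: find_key_paths(response, key) for key in EXPECTED_KEYS},
    }


if __name__ == "__main__":
    # Illustrative payload only; replace with json.load() on the downloaded file.
    sample = json.loads('{"study_aggregate": {"findings": {"hypotheses": []}}}')
    print(json.dumps(summarise_structure(sample), indent=2))
```

An empty path list for both keys would support the layer-mismatch explanation; a non-empty list at an unexpected depth would point to a wrapper difference rather than missing data.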

Relevant QMS documents

Document	Path	Relevance
Integration tests CSV	`ai-models-integration-tests.csv` line 279 (T377)	Expected ICD Category Distribution output
Software Architecture	`R-TF-012-029-Software-Architecture-Description.mdx`	Condition Classifier response structure and pipeline stages
models.json	`models.json` lines 4-25	ICD Category Distribution model specification
AI/ML Release Report	`r-tf-028-006-aiml-release-report.mdx`	Model integration verification package

Gap analysis

- Already had: The expected results are well-defined in the CSV. The model is believed to produce the correct output structure, but this remains to be confirmed by the S3 investigation.
- BSI couldn't find: A matching response.json in the evidence folder.
- Needs updating: (a) Investigate the actual response.json in S3 to determine the structural mismatch; (b) if the evidence was captured at the wrong pipeline layer, re-capture at the API level; (c) if numerical differences exist within tolerance, explain the acceptance criteria and why small differences are expected; (d) provide corrected evidence.

Response strategy

Regulatory mapping for this response:

Requirement	How our response addresses it
Annex II 6.2(f)	V&V evidence for the ICD Category Distribution model must match the expected output specification. Corrected evidence to be provided.
EN 62304 §5.5	Integration verification evidence must demonstrate correct integration of the model within the software system

Action required (investigation-dependent — cannot write final response until step 1 is complete):

1. BLOCKER: Investigate the actual response.json in S3 at s3://legit-health-plus/integration-verification/condition-classifier/case-001/evidence/ to determine the exact content BSI received. Document the JSON structure and values found.
2. Based on investigation results, one of two responses:

   If structural mismatch (no icd_distribution wrapper, most likely):
   - Explain that the evidence was captured at the model inference layer rather than the API response layer, per the architecture described in R-TF-012-029
   - Explain that the integration test specification in master.csv defines the expected API-level response, and the verification compares at this level
   - Provide the correct API-level response as evidence in the supplementary PDF
   - Describe corrective action: evidence collection now captures at the API response layer to match the expected output specification

   If same structure but different values:
   - State the acceptance criteria for classification models (≤ 1e-5 per element, from R-TF-028-006)
   - Provide a numerical comparison showing each differing value falls within tolerance
   - If values exceed tolerance, explain the cause (e.g., non-deterministic TTA) and provide re-captured evidence
3. Provide the correct, matching evidence in the supplementary PDF.
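The numerical comparison for the "same structure but different values" branch could be produced with a small recursive diff. This is a sketch under stated assumptions: the 1e-5 criterion is treated as an absolute per-element difference applied to the values as stored (probabilities on the percentage scale shown in master.csv), which should be checked against R-TF-028-006 before use. The function names are hypothetical.

```python
from typing import Any, List, Tuple

# Per-element acceptance criterion cited in this document.
# Assumption: absolute difference, applied to values exactly as stored.
TOLERANCE = 1e-5


def is_number(value: Any) -> bool:
    """True for JSON numbers; bool is excluded because it subclasses int."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def compare(expected: Any, actual: Any, path: str = "$",
            tol: float = TOLERANCE) -> List[Tuple[str, str]]:
    """Return (json_path, reason) for each structural or out-of-tolerance difference."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        issues: List[Tuple[str, str]] = []
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}"
            if key not in actual:
                issues.append((child, "missing in actual"))
            elif key not in expected:
                issues.append((child, "unexpected in actual"))
            else:
                issues.extend(compare(expected[key], actual[key], child, tol))
        return issues
    if isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            return [(path, f"length {len(actual)} != {len(expected)}")]
        issues = []
        for index, (exp_item, act_item) in enumerate(zip(expected, actual)):
            issues.extend(compare(exp_item, act_item, f"{path}[{index}]", tol))
        return issues
    if is_number(expected) and is_number(actual):
        diff = abs(expected - actual)
        return [] if diff <= tol else [(path, f"|diff| = {diff:.3g} exceeds {tol:g}")]
    if expected != actual:
        return [(path, f"{actual!r} != {expected!r}")]
    return []
```

An empty result means every element matches within tolerance; otherwise each tuple names the JSON path to cite in the response. Missing wrapper keys show up as "missing in actual", so the same report also distinguishes the structural case from the numerical one.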

Response tone (to be finalised after investigation): "The expected results in the integration test specification (master.csv, T377) define the API-level response structure per R-TF-012-029, including icd_distribution with entropy, top_5_predictions, and full_distribution. [Investigation-dependent explanation]. Per Annex II 6.2(f), we provide corrected evidence demonstrating the model produces the expected output. Corrective action: [investigation-dependent corrective action]."

Action items:

#	Action	Owner	Document affected	Priority
10	BLOCKER: Investigate T377 `response.json` in S3	Gerardo	—	Critical
11	Provide corrected T377 evidence	Gerardo	Supplementary evidence PDF	High
