R-TF-028-004 Data Annotation Instructions - Binary Indicator Mapping
Context
The Legit.Health Plus device utilizes two main types of AI/ML algorithms for diagnosis support: ICD Category Distribution and Binary Indicators.
The ground truth labels for the ICD Category Distribution algorithm are pre-existing. They have been collected from the source datasets (public atlases and prospective clinical studies) where diagnoses were provided by qualified experts. Therefore, no new annotation is required for the primary ICD-11 categories.
However, the Binary Indicators (Malignancy, Critical Complexity, Dermatological Condition) are derived from the ICD distribution by summing the probabilities of relevant categories. This derivation is governed by a mapping matrix. This document describes the formal process for creating this matrix. The annotation will be performed by a medical expert who will assess each ICD-11 category present in our dataset and assign a boolean value for each of the three indicators based on established medical literature and clinical guidelines.
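The derivation described above can be illustrated with a minimal sketch. It assumes the ICD distribution is available as a dictionary of category probabilities and the mapping matrix as a dictionary of boolean flags. The function name `derive_indicators`, the category names, and the flag values are hypothetical placeholders, not clinical annotations or the device's actual implementation.

```python
# Indicator names taken from the mapping matrix columns in this document.
INDICATORS = ("is_malignant", "has_critical_complexity", "is_dermatological_condition")

# Hypothetical mapping matrix: placeholder categories with illustrative flags only.
MAPPING = {
    "category_a": {"is_malignant": True, "has_critical_complexity": False,
                   "is_dermatological_condition": True},
    "category_b": {"is_malignant": False, "has_critical_complexity": True,
                   "is_dermatological_condition": True},
    "category_c": {"is_malignant": False, "has_critical_complexity": False,
                   "is_dermatological_condition": False},
}


def derive_indicators(distribution, mapping):
    """Sum the probabilities of the categories flagged TRUE for each indicator."""
    scores = {name: 0.0 for name in INDICATORS}
    for category, probability in distribution.items():
        flags = mapping[category]  # raises KeyError if a category is unmapped
        for name in INDICATORS:
            if flags[name]:
                scores[name] += probability
    return scores


# Hypothetical ICD distribution over the three placeholder categories.
distribution = {"category_a": 0.5, "category_b": 0.25, "category_c": 0.25}
scores = derive_indicators(distribution, MAPPING)
print(scores)
```

Raising an error on an unmapped category is a design choice of this sketch: it makes a gap in the matrix visible instead of silently contributing zero to every indicator.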
Objectives
The primary objectives of this annotation procedure are:
- To create a definitive, evidence-based mapping matrix that formally links every unique ICD-11 category in the development dataset to the three binary indicators.
- To ensure this mapping is clinically accurate, consistent, and justifiable based on current medical knowledge and literature.
- To produce a version-controlled artifact that defines the logic for deriving the binary indicators and establishes the ground truth for their subsequent validation and testing, as specified in the R-TF-028-001 AI/ML Description.
Annotation Personnel
Role
Medical expert (Dermatologist).
Qualifications
- Required: Board-certified dermatologist.
- Recommended: Extensive clinical experience (>5 years) in diagnosing a comprehensive range of dermatological diseases, including neoplastic, inflammatory, and infectious conditions.
Responsibilities
- To review the complete list of unique ICD-11 categories extracted from the development dataset.
- To complete the mapping matrix by assigning a TRUE or FALSE value to each category for all three binary indicators, following the protocol defined in Section 4.
- To provide written justification referencing medical literature for any ambiguous or borderline classifications.
Annotation Protocol
The creation of the mapping matrix follows a structured, multi-step process.
Step 1: Category Extraction
The AI Team will process the final curated dataset and extract a unique, comprehensive list of all ICD-11 categories and their corresponding codes.
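As a rough illustration of this extraction step, the sketch below deduplicates category labels from a list of dataset records. The record layout (`image_id`, `icd11_code`, `icd11_title`), the function name, and the second category (`X00.0`) are assumptions made for the example; the actual dataset schema is not described in this document.

```python
def extract_unique_categories(records):
    """Return the sorted, deduplicated (code, title) pairs found in the records."""
    unique = {(record["icd11_code"], record["icd11_title"]) for record in records}
    return sorted(unique)


# Hypothetical records; only "EA90.40" is a category named in this document.
records = [
    {"image_id": "img_001", "icd11_code": "EA90.40",
     "icd11_title": "Generalised pustular psoriasis"},
    {"image_id": "img_002", "icd11_code": "EA90.40",
     "icd11_title": "Generalised pustular psoriasis"},
    {"image_id": "img_003", "icd11_code": "X00.0",
     "icd11_title": "Placeholder category"},
]

categories = extract_unique_categories(records)
for code, title in categories:
    print(f"{code} - {title}")
```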
Step 2: Matrix Preparation
The AI Team will prepare a data entry spreadsheet (the "mapping matrix"). This spreadsheet will contain:
- Rows: Each row will represent a unique ICD-11 category (e.g., "EA90.40 - Generalised pustular psoriasis").
- Columns: There will be three columns for annotation: is_malignant, has_critical_complexity, and is_dermatological_condition.
- Justification Column: A final column for the annotator to add comments or literature references.
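The layout described above could be generated as an empty CSV template, as in the following sketch. The column names `icd11_category` and `justification`, the function name `build_template`, and the choice of CSV are assumptions for illustration; the document only specifies that a spreadsheet is used.

```python
import csv
import io

# Three indicator columns from this document, plus assumed names for the
# category and justification columns.
COLUMNS = [
    "icd11_category",
    "is_malignant",
    "has_critical_complexity",
    "is_dermatological_condition",
    "justification",
]


def build_template(categories):
    """Return CSV text with one empty annotation row per ICD-11 category."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for code, title in categories:
        # Indicator and justification cells are left blank for the annotator.
        writer.writerow([f"{code} - {title}", "", "", "", ""])
    return buffer.getvalue()


template = build_template([("EA90.40", "Generalised pustular psoriasis")])
print(template)
```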
Step 3: Medical Review and Annotation
The designated Medical Expert will review the matrix row by row. For each ICD-11 category, the expert will assign a boolean (TRUE or FALSE) value in each of the three indicator columns based on the following guidelines:
Annotation Guidelines:
- is_malignant
  - Assign TRUE if the condition is considered malignant (cancerous) or pre-malignant according to established dermatological and oncological standards (e.g., melanoma, squamous cell carcinoma, actinic keratosis).
  - Assign FALSE for all benign conditions.
- has_critical_complexity
  - Assign TRUE if the condition is considered urgent or an emergency, typically requiring immediate or near-immediate medical intervention to prevent serious deterioration of health, significant morbidity, or mortality (e.g., Stevens-Johnson syndrome, necrotizing fasciitis).
  - Assign FALSE for conditions that are typically non-urgent or can be managed in a routine clinical timeframe.
- is_dermatological_condition
  - Assign TRUE if the category represents a primary disease of the skin, its appendages (hair, nails), or associated mucous membranes.
  - Assign FALSE for conditions that may have cutaneous manifestations but are fundamentally systemic, non-dermatological, or represent normal skin variants (e.g., skin signs of internal malignancy, striae gravidarum).
Step 4: Finalization
Once the matrix is fully populated, the Medical Expert will conduct a final self-review before submitting it to the AI Team.
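A simple automated completeness check could support the self-review before submission. The sketch below assumes the matrix rows have been loaded as dictionaries of strings; the function name and the row contents are hypothetical. It only verifies that every indicator cell holds TRUE or FALSE and does not assess clinical correctness.

```python
INDICATORS = ("is_malignant", "has_critical_complexity", "is_dermatological_condition")
VALID_VALUES = {"TRUE", "FALSE"}


def find_incomplete_cells(rows):
    """Return (category, indicator) pairs whose value is not TRUE or FALSE."""
    problems = []
    for row in rows:
        for name in INDICATORS:
            value = row.get(name, "").strip().upper()
            if value not in VALID_VALUES:
                problems.append((row["icd11_category"], name))
    return problems


# Hypothetical rows: the second one has an empty cell and a non-boolean cell.
rows = [
    {"icd11_category": "category_a", "is_malignant": "TRUE",
     "has_critical_complexity": "FALSE", "is_dermatological_condition": "TRUE"},
    {"icd11_category": "category_b", "is_malignant": "",
     "has_critical_complexity": "FALSE", "is_dermatological_condition": "yes"},
]

problems = find_incomplete_cells(rows)
print(problems)
```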
Quality Control and Review
To ensure the highest level of clinical accuracy and robustness, the following quality control steps will be implemented:
- Primary Review: The completed matrix will be reviewed by the annotating expert to ensure completeness and internal consistency.
- Secondary Review: The completed and justified matrix will be independently reviewed by a second board-certified dermatologist who was not involved in the initial annotation.
- Consensus Resolution: Any discrepancies between the primary annotation and the secondary review will be resolved through a consensus meeting between the two experts. The final decision and its rationale will be documented.
- Final Approval: The consensus-driven matrix will be formally approved and placed under version control. This finalized matrix will serve as the definitive logic and ground truth basis for all subsequent validation of the binary indicators.
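To prepare the consensus meeting, the discrepancies between the primary annotation and the secondary review could be listed programmatically. The following sketch assumes both matrices are available as dictionaries keyed by category and cover the same categories; the function name and all values are hypothetical placeholders.

```python
INDICATORS = ("is_malignant", "has_critical_complexity", "is_dermatological_condition")


def find_discrepancies(primary, secondary):
    """Return (category, indicator, primary_value, secondary_value) for each disagreement."""
    discrepancies = []
    for category in sorted(primary):
        for name in INDICATORS:
            first = primary[category][name]
            second = secondary[category][name]
            if first != second:
                discrepancies.append((category, name, first, second))
    return discrepancies


# Hypothetical annotations from the two experts for two placeholder categories.
primary = {
    "category_a": {"is_malignant": True, "has_critical_complexity": False,
                   "is_dermatological_condition": True},
    "category_b": {"is_malignant": False, "has_critical_complexity": True,
                   "is_dermatological_condition": True},
}
secondary = {
    "category_a": {"is_malignant": True, "has_critical_complexity": False,
                   "is_dermatological_condition": True},
    "category_b": {"is_malignant": False, "has_critical_complexity": False,
                   "is_dermatological_condition": True},
}

discrepancies = find_discrepancies(primary, secondary)
print(discrepancies)
```

Each tuple in the result identifies one cell to discuss, which maps directly onto the requirement to document the final decision and its rationale.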
Signature meaning
The signatures for the approval process of this document can be found in the verified commits at the repository for the QMS. As a reference, the team members who are expected to participate in this document and their roles in the approval process, as defined in Annex I Responsibility Matrix of the GP-001, are:
- Author: Team members involved
- Reviewer: JD-003, JD-004
- Approver: JD-001