SRS-001
Data annotation
The first phase, data annotation, is essential for understanding inter-observer variability and serves as the foundation for training the algorithms. During this step, medical professionals review individual images and assign intensity values to the specific visual signs we are interested in. In other words, the professionals annotate the images to generate the classification labels (either categorical or ordinal) later used to train the deep learning models.
The key here is selecting the right medical experts and determining an appropriate group size. We typically engage a minimum of three experienced physicians for this task, and a larger panel is preferred for more complex assignments. For this work, we have enlisted three doctors who specialise in assessing the severity of pathologies commonly associated with the visual signs we are studying, specifically atopic dermatitis and psoriasis.
By pooling the assessments of these three experts, we establish a ground truth dataset. This dataset serves a dual purpose: it becomes the foundation for training our algorithms and also allows us to gauge inter-observer variability, a critical measure of the accuracy and consistency of our measurements.
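As an illustration of the pooling step, the following minimal sketch derives a consensus label per image and a simple variability figure from three annotators' ordinal scores. The pooling rule (median), the variability metric (mean absolute pairwise difference), the function names, and the sample scores are all assumptions made for this example; this document does not specify which rule or metric is actually used.

```python
from itertools import combinations
from statistics import mean, median


def pool_labels(scores_per_image):
    """Return one consensus label per image.

    Assumption: the consensus is the median of the annotators' ordinal scores.
    """
    return [median(scores) for scores in scores_per_image]


def mean_pairwise_abs_diff(scores_per_image):
    """Average absolute difference between every pair of annotators, over all images.

    Assumption: this is used here as a simple proxy for inter-observer variability.
    """
    diffs = [
        abs(a - b)
        for scores in scores_per_image
        for a, b in combinations(scores, 2)
    ]
    return mean(diffs)


# Hypothetical intensity scores (0-4) given by three annotators to three images.
annotations = [
    [2, 2, 3],
    [0, 1, 0],
    [4, 3, 3],
]

ground_truth = pool_labels(annotations)
variability = mean_pairwise_abs_diff(annotations)
print(ground_truth, round(variability, 3))
```

The median is chosen in this sketch because it always returns one of the scores actually given when the panel has an odd number of annotators; other pooling rules (mean, majority vote) would fit the same structure.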
Algorithm development
The next phase is the development of the algorithms, which rely on the ground truth data collected during the previous stage. The outcomes generated by the algorithms are then compared against the measured inter-observer variability. This step is particularly crucial because tasks of this complexity are prone to inherent variability, so the algorithms must be compared with the prevailing baseline or state of the art. This comparative analysis is essential for validating the algorithms' performance accurately.
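One possible way to frame this comparison is sketched below: the error of the algorithm against the consensus labels is set beside the average error of the individual annotators against the same consensus. The metric (mean absolute error), the acceptance rule, the variable names, and all values are hypothetical and serve only to illustrate the idea.

```python
from statistics import mean


def mean_abs_error(predictions, reference):
    """Mean absolute difference between two equally long lists of ordinal scores."""
    return mean([abs(p - r) for p, r in zip(predictions, reference)])


# Hypothetical consensus labels and per-annotator scores for four images.
consensus = [2, 0, 3, 1]
annotator_scores = {
    "annotator_a": [2, 0, 4, 1],
    "annotator_b": [2, 1, 3, 1],
    "annotator_c": [3, 0, 3, 2],
}
# Hypothetical model output for the same four images.
model_predictions = [2, 1, 3, 1]

# Baseline: how far each annotator is, on average, from the consensus.
observer_errors = {
    name: mean_abs_error(scores, consensus)
    for name, scores in annotator_scores.items()
}
baseline_error = mean(list(observer_errors.values()))

# Assumed acceptance rule: the model should not be further from the
# consensus than the annotators are on average.
model_error = mean_abs_error(model_predictions, consensus)
within_observer_variability = model_error <= baseline_error
print(model_error, round(baseline_error, 3), within_observer_variability)
```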
The convolutional neural networks we are training learn from the collective expertise of the specialists. A significant subjective element is therefore inherent in this process, given the nuanced nature of the task.
When possible, the dataset used to develop each algorithm is split into training, validation, and test sets. However, when the sample size is limited, the data is split into training and validation only to ensure each set contains enough data.
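The splitting rule can be sketched as follows. The percentages, the `min_test_size` threshold that decides when a dataset counts as limited, and the function name are assumptions chosen for this example; the document does not state the actual values.

```python
import random


def split_dataset(items, val_pct=15, test_pct=15, min_test_size=50, seed=0):
    """Shuffle items and split them into training, validation, and test sets.

    Assumption: if the test set would hold fewer than `min_test_size` items,
    the test share is folded into validation and no test set is produced.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    n = len(items)
    n_val = n * val_pct // 100
    n_test = n * test_pct // 100

    if n_test < min_test_size:
        # Limited sample size: training and validation only.
        n_val = n * (val_pct + test_pct) // 100
        return {"train": items[n_val:], "validation": items[:n_val], "test": []}

    return {
        "train": items[n_val + n_test:],
        "validation": items[:n_val],
        "test": items[n_val:n_val + n_test],
    }


large = split_dataset(range(1000))
small = split_dataset(range(100))
print({name: len(part) for name, part in large.items()})
print({name: len(part) for name, part in small.items()})
```

Integer percentages are used instead of floating-point fractions so that the set sizes are exact.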
Signature meaning
The signatures for the approval process of this document can be found in the verified commits at the repository for the QMS. As a reference, the team members who are expected to participate in this document and their roles in the approval process, as defined in Annex I Responsibility Matrix of the GP-001, are:
- Author: Team members involved
- Reviewer: JD-003, JD-004
- Approver: JD-001