R-TF-015-004 Clinical investigation plan

Scope

This Clinical Investigation Plan (CIP) sets out the rationale, objectives, design, methodology, conduct, implementation, record-keeping, and method of analysis for the clinical investigation.

CIP Identification

	CIP
Title of the clinical investigation	Multi-Reader Multi-Case Study on the Diagnostic Performance of the Device in Autoimmune Dermatoses and Genodermatoses
Device under investigation	Legit.Health Plus
Protocol version	Version 1.0
Date	2026-04-19
Protocol code	LEGIT.HEALTH_AGM_2026
Sponsor	AI Labs Group S.L.
Coordinating Investigator	Prof. Raúl de Lucas Laguna
Principal Investigator(s)	Prof. Raúl de Lucas Laguna
Investigational site(s)	This study is conducted remotely through a centralized web-based platform.
Ethics Committee	This study does not require Ethics Committee approval because it is observational and non-interventional. All data used consists of fully anonymized images sourced from public dermatology atlases and databases, containing no information permitting patient identification. As such, the research meets the criteria for exemption from ethics committee review under applicable regulatory frameworks.

Table of contents

Scope
CIP Identification
Regulatory classification of this study
Trial registration
Compliance Statement
Abbreviations and definitions
CIP or protocol specifications
- Principal Investigator
- Coordinating investigator
- Collaborating Investigator(s)
- Technical Support (AI Labs Group S.L.)
- Investigational sites
- Funding
Product Identification and Description
Justification of the design
- Background and rationale
- Evidence-hierarchy positioning
  - Pre-specified concurrent Pillar 2 (algorithm-level) analysis on the same case set
  - Traceability of the group-level scope restriction into the Clinical Evaluation Plan and Clinical Evaluation Report
- Risks and benefits of the product in investigation and clinical research
Hypothesis
Objectives
- Primary objective
- Secondary objectives
- Exploratory objectives
- Concurrent Pillar 2 (device-alone algorithm-verification) objective
Summary of the study
Design and methods
- Type of clinical research
- Reference standard (ground truth)
- Population
- Sample size
- Dermatological conditions and case composition
- Duration
- Acceptance criteria
- Inclusion criteria
- Exclusion criteria
- Variables
  - Main variable
  - Secondary variables
- Condition of interest
- Limitations of clinical research
Ethical considerations
- Data confidentiality
- Bias minimization measures
- Calendar
- Monitoring plan
- Completion of the investigation
- Statistical analysis
- Data management
CIP Modification
CIP Deviations
Start, follow-up and end reports
Statements of compliance
Informed Consent process
- For patients/image subjects
- For healthcare practitioners
Adverse events, adverse product reactions and product deficiencies
- Adverse Events (AE) and Adverse Events to the Product (AEP)
- Product deficiencies
- Serious Adverse Events, serious adverse events to the product and serious and unexpected adverse events to the product
- Foreseeable adverse events and adverse events to the product
- Data Monitoring Committee (DMC)
Suspension or early termination of clinical research

Regulatory classification of this study

This study does not constitute a clinical investigation under MDR Article 2(45), which defines a clinical investigation as "any systematic investigation involving one or more human subjects, undertaken to assess the safety or performance of a device." In this study:

There are no human subjects (patients). The clinical images are fully anonymized photographs sourced from public dermatological atlases. No patients are recruited, examined, or affected in any way by this study.
The participating healthcare professionals are evaluators, not research subjects. They provide diagnostic assessments in a controlled professional capacity and are not exposed to any intervention or risk.
The study is observational and non-interventional. No clinical decisions, treatments, or patient care pathways are influenced by the study.

Consequently, the requirements of MDR Article 62 (clinical investigations), Articles 73–77 (sponsor obligations), and Annex XV (clinical investigations) do not apply. In particular, competent authority notification (AEMPS), EUDAMED registration, and formal Ethics Committee opinion are not required.

This CIP is prepared in accordance with ISO 14155:2021 as a best-practice methodological framework to ensure scientific rigour and traceability, even though its full regulatory scope is not triggered by this study design. The study results are intended as supporting clinical evidence under MDR Article 61 and MDCG 2020-6, contributing to the triangulated evidence base for autoimmune dermatoses and genodermatoses alongside Pillar 1 literature, Pillar 2 algorithm verification, and Rank 7 post-market surveillance data.

Trial registration

This study is not registered in ClinicalTrials.gov or the EMA RWD Catalogue (EUPAS). Registration is not required because this study does not meet the MDR definition of a clinical investigation (see "Regulatory classification of this study" above). No patients are involved, no clinical interventions are performed, and no clinical decisions are influenced by the study. The study may be voluntarily registered post-hoc if deemed appropriate for transparency purposes.

Compliance Statement

This study is conducted in accordance with the following standards and regulations, to the extent applicable to a non-interventional, observational MRMC study with no patient involvement:

Harmonised standard UNE-EN ISO 14155:2021, applied by analogy as a best-practice methodological framework. Clauses specific to human research subjects (subject informed consent, subject safety monitoring, subject-level adverse-event reporting, subject withdrawal, vulnerable populations) are not triggered by this study because no human subjects are recruited; the participating healthcare professionals are professional evaluators, not research subjects.
Regulation (EU) 2017/745 on medical devices (MDR) — Article 61 (clinical evaluation) and Annex XIV (clinical evaluation and PMCF).
MDCG 2020-1 (clinical evaluation of medical device software, three-pillar framework).
MDCG 2020-6 (sufficient clinical evidence for legacy devices, Appendix III evidence-rank hierarchy).
Harmonised standard UNE-EN ISO 13485:2016.
Regulation (EU) 2016/679 (GDPR).
Spanish Organic Law 3/2018 on the Protection of Personal Data and guarantee of digital rights.

Since this study does not constitute a clinical investigation under MDR Article 2(45), the requirements of MDR Article 62, Articles 73–77, and Annex XV are not applicable. Ethics Committee opinion and competent authority notification are not required (see "Regulatory classification of this study").

Abbreviations and definitions

AE: Adverse Event
AEMPS: Spanish Agency of Medicines and Medical Devices
AEP: Adverse Event to the Product
AUC: Area Under the ROC Curve
CAD: Computer-Aided Diagnosis
DMC: Data Monitoring Committee
CIP: Clinical Investigation Plan
CIR: Clinical Investigation Report
CUS: Clinical Utility Questionnaire
DLQI: Dermatology Life Quality Index
GCP: Good Clinical Practice
ICH: International Council for Harmonisation
IFU: Instructions For Use
IRB: Institutional Review Board
N/A: Not Applicable
NCA: National Competent Authority
PI: Principal Investigator
PPV: Positive Predictive Value
NPV: Negative Predictive Value
SAE: Serious Adverse Event
SAEP: Serious Adverse Event to Product
SUAEP: Serious and Unexpected Adverse Event to the Product
SUS: System Usability Scale

CIP or protocol specifications

Principal Investigator

Prof. Raúl de Lucas Laguna — Head of Paediatric Dermatology, Hospital Universitario La Paz, Madrid.

The Principal Investigator is a senior academic dermatologist with broad general-dermatology practice and internationally recognised subspecialty expertise in genodermatoses (including epidermolysis bullosa, ichthyoses, and neurofibromatosis), immunobullous disease, and rare dermatological conditions. His hospital department operates as a national referral centre for the sub-indications under study within the Spanish National Health System and also manages the full adult general-dermatology caseload, including cutaneous lupus, dermatomyositis, morphea, cutaneous vasculitis, lichen planus and dermatitis herpetiformis. This dual-competence profile — senior general dermatologist plus rare-disease subspecialist — provides direct clinical-interpretation authority across both the autoimmune and genodermatoses halves of the study.

Coordinating investigator

Prof. Raúl de Lucas Laguna.

Collaborating Investigator(s)

To be confirmed upon recruitment. A minimum of 9 healthcare professionals will be recruited for this study (at the floor: 3 dermatologists, 3 primary-care physicians, and 3 nurses), drawn from the device's declared intended user groups (see "Population").

Technical Support (AI Labs Group S.L.)

Mr. Alfonso Medela
Mr. Taig Mac Carthy

Investigational sites

This study is conducted remotely through a centralized web-based platform. Healthcare professionals are provided with individual user credentials (username and password) to securely access the platform. All images are presented through this platform, and practitioners' assessments are recorded on the same system. Access logs are maintained to ensure traceability of all practitioner interactions with the platform.

Funding

This research is carried out without any funding or sponsorship.

Product Identification and Description

	Information
Device name	Legit.Health Plus (hereinafter, the device)
Model and type	N/A
Version	1.1.0.0
Basic UDI-DI	8437025550LegitCADx6X
Certificate number (if available)	MDR 000000 (Pending)
EMDN code(s)	Z12040192 (General medicine diagnosis and monitoring instruments - Medical device software)
GMDN code	65975
EU MDR 2017/745	Class IIb
EU MDR Classification rule	Rule 11
Novel product (True/False)	TRUE
Novel related clinical procedure (True/False)	TRUE
SRN	ES-MF-000025345

The specific device version under evaluation is version 1.1.0.0.

Device version freeze: The device version is locked for the duration of the data-collection window. Any material update to the device released during data collection triggers one of two responses at the sponsor's pre-specified discretion:

(a) data collection is paused, the study is restarted on the new version, and all prior case assessments are excluded; or
(b) data collection continues on the frozen version, with the updated version evaluated separately in a subsequent investigation.

A material update is defined as any change affecting the diagnostic output pathway (AI model weights, ICD-11 category coverage, severity-assessment logic, referral-recommendation logic, malignancy-surfacing indicators). Non-material updates (cosmetic, deployment, logging, security-patch) do not trigger a pause. The CIR will document the device version used for every case assessment; if any case is assessed under a different version due to a protocol-compliant update, that case is excluded from the primary analysis.

Justification of the design

Background and rationale

Dermatological conditions represent a significant portion of primary-care consultations, constituting approximately 5% of all visits. However, discrepancies between diagnoses made by general practitioners and dermatologists remain substantial, with concordance rates between 57% and 65.52%. This gap widens for rare and low-prevalence categories — specifically autoimmune dermatoses (including cutaneous lupus, dermatomyositis, pemphigus, bullous pemphigoid, lichen planus, morphea, vasculitis, and dermatitis herpetiformis) and genodermatoses (including ichthyoses, epidermolysis bullosa, neurofibromatosis type 1 cutaneous manifestations, hereditary acantholytic dermatoses, palmoplantar keratodermas, and xeroderma pigmentosum) — where first-line clinicians have limited pattern-recognition exposure.

Artificial intelligence presents a transformative opportunity to enhance the diagnostic capabilities of healthcare professionals on precisely these low-prevalence presentations, where specialist consultation is less readily available and delayed recognition has the highest clinical cost. The device has already been validated for the diagnosis of skin conditions across multiple clinical investigations (SAN_2024, BI_2024, PH_2024, MAN_2025), demonstrating statistically significant improvements in diagnostic accuracy when healthcare professionals are assisted by the device.

The existing clinical evidence base for the device is weighted toward higher-prevalence inflammatory, infectious, and neoplastic categories. Autoimmune dermatoses (approximately 3% of the legacy post-market surveillance case mix) and genodermatoses (approximately 1%) are under-represented in the pre-market investigation portfolio. This study is specifically designed to contribute supporting evidence for the device's diagnostic performance on these sub-indications, as one element of a triangulated evidence base that also includes Pillar 1 state-of-the-art literature review, Pillar 2 per-category algorithm verification, and Rank 7 legacy post-market surveillance data.

Evidence-hierarchy positioning

Per MDCG 2020-6 Appendix III, multi-reader multi-case (MRMC) reader studies are Rank 11 evidence and do not meet the strict definition of clinical data under MDR Article 2(48). The conclusions of this study are therefore positioned as supporting evidence, not load-bearing Pillar 3 clinical performance evidence. The study's role within the clinical evaluation is to demonstrate, under a pre-specified reader-study protocol with an atlas-labelled reference standard, that intended users (board-certified dermatologists, primary-care physicians, and nurses with skin/wound assessment scope) achieve clinically relevant outputs when assisted by the device on images representative of autoimmune dermatoses and genodermatoses. The inferential scope is restricted to the group level (autoimmune dermatoses combined and genodermatoses combined); per-condition results are reported descriptively only, and per-condition claims are explicitly outside the scope of this study's conclusions.

Pre-specified concurrent Pillar 2 (algorithm-level) analysis on the same case set

As a pre-specified secondary output of this study, the device's diagnostic API is invoked on each anonymized clinical image before the image is presented to any reader. The device's ICD-11 probability distribution is captured per image and stored against the atlas-labelled ground-truth ICD-11 code. This yields per-ICD-11-category device-alone sensitivity, specificity, positive and negative predictive value, and Top-1 / Top-3 / Top-5 accuracy on the MRMC case set, computed independently of reader performance and reported as Pillar 2 algorithm-verification evidence under MDCG 2020-1.

Because the Rank 11 reader analysis and the Pillar 2 device-alone analysis are computed on the same case set, the triangulation of the Pillar 2 and Rank 11 evidence for the autoimmune and genodermatoses sub-indications is internally consistent: the reader study measures the incremental clinical benefit of the device on an image set whose device-alone performance has been characterised per category on the same images.
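The device-alone metrics described above can be illustrated with a minimal sketch. It is not the sponsor's analysis code. The record shape (ground-truth code plus a code-to-probability mapping), the function names, and the use of the Top-1 prediction as the positive call for sensitivity and specificity are all assumptions made for illustration; the placeholder labels "A", "B", "C" stand in for ICD-11 codes.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: (ground-truth code, {code: device probability}).
Case = Tuple[str, Dict[str, float]]


def top_k_codes(probs: Dict[str, float], k: int) -> List[str]:
    """Return the k most probable codes, highest probability first."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return [code for code, _ in ranked[:k]]


def top_k_accuracy(cases: List[Case], k: int) -> float:
    """Share of cases whose ground-truth code is among the device's top k."""
    hits = sum(1 for truth, probs in cases if truth in top_k_codes(probs, k))
    return hits / len(cases)


def one_vs_rest_metrics(cases: List[Case], category: str) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV for one category.

    Assumption: the device "calls" a category when it is the Top-1 output.
    """
    tp = fp = tn = fn = 0
    for truth, probs in cases:
        predicted = top_k_codes(probs, 1)[0] == category
        actual = truth == category
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

Because each image is scored before any reader sees it, a function of this kind needs only the stored probability distribution and the atlas label, which is what keeps the device-alone output independent of reader behaviour.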

Traceability of the group-level scope restriction into the Clinical Evaluation Plan and Clinical Evaluation Report

The group-level inferential restriction declared above is a binding claim boundary on the downstream clinical evaluation. This is propagated into the surrounding clinical-evaluation documentation as follows:

The Clinical Evaluation Plan (R-TF-015-001) restricts per-condition performance claims for autoimmune and genodermatoses sub-indications to post-market scope only; this study does not support per-condition claims.
The Clinical Evaluation Report (R-TF-015-003), in its §Representativeness and §Sufficiency determination, cites this study at group level exclusively and does not generalise per-condition descriptive results into claims.
The PMCF Plan (R-TF-007-002) pre-specifies the post-market activities that address residual per-condition uncertainty with pre-specified enrolment targets and diagnostic-accuracy thresholds.
The triangulated evidence base for these sub-indications — Pillar 1 state-of-the-art literature, Pillar 2 per-category algorithm verification, Rank 7 legacy post-market surveillance, and this Rank 11 MRMC — is summarised in the Clinical Evaluation Report as a single triangulated-evidence section, with each pillar's contribution tabled transparently so that the reader can verify scope alignment end-to-end.

Risks and benefits of the product in investigation and clinical research

In this study, there is no patient recruitment or active patient involvement. The images used are completely anonymized clinical images sourced from public dermatological atlases and freely available public sources. These images are not derived from identifiable patients, and their anonymization makes patient recognition impossible. Using the device could optimize diagnostic accuracy in clinical practice, potentially save consultation time and costs, and support better clinical decision-making for patients with rare autoimmune or genetic dermatological conditions — precisely the patient subgroups most affected by delayed recognition. The participating HCPs will sign a contract with the sponsor to regulate their participation in the study.

Hypothesis

The information provided by the device increases the atlas-referenced diagnostic accuracy of healthcare professionals (HCPs) in the diagnosis of autoimmune dermatoses and genodermatoses, as measured by reader-averaged Top-1 and Top-3 diagnostic accuracy against an atlas-labelled reference standard.

Objectives

Primary objective

To validate that the information provided by the device increases the reader-averaged atlas-referenced Top-1 diagnostic accuracy of healthcare professionals on anonymized images representative of autoimmune dermatoses and genodermatoses, when compared with unassisted diagnosis on the same cases.

Secondary objectives

To validate that the information provided by the device increases the reader-averaged Top-3 diagnostic accuracy on the same case set.
To assess the appropriateness of the referral decisions made by HCPs with and without the information provided by the device.
To confirm that the incremental diagnostic benefit observed at the group level is consistent across the intended user groups represented in the reader cohort (dermatology, primary care, nursing).

Exploratory objectives

Per-condition descriptive diagnostic accuracy (no inferential claim).
Reader-to-reader concordance (Cohen's kappa) in unassisted and assisted conditions.
Device-output specificity on distractor conditions (rate at which the device appropriately does not surface an autoimmune or genodermatosis diagnosis when the true diagnosis is a clinical mimic from another category).
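As an illustration of the reader-to-reader concordance objective, the sketch below computes Cohen's kappa for two readers' diagnoses on the same cases. Cohen's kappa is defined for a pair of raters; averaging it over all reader pairs, as `mean_pairwise_kappa` does, is an assumption of this sketch and is not stated in this plan. Function names and the input shape are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Sequence


def cohens_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa for two readers' labels on the same ordered cases."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both readers must label the same non-empty case list")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each reader's marginal label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(readers: Dict[str, Sequence[str]]) -> float:
    """Assumed summary: unweighted mean of kappa over all reader pairs."""
    pairs = list(combinations(readers.values(), 2))
    return sum(cohens_kappa(a, b) for a, b in pairs) / len(pairs)
```

Running the same function on the Stage 1 labels and then on the Stage 2 labels would give the unassisted and assisted concordance values side by side.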

Concurrent Pillar 2 (device-alone algorithm-verification) objective

To characterise, on the same case set evaluated by the readers, the device's per-ICD-11-category and per-group diagnostic performance (sensitivity, specificity, positive and negative predictive value, Top-1 / Top-3 / Top-5 accuracy) against the atlas-labelled reference standard, independently of reader behaviour. This output contributes to the MDCG 2020-1 Pillar 2 technical-performance section of the Clinical Evaluation Report for the autoimmune and genodermatoses sub-indications.

Summary of the study

This is a prospective, observational multi-reader, multi-case (MRMC) self-controlled study. It is designed to assess whether the use of the device by healthcare professionals increases the accuracy in the diagnosis of autoimmune dermatoses and genodermatoses. A minimum of 9 healthcare professionals representative of the device's declared intended user groups — dermatologists, primary-care physicians, and nurses with clinical responsibility for skin assessment — will be presented with a standardised image set of 60 to 100 anonymized clinical cases covering autoimmune dermatoses, genodermatoses, and pre-specified distractor conditions. Data collection will include diagnostic accuracy at Top-1 and Top-3, referral-appropriateness, and reader-specialty breakdowns. The study adheres to strict ethical guidelines, ensuring data confidentiality and compliance with international standards.

Design and methods

Type of clinical research

This is a prospective, observational, multi-reader, multi-case (MRMC) self-controlled study to evaluate whether the use of the device by healthcare professionals helps to increase the accuracy in the diagnosis of autoimmune dermatoses and genodermatoses. The study uses a progressive information disclosure design: for each clinical case, the reader completes a sequence of three assessment stages that reveal progressively more device output. This within-case, within-reader comparison isolates the device's incremental contribution to diagnostic accuracy.

The three stages per case are:

Unassisted diagnosis (Stage 1): The reader views the clinical image and patient anamnesis, and provides their primary diagnosis without any device output.
Assisted diagnosis (Stage 2): The device's differential diagnosis (ICD-11 probability distribution) is additionally displayed. The reader provides their revised diagnosis.
Referral assessment (Stage 3): The device's referral recommendation and diagnostic entropy are additionally displayed. The reader decides whether to refer the patient and records the clinical urgency.

The stages are completed sequentially for each case before the reader proceeds to the next case. The order of case presentation is independently randomised for each reader to prevent order effects. The self-controlled comparison between Stage 1 (unassisted) and Stage 2 (assisted) for the same reader on the same case constitutes the primary paired observation.
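One way to obtain an independent, reproducible case order per reader is sketched below. This is an illustration only: the seeding scheme (a study-level seed combined with the reader identifier), the function name, and the identifiers are assumptions, and the platform's actual randomisation mechanism is not described in this plan.

```python
import random
from typing import List


def case_order_for_reader(case_ids: List[str], reader_id: str, study_seed: int) -> List[str]:
    """Return a shuffled copy of the case list that is specific to one reader.

    Seeding with the study seed plus the reader identifier (an assumed scheme)
    makes each reader's order reproducible for audit while keeping the orders
    of different readers independent of one another.
    """
    rng = random.Random(f"{study_seed}:{reader_id}")
    order = list(case_ids)  # copy so the master case list is never mutated
    rng.shuffle(order)
    return order
```

Each reader would then work through the three stages for the first case in their own order before moving to the second, so the Stage 1 and Stage 2 responses for a given case are always adjacent in time.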

This sequential per-case design mirrors the intended clinical workflow, where a healthcare professional first forms an initial diagnostic impression and then consults the device as a decision-support tool. Any carry-over effect from the unassisted diagnosis to the assisted diagnosis is conservative: readers anchored to their Stage 1 diagnosis are less likely to change it in Stage 2, thereby underestimating rather than overestimating the device's incremental benefit. The methodological framework has been applied consistently in four prior MRMC studies (SAN_2024, BI_2024, PH_2024, MAN_2025) whose results are reported in the Clinical Evaluation Report under Rank 11 supporting evidence.

Reference standard (ground truth)

The reference standard for diagnostic accuracy is the published atlas diagnosis — i.e., the diagnosis assigned to each clinical image by the originating public dermatological atlas from which the image was sourced. Each image in the dataset was sourced from a peer-reviewed or institutional dermatological atlas where cases are labelled by expert dermatologists, typically based on clinical-pathological correlation. For the rare autoimmune and genodermatoses categories, atlas entries are commonly histopathologically or genetically confirmed as a condition of atlas inclusion; where such confirmation is recorded in the source atlas metadata, this is preserved in the study case record.

The ground-truth diagnosis is encoded as an ICD-11 code for each case and was established prior to and independently of this study. The reference standard is not modified or influenced by the device output or by the participating readers' assessments.

Limitations of the reference standard: For autoimmune dermatoses and genodermatoses, the clinical standard of care for definitive diagnosis is histopathological or genetic confirmation; atlas-only labels are therefore a weaker reference standard than the clinical gold standard for these categories. The impact of this limitation on accuracy calculations is mitigated by four pre-specified design and analytic controls:

Self-controlled design: both unassisted and assisted conditions are evaluated against the same reference standard, so any reference-standard error affects both arms equally. This mitigation is valid for the difference in accuracy (Δ Top-1, Δ Top-3) — the primary and secondary endpoints of this study.
Absolute accuracy is descriptive-only: absolute Top-1 and Top-3 accuracy values are reported descriptively only and are not used for comparison against state-of-the-art performance benchmarks. The state-of-the-art comparison in the clinical-evaluation narrative is carried by Pillar 1 literature (where the published accuracy figures are histopathology- or genetics-confirmed), not by this study's absolute values.
Pre-specified histopath/genetic-confirmation sensitivity analysis: each atlas entry is flagged in the study database at case ingestion with its confirmation status (histopath-confirmed / genetics-confirmed / clinical-diagnosis-only / unknown). The primary endpoint (Δ Top-1) is reported on the full case set and separately on the confirmed-subset. If the two estimates diverge materially (absolute difference > 5 percentage points, or a change in statistical significance at α = 0.05), the primary claim degrades to descriptive and the CIR reports the finding as inconclusive, triggering the triangulated-evidence contingency (see "Pre-specified floor-sample contingency").
Group-level inference scope: per-case reference-standard errors are diluted across a category group of 25–35 cases, and per-condition claims are explicitly out of scope (see "Evidence-hierarchy positioning").
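The divergence rule of the confirmation-status sensitivity analysis can be written as a small decision function. The sketch below encodes only the two criteria stated above (absolute difference greater than 5 percentage points, or a change in statistical significance at α = 0.05); the function and parameter names are hypothetical, and how the two Δ Top-1 estimates and p-values are produced is outside its scope.

```python
def estimates_diverge(
    delta_full_pp: float,
    p_full: float,
    delta_confirmed_pp: float,
    p_confirmed: float,
    alpha: float = 0.05,
    max_gap_pp: float = 5.0,
) -> bool:
    """True if the full-set and confirmed-subset estimates diverge materially.

    Deltas are expressed in percentage points. Divergence means either an
    absolute gap strictly greater than max_gap_pp, or one estimate being
    significant at alpha while the other is not.
    """
    gap_exceeded = abs(delta_full_pp - delta_confirmed_pp) > max_gap_pp
    significance_flipped = (p_full < alpha) != (p_confirmed < alpha)
    return gap_exceeded or significance_flipped
```

For example, a full-set estimate of 10 points (p = 0.01) and a confirmed-subset estimate of 9 points (p = 0.20) would count as divergent under this rule because significance changes, even though the point estimates are close.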

Population

In this study, the population will consist of healthcare professionals (HCPs) representing the device's declared intended user groups. Per the IFU (Intended Purpose, §Intended user), the device is intended for use by healthcare providers to aid in the assessment of skin structures. The eligible reader categories are accordingly:

Board-certified dermatologists (specialist in dermatology), or dermatology trainees (residents) in an accredited residency programme.
Board-certified primary-care physicians (specialist in family and community medicine / general practitioners), or primary-care trainees (residents) in an accredited residency programme.
Nurses with clinical responsibility for skin or wound assessment, with documented professional experience in dermatology, primary care, wound care, or a related clinical area.

This reader population reflects the full range of intended users of the device as declared in the IFU and is consistent with the multi-specialty reader cohort of the preceding MRMC study MAN_2025, which included all three specialty groups. A minimum of 9 HCPs will participate in the study, with a target of 11–13. The floor enforces a minimum of 3 readers per specialty to preserve per-specialty inferential validity for the third secondary objective (consistency of the incremental benefit across intended user groups).

Sample size

This study aims to evaluate whether the use of the device improves reader-averaged Top-1 diagnostic accuracy by at least 5 percentage points (floor) — with a target effect of 10 percentage points — among healthcare professionals diagnosing autoimmune dermatoses and genodermatoses. The minimum-clinically-meaningful delta of 5 percentage points is justified by the low baseline accuracy expected for non-specialist clinicians on rare dermatological categories, where any reproducible incremental benefit has clinical value.

Pre-specified sample design

Parameter	Floor	Target	Rationale
Total readers	9	11–13	Exceeds Obuchowski/Hillis minimum of 5; multi-specialty representation with per-specialty inferential floor
Dermatologists	3	4	Rare-condition expertise anchor
Primary-care physicians	3	4–5	Primary-care representativeness (first-line users)
Nurses	3	3–4	Skin/wound-assessment scope per IFU intended-user declaration; ≥ 3 readers required for per-specialty inference
Total cases	60	80–100	Per-group power at group-level inference scope
Autoimmune cases	25	35	≥ 8 conditions × 3–4 images per condition
Genodermatoses cases	25	35	≥ 6 conditions × 3–4 images per condition
Distractor cases	10	15–30	Specificity test (clinical mimics; see distractor composition)
Images per reader	60	80–100	Same set for all readers (within-reader cross-over)

Justification of the per-group floor

The 25-image floor per category group (autoimmune and genodermatoses) is below the generic 30-image floor recommended for group-level reader-study inference. This deviation is pre-specified and justified by the structural scarcity of atlas-quality images for ultra-low-prevalence rare conditions — particularly xeroderma pigmentosum and some epidermolysis bullosa sub-types, where publicly available atlas coverage is inherently limited. The inferential scope of this study is restricted to group level (autoimmune combined vs. genodermatoses combined vs. distractors), consistent with the triangulated-evidence framing in which this MRMC is supporting rather than load-bearing evidence. Per-condition results are reported descriptively and are explicitly outside the scope of any statistical claim.

Power calculation

The sample size was determined using a two-sided McNemar's test for paired binary outcomes (correct/incorrect diagnosis, with and without the device), with the following parameters:

Baseline accuracy (unassisted, group level): 40%. This is a conservative estimate based on published literature on non-specialist diagnostic accuracy for rare dermatological categories, reflecting the expected performance floor for autoimmune and genodermatoses presentations.
Expected accuracy (assisted): 50%. This represents the target clinically meaningful improvement of 10 percentage points at the group level.
Minimum clinically meaningful delta: 5 percentage points (primary-endpoint floor).
Significance level (alpha): 0.05, two-sided.
Statistical power (1 − beta): 0.80.
Discordant proportion: Under the assumption that 30% of paired observations are discordant (higher than for higher-prevalence categories, reflecting greater reader-to-reader variability on rare conditions), approximately 175 paired observations are required per McNemar's test (Lachin, 1992).
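For readers less familiar with the paired test underlying these parameters, the sketch below computes a two-sided exact McNemar p-value from the two discordant counts (cases correct only with the device, and cases correct only without it). It illustrates the test itself; it is not the sample-size derivation cited above, and the function and argument names are hypothetical.

```python
from math import comb


def mcnemar_exact_p(gained: int, lost: int) -> float:
    """Two-sided exact McNemar p-value.

    gained: pairs incorrect unassisted but correct assisted.
    lost:   pairs correct unassisted but incorrect assisted.
    Concordant pairs carry no information for this test and are ignored.
    Under the null hypothesis each discordant pair is equally likely to fall
    in either direction, so the smaller count follows Binomial(n, 0.5).
    """
    n = gained + lost
    if n == 0:
        return 1.0
    k = min(gained, lost)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Only the discordant pairs drive the result, which is why the assumed discordant proportion is a key input to the power calculation.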

At the target sample (11 readers × 85 cases = 935 paired observations; or 13 readers × 100 cases = 1,300 paired observations), the study is well powered for the target 10-percentage-point effect even after accounting for within-reader correlation inherent in the MRMC design. Adjusting for intra-reader correlation (ICC ≈ 0.15, estimated from the preceding MRMC studies), the effective sample size exceeds the minimum of 175 required.

At the floor sample (9 readers × 60 cases = 540 paired observations), the study remains adequately powered to detect the target 10-percentage-point effect but is under-powered to detect the 5-percentage-point floor effect. This is reported transparently: the primary endpoint is declared "Met" only if the observed effect reaches the 5-percentage-point minimum with p < 0.05; a non-significant result at the floor sample is reported as Inconclusive rather than Negative, and triggers the pre-specified triangulation contingency below.

Pre-specified floor-sample contingency (triangulation analytic content)

If the primary endpoint returns Inconclusive at the floor sample, the CIR reports — transparently and side-by-side — the following pre-specified triangulation table, with each pillar's contribution and limitations separately tabled. This is a narrative synthesis under the MDCG 2020-1 three-pillar framing, not a formal meta-analysis:

Pillar 1 — State-of-the-art literature (Valid Clinical Association): the per-category-group SotA accuracy values for image-based diagnostic recognition of autoimmune dermatoses and genodermatoses, appended to R-TF-015-011 State of the Art as the pre-specified per-group literature summary. Minimum ≥ 10 peer-reviewed references per category group, CRIT1-7 appraised.
Pillar 2 — Per-category algorithm verification (Technical Performance): per-ICD-11-category sensitivity, specificity, PPV, NPV, and Top-1 / Top-3 / Top-5 accuracy produced by this study itself as a pre-specified concurrent analysis (device-alone outputs on the same atlas image set the readers evaluate; see the "Device-alone per-category analysis" subsection of the statistical analysis plan). Tabled per autoimmune and genodermatoses group and cross-referenced to R-TF-028 verification and validation records; strengthened by and consistent with the manufacturer's aggregate algorithm-verification V&V on broader datasets.
Rank 7 — Legacy post-market surveillance (Article 61(5)–(6) equivalence): case volumes, complaint rates, and event-rate cuts for autoimmune and genodermatoses presentations within the legacy post-market surveillance corpus, with rule-of-three upper bounds applied where zero events are observed; reported from R-TF-007-003 Legacy PMS Report.
This Rank 11 MRMC: the inconclusive Δ Top-1 point estimate with its 95% confidence interval, the Δ Top-3 secondary endpoint, and the descriptive per-condition and per-specialty breakdowns.

The combination rule is descriptive narrative synthesis: each pillar's evidence is presented with its rank, its specific contribution to the category-group claim, and its limitations. The CIR does not combine these into a single summary statistic; instead it demonstrates that the category-group claim is independently supported by at least two of the four pillars, which is the MDCG 2020-6 §6.3 sufficiency bar for supporting sub-indications where prospective real-patient Pillar 3 evidence is not generated pre-market. If fewer than two pillars independently support the claim, the CIR reports this finding and triggers a CEP update restricting the corresponding claim to post-market scope.

For the MRMC-specific analysis (Obuchowski-Rockette method), the minimum of 9 readers exceeds the recommendations of Hillis (2011) for detecting reader-averaged differences in diagnostic accuracy, given the number of cases available per category group.

Selection of the reader pool

The study recruits a minimum of 9 healthcare professionals: 3 dermatologists, 3 primary-care physicians, and 3 nurses at the floor, scaling to 4 / 4–5 / 3–4 respectively at the target. This pool captures inter-observer variability, represents all three intended user groups with a per-specialty inferential floor of 3 readers, and provides sufficient statistical power for the primary group-level endpoint. At the upper end of the target range, primary-care physicians outnumber dermatologists; this reflects the clinical reality that first-line diagnostic encounters for rare dermatological conditions occur predominantly in primary care, where diagnostic-decision support has the highest incremental clinical value. The dermatology cohort anchors the specialist reference performance; the primary-care cohort represents the highest-volume real-world intended-user group; the nursing cohort represents the skin/wound-assessment practice scope declared in the IFU.

Dermatological conditions and case composition

The image set covers autoimmune dermatoses, genodermatoses, and distractor conditions drawn from clinical mimics of the target categories. The distractor set is included to evaluate device-output specificity — i.e., the rate at which the device appropriately does not surface an autoimmune or genodermatosis diagnosis when the true diagnosis is a clinical mimic from another category.

Autoimmune dermatoses (target: ≥ 8 conditions)

Cutaneous lupus erythematosus (ICD-11 4A40.0Z)
Dermatomyositis (4A41.2)
Pemphigus vulgaris (EB40.0)
Bullous pemphigoid (EB41.0)
Lichen planus (EA91)
Morphea / localised scleroderma (EB60)
Cutaneous small-vessel vasculitis (4A44.A)
Dermatitis herpetiformis (EB44)

Genodermatoses (target: ≥ 6 conditions)

The six conditions below are selected to span the phenotypic range of genodermatoses presentations represented in the device's classifier output space. Each condition maps to one or more ICD-11 codes that are part of the device's validated output distribution.

Ichthyosis vulgaris and other non-syndromic ichthyoses (EC20.0Y, EC20.Y)
Epidermolysis bullosa (EC3Z, and dystrophic EB: EC32)
Neurofibromatosis type 1, cutaneous manifestations (LD2D.1Z)
Hereditary acantholytic dermatoses (Darier disease, Hailey-Hailey disease) (EC20.2)
Palmoplantar keratodermas — diffuse, focal, and papular types (EC20.30, EC20.31, EC20.32)
Xeroderma pigmentosum (LD27.1)

Stretch condition: xeroderma pigmentosum. Atlas coverage for XP is structurally limited by its ultra-low prevalence (approximately 1 in 250,000 to 1 in 1,000,000). A minimum of 1 XP image and a target of 2 XP images are pre-specified. If no XP image passes quality control, XP is dropped from the target condition list without protocol amendment, provided the per-group image floor (25 images per category group) is still achieved via the remaining five conditions.

Out-of-scope genodermatoses. Tuberous sclerosis complex (TSC) and Gorlin syndrome (naevoid basal-cell carcinoma syndrome) are part of the clinical differential diagnosis for genodermatoses but are not represented in the device's current ICD-11 classifier output space. Including these conditions as target cases would systematically disadvantage the device's measured accuracy because the device cannot output the correct diagnosis for codes outside its class list. These conditions are therefore excluded from the target set of this study. Device performance on TSC and Gorlin syndrome presentations is addressed separately in the post-market clinical follow-up plan.

Distractor conditions (pre-specified clinical mimics)

Distractors are selected to test the specificity of both reader and device outputs against common clinical mimics of the target categories. The pre-specified distractor set covers:

Plaque psoriasis (EA90.0) — mimics lichen planus, cutaneous lupus
Atopic dermatitis, severe (EA80) — mimics ichthyosis presentations
Seborrhoeic dermatitis (EA81) — mimics cutaneous lupus
Rosacea (ED90.0) — mimics dermatomyositis malar involvement
Tinea corporis (1F28) — mimics morphea, lichen planus
Common melanocytic naevi (2F20.Z) — mimics NF1 café-au-lait macules
Seborrhoeic keratosis (2F21.0) — mimics hyperkeratotic genodermatoses
Impetigo (1B72) — mimics pemphigus/bullous pemphigoid presentations

The exact count per condition will be documented in the study database at lock and reported in the CIR. No autoimmune or genodermatosis condition will contribute more than 4 images, in order to preserve per-condition homogeneity for descriptive reporting.

Duration

The total duration of the study is estimated at 4 months. This includes the time needed, once the participating HCPs have been recruited and the corresponding images collected, to close and clean the database, analyse the data, and prepare the final study report.

Acceptance criteria

The acceptance criteria below are pre-specified before data collection begins. The primary endpoint is assessed first; secondary endpoints are assessed only if the primary endpoint is met, in hierarchical order.

The following acceptance criteria are also pre-specified in protocol prose to ensure they are auditable independently of the performance-claims cross-reference:

Priority	Endpoint	Pre-specified threshold (Met)	Inferential scope
Primary	Δ Top-1 accuracy (assisted − unassisted)	≥ 5 percentage points (floor); target ≥ 10 pp; p < 0.05, two-sided, OR method	Group level (autoimmune + genodermatoses combined)
Secondary 1	Δ Top-3 accuracy (assisted − unassisted)	≥ 10 percentage points; p < 0.05, two-sided, OR method	Group level
Secondary 2	Referral-appropriateness (assisted vs. unassisted)	Proportion of appropriately referred cases not inferior to unassisted (margin 5 pp)	Group level
Secondary 3	Per-specialty consistency	No specialty group shows a statistically significant deterioration (p < 0.05)	Per-specialty (derm / PCP / nursing)
Exploratory	Per-condition Top-1 accuracy	Descriptive only — no threshold	Per condition
Exploratory	Device-output specificity on distractors	Descriptive — proportion of distractor cases where the device's top-1 output is not an autoimmune or genodermatosis ICD-11 code	Distractor set

An endpoint is declared Met only if the pre-specified threshold is achieved. A non-significant result at the floor sample is reported as Inconclusive, not Negative, and triggers supplementary analyses within the triangulated clinical-evaluation evidence base.
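The decision rule above can be sketched in code. This is a minimal illustration, not the study's analysis pipeline: the function names are hypothetical, the "Not met" label for outcomes the text does not name is an assumption, and so is the fixed-sequence reading of "hierarchical order" (assessment stops at the first secondary endpoint that is not met).

```python
from typing import List, Tuple


def primary_status(delta_pp: float, p_value: float, at_floor_sample: bool,
                   floor_pp: float = 5.0, alpha: float = 0.05) -> str:
    """Classify the primary endpoint from the observed effect and p-value."""
    if delta_pp >= floor_pp and p_value < alpha:
        return "Met"
    if at_floor_sample and p_value >= alpha:
        # Non-significant at the floor sample: Inconclusive, not Negative.
        return "Inconclusive"
    # Assumption: the CIP does not name this outcome; the label is illustrative.
    return "Not met"


def endpoints_to_assess(primary: str,
                        secondaries: List[Tuple[str, bool]]) -> List[str]:
    """Return the secondary endpoints that get assessed, in order.

    Assumption: fixed-sequence gatekeeping, i.e. nothing is assessed unless
    the primary endpoint is Met, and assessment stops after the first
    secondary endpoint that is not met.
    """
    assessed: List[str] = []
    if primary != "Met":
        return assessed
    for name, met in secondaries:
        assessed.append(name)
        if not met:
            break
    return assessed
```

Under these assumptions, an observed effect of 4 percentage points with p = 0.20 at the floor sample is classified as Inconclusive, and no secondary endpoint is assessed.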

Inclusion criteria

Reader (HCP) inclusion criteria. To be eligible, a reader must satisfy all of the following:

Belong to one of the device's declared intended user groups, namely: (a) board-certified dermatologists or dermatology residents; (b) board-certified primary-care / family and community medicine physicians or primary-care residents; or (c) nurses with clinical responsibility for skin or wound assessment.
Have a clinical scope of practice that routinely includes the assessment of skin conditions.
Complete the study onboarding flow: professional profile, CV, evidence of board certification or current training, conflict-of-interest declaration, signed participation agreement, and data-protection acknowledgement.

Image inclusion criteria:

High-quality anonymized images of autoimmune dermatoses, genodermatoses, or pre-specified distractor conditions sourced from public dermatological atlases.
A published atlas diagnosis encoded as an ICD-11 code.

Exclusion criteria

Reader (HCP) exclusion criteria. A reader is screened out of the study — and any data collected from them is excluded from all analyses — if any of the following applies:

The reader's current specialty or scope of practice does not routinely include the clinical assessment of skin conditions. Examples of excluded specialties include anatomical pathology, clinical neurophysiology, radiology, laboratory medicine, and purely non-clinical administrative roles. Dermatopathology is eligible only where the practitioner documents active involvement in clinical dermatological assessment, not solely histological review.
A conflict of interest is declared that cannot be adequately managed under the conflict-of-interest handling procedure.
The onboarding flow (professional profile, CV, certification/training evidence, signed agreement, data-protection acknowledgement) is not completed.
The reader's assessments are substantially incomplete at data lock (fewer than 95% of cases completed per assessment stage).
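The completeness criterion in the last item can be checked mechanically at data lock. The sketch below is illustrative only: the function name and stage labels are hypothetical, and it assumes that a reader is retained only if every assessment stage individually reaches the 95% threshold.

```python
from typing import Mapping


def reader_is_complete(completed_by_stage: Mapping[str, int], n_cases: int,
                       threshold_pct: int = 95) -> bool:
    """True if every stage has at least threshold_pct % of cases completed.

    Integer arithmetic is used so that the comparison at the boundary is
    not affected by floating-point rounding.
    """
    if n_cases <= 0:
        raise ValueError("n_cases must be positive")
    return all(done * 100 >= threshold_pct * n_cases
               for done in completed_by_stage.values())
```

For example, with 85 cases a reader who completed 81 cases in a stage (95.3%) passes that stage, whereas 80 cases (94.1%) does not.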

Image exclusion criteria:

Low-quality images which cannot be properly analyzed.
Images where the skin-condition morphology has been altered beyond clinical recognition.
Images without a published atlas ICD-11 ground-truth label.

Variables

Main variable

The main variable of this study is reader diagnostic performance (accuracy, sensitivity, and specificity) in autoimmune dermatoses and genodermatoses, measured against the atlas-labelled reference standard diagnosis (ICD-11 code) and calculated both with and without the use of the device.

Top-1 diagnostic accuracy: Proportion of cases where the reader's primary diagnosis matches the reference standard.
Top-3 diagnostic accuracy: Proportion of cases where the reference standard diagnosis appears among the reader's top 3 differential diagnoses.
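The two definitions above reduce to a single Top-k computation. The following is a minimal sketch, assuming each reader response is stored as a ranked list of ICD-11 codes (most likely diagnosis first); the function name and data layout are illustrative and not taken from the study platform.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]],
                   reference: Sequence[str], k: int) -> float:
    """Proportion of cases whose reference code is among the first k answers."""
    if len(ranked) != len(reference):
        raise ValueError("ranked and reference must have the same length")
    if not reference:
        raise ValueError("at least one case is required")
    hits = sum(1 for answers, ref in zip(ranked, reference) if ref in answers[:k])
    return hits / len(reference)


# Three illustrative cases: lichen planus, pemphigus vulgaris, ichthyosis.
responses = [["EA91", "EA90.0"], ["EB41.0", "EB40.0", "1B72"], ["EA80"]]
truth = ["EA91", "EB40.0", "EC20.0Y"]
top1 = top_k_accuracy(responses, truth, k=1)  # 1 of 3 cases correct
top3 = top_k_accuracy(responses, truth, k=3)  # 2 of 3 cases correct
```

A case counts towards Top-3 accuracy even if the reader listed fewer than three differentials, as long as the reference code is among them.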

Secondary variables

Referral-appropriateness: proportion of cases where the reader's referral decision (refer / do not refer, with urgency) matches the clinically appropriate decision inferred from the reference standard.
Per-specialty consistency: reader-averaged Top-1 and Top-3 accuracy, stratified by specialty (dermatology / primary care / nursing).
Device-output specificity on distractor cases: proportion of distractor cases where the device's top-1 output is not an autoimmune or genodermatosis ICD-11 code.
Device-alone per-ICD-11-category diagnostic performance (Pillar 2 output): for each ICD-11 code represented in the case set, the sensitivity, specificity, positive predictive value, negative predictive value, and Top-1 / Top-3 / Top-5 accuracy of the device's diagnosis-support output against the atlas-labelled reference standard. Computed per autoimmune code and per genodermatosis code on the MRMC case set, reported in the CIR as the Pillar 2 algorithm-verification section and cross-referenced in the Clinical Evaluation Report as per-category device performance on these sub-indications.
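The device-alone metrics in the last two items can be illustrated with a one-vs-rest tabulation per ICD-11 code. This is a sketch under stated assumptions: the function names are hypothetical, only the device's top-1 output is used for sensitivity, specificity, PPV, and NPV, and a metric with a zero denominator is returned as None rather than imputed.

```python
from typing import Dict, Optional, Sequence, Set


def _ratio(num: int, den: int) -> Optional[float]:
    return num / den if den else None


def one_vs_rest_metrics(top1: Sequence[str], reference: Sequence[str],
                        code: str) -> Dict[str, Optional[float]]:
    """Sensitivity, specificity, PPV and NPV for one code against all others."""
    tp = fp = fn = tn = 0
    for pred, ref in zip(top1, reference):
        if pred == code and ref == code:
            tp += 1
        elif pred == code:
            fp += 1
        elif ref == code:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def distractor_specificity(top1_on_distractors: Sequence[str],
                           target_codes: Set[str]) -> float:
    """Share of distractor cases whose top-1 output is not a target code."""
    if not top1_on_distractors:
        raise ValueError("at least one distractor case is required")
    clean = sum(1 for pred in top1_on_distractors if pred not in target_codes)
    return clean / len(top1_on_distractors)
```

With only 3–4 images per condition, per-code values computed this way are coarse, which is consistent with their descriptive status in this plan.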

Condition of interest

Autoimmune dermatoses and genodermatoses as depicted in anonymized clinical images sourced from public dermatological atlases, supplemented by pre-specified distractor conditions drawn from common clinical mimics.

Limitations of clinical research

The main limitations of this study include several factors that may influence the perception and effectiveness of the device. Firstly, the acceptance and trust of healthcare professionals in emerging AI-based diagnostic-decision-support technologies can vary significantly. The device's effectiveness may be compromised if users are not fully convinced of its accuracy or usefulness, thereby affecting the overall perception of its performance.

Additionally, image quality is crucial for the device's performance. Issues such as low-quality photographs, errors in cropping lesions, or variations in lighting and focus can deteriorate the quality of the data received by the system, which may negatively influence the evaluation of its effectiveness.

Variability in image conditions is an important aspect to consider. Differences in lighting, colour, shape, size, and focus of the images, along with the number of images available for each condition, can affect the accuracy of the results. In particular, the structural scarcity of publicly available atlas images for rare conditions limits sample sizes to 3–4 images per condition. This is a known constraint of rare-disease reader-study research and is the primary reason the inferential scope of this study is restricted to group level rather than per-condition level.

Furthermore, the consistency of readers in using the device is crucial. Variations in how diligently readers engage with the device's output can impact the study's findings.

Another limitation is the Hawthorne effect, where study participants may change their behaviour simply because they know they are being observed. This awareness can influence their decisions and actions within the study, potentially skewing the results and not accurately reflecting how the device would be used in a non-study environment.

Finally, this study is limited to autoimmune dermatoses and genodermatoses presentations and their clinical mimics. The results are intended to contribute as supporting evidence within a triangulated evidence base for these sub-indications, and should be interpreted in the context of the broader clinical-evaluation evidence base that covers all dermatological indications of the device.

Ethical considerations

This study does not require Ethics Committee approval. No patients are involved — all images are fully anonymized and sourced from public dermatological atlases containing no identifiable patient information. The participating healthcare professionals are not research subjects; they are professional evaluators providing diagnostic opinions in a controlled setting. The study is observational and non-interventional: no clinical decisions, treatments, or patient care pathways are influenced by this study.

This exemption is consistent with the regulatory classification described in "Regulatory classification of this study" and with the approach accepted for the preceding MRMC studies (SAN_2024, BI_2024, PH_2024, MAN_2025), which used the same study design and were conducted without Ethics Committee approval under the same justification.

Data confidentiality

All study data are collected and processed in compliance with Regulation (EU) 2016/679 (GDPR) and Spanish Organic Law 3/2018. The study images are fully anonymized and contain no patient-identifiable information. Each participating practitioner is assigned a unique study identification code; the list linking identification codes to practitioner identities is maintained by the Principal Investigator in a secure area with restricted access. No personal data are included in the analysis dataset or in any study report.

Bias minimization measures

In clinical research, minimizing bias is essential to ensure the validity and reliability of the study's results. The following measures are implemented:

Self-controlled design: Each reader serves as their own control for every case, eliminating between-reader variability as a confounding factor. The paired comparison (unassisted vs. assisted) for the same reader on the same case isolates the device's incremental contribution.
Progressive disclosure: The sequential presentation of device information within each case ensures that the unassisted diagnosis (Stage 1) is recorded before the reader sees any device output, preventing information leakage from the assisted condition to the unassisted condition.
Conservative carry-over: Any recall of the Stage 1 diagnosis during Stage 2 biases conservatively, as readers anchored to their initial diagnosis are less likely to adopt the device's suggestion, underestimating rather than overestimating the device's benefit.
Randomised case order: The order of case presentation is independently randomised for each reader, preventing order effects and ensuring that fatigue or learning effects do not systematically bias results for specific cases.
Standardized protocol: All participants follow the same procedures for conducting the study and recording outcomes, reducing variability due to differences in how the device is used.
Prospective data collection: Collecting data prospectively reduces the chance that participants or investigators will inaccurately recall past events.
Pre-defined endpoints: Before the study begins, primary and secondary endpoints and their acceptance thresholds are defined and documented in this CIP, preventing selective reporting of only favourable outcomes.
Pre-specified inferential scope: The scope of inferential claims is fixed at group level before data collection begins; per-condition results are reported descriptively only.
Distractor specificity test: The inclusion of pre-specified distractors prevents a rare-condition MRMC from collapsing into a recognition-only design where the device is rewarded for correctly surfacing the rare diagnosis regardless of the clinical presentation.
Reference-standard blinding: Readers are blinded to the atlas ICD-11 reference label throughout all three assessment stages. The reference label is used only by the analysis pipeline for accuracy computation and is never displayed on the reader interface.
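The per-reader randomisation of case order described in the list above could be implemented reproducibly as sketched below. The seeding scheme (study seed combined with the reader code) and the identifiers are assumptions for illustration; this plan does not specify how the platform generates the order.

```python
import random
from typing import List, Sequence


def case_order_for_reader(case_ids: Sequence[str], reader_code: str,
                          study_seed: str) -> List[str]:
    """Return an independent, reproducible case order for one reader.

    A string seed makes the shuffle deterministic across runs, so the
    presentation order can be regenerated for audit purposes.
    """
    rng = random.Random(f"{study_seed}:{reader_code}")
    order = list(case_ids)
    rng.shuffle(order)
    return order


cases = [f"C{i:03d}" for i in range(1, 86)]  # 85 hypothetical case IDs
order_r01 = case_order_for_reader(cases, "R01", "hypothetical-seed")
order_r02 = case_order_for_reader(cases, "R02", "hypothetical-seed")
```

Deriving each reader's order from a fixed seed keeps the randomisation independent between readers while leaving an auditable trail.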

Calendar

The total duration of the study is estimated at 4 months. This includes the time needed, once the participating HCPs have been recruited and all images collected, to close and clean the database, analyse the data, and prepare the final study report.

Monitoring plan

The sponsor will hold a meeting with the investigative team at the beginning of the study to address any potential questions and ensure that data is being collected properly.

The clinical investigation will be monitored by a designated clinical monitor, who is independent of the investigational site and appointed by the sponsor. The purpose of monitoring is to ensure that:

The rights, safety, and well-being of the subjects are protected.
The data reported are accurate, complete, and verifiable from source documents.
The clinical investigation is conducted in compliance with the Clinical Investigation Plan (CIP), applicable regulatory requirements (including Regulation (EU) 2017/745), and ISO 14155:2020.

Monitoring will be performed through:

Remote monitoring activities: Including scheduled video or telephone meetings every 3 months with the investigators to review study progress, discuss challenges, and ensure ongoing compliance.
On-site visits: Held if deemed necessary based on data review, protocol deviations, or issues raised during remote monitoring. Because this study is conducted remotely, any such visit will take place online.
Source data verification (SDV): The sponsor has secure access to anonymised source documents (e.g., image files, clinical assessment records) via encrypted digital platforms or controlled site systems, and uses it to verify data accuracy and consistency with the Case Report Forms (CRFs).
Risk-based monitoring approach: The extent and frequency of monitoring activities will be adapted to the complexity and risk level of the investigation. Given that this study does not involve significant deviation from routine care, the monitoring will be primarily remote unless specific issues arise.

All monitoring activities will be documented in monitoring visit reports. Any protocol deviations or non-compliance identified will be documented, assessed for impact, and communicated promptly to the sponsor. Corrective actions will be tracked and followed up by the monitor or the sponsor.

Completion of the investigation

After the final closure of the study, a Clinical Investigation Report (CIR: T-015-006 Clinical Investigation Report) will be drafted, even in the event of early termination or suspension. The results obtained (whether positive, inconclusive, or negative) will be documented.

Additionally, if deemed appropriate, the results may be published in scientific journals. All the investigators who approved this clinical investigation will be acknowledged, and any funding received by the authors for the study, together with its source, will be disclosed. The anonymity of participants in the clinical investigation will be maintained at all times.

Upon completion of the study, the results may be presented at conferences and scientific meetings, subject to prior authorization by both parties. Press releases and other communications may also be issued to share the study's results. All publications and communications must be reviewed and approved by the parties involved.

Statistical analysis

Primary analysis (MRMC methodology)

The primary analysis will use the Obuchowski-Rockette (OR) method for multi-reader multi-case studies, which accounts for the crossed design where every reader evaluates every case. This method treats both readers and cases as random effects and correctly handles the correlation structure inherent in the MRMC design.

The primary endpoint — the difference in reader-averaged Top-1 diagnostic accuracy between the assisted (Stage 2) and unassisted (Stage 1) conditions, at the group level (autoimmune + genodermatoses combined) — will be estimated with 95% confidence intervals using the OR method. Statistical significance will be assessed at alpha = 0.05 (two-sided). The Hillis-Berbaum correction for small numbers of readers will be applied, given that the reader count is below 10 at the floor.
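The point estimate of the primary endpoint can be illustrated as follows. This sketch covers only the reader-averaged difference in Top-1 accuracy; it does not implement the OR variance estimation or the small-reader correction named above, and the data layout (one list of correct/incorrect flags per reader and condition) is an assumption.

```python
from typing import Dict, List


def reader_averaged_delta_pp(unassisted: Dict[str, List[bool]],
                             assisted: Dict[str, List[bool]]) -> float:
    """Mean over readers of (assisted - unassisted) Top-1 accuracy, in pp."""
    if not unassisted or unassisted.keys() != assisted.keys():
        raise ValueError("both conditions must cover the same readers")
    deltas = []
    for reader, stage1 in unassisted.items():
        stage2 = assisted[reader]
        if not stage1 or len(stage1) != len(stage2):
            raise ValueError(f"unpaired cases for reader {reader}")
        deltas.append(sum(stage2) / len(stage2) - sum(stage1) / len(stage1))
    return 100.0 * sum(deltas) / len(deltas)


stage1 = {"R01": [True, False, False, False], "R02": [True, True, False, False]}
stage2 = {"R01": [True, True, False, False], "R02": [True, True, True, False]}
delta = reader_averaged_delta_pp(stage1, stage2)  # 25 percentage points
```

The confidence interval and p-value for this difference would come from the OR analysis itself, typically run in dedicated MRMC software.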

Secondary endpoints (Δ Top-3 accuracy, referral-appropriateness, per-specialty consistency) will be evaluated using the same method, assessed in hierarchical order only if the primary endpoint is met.

Secondary analyses

Qualitative variables will be summarised using frequency distributions. Quantitative variables will be summarised using central-tendency statistics (mean, median) and variability statistics (standard deviation or interquartile range), according to their distributional characteristics.

The analysis will focus on calculating diagnostic accuracy, sensitivity, and specificity for readers, both with and without the use of the device. For each of these metrics, two types of results will be analyzed:

The percentage of variation in the metric attributable to the use of the device.
The absolute value of the metric, which will be compared against the state of the art and the values obtained by the readers during the study.

Within-reader comparisons (assisted vs. unassisted) will be conducted using McNemar's test for binary outcomes. Where appropriate, generalized estimating equation (GEE) models with an exchangeable correlation structure will be fitted to account for within-reader clustering. Inter-observer agreement will be assessed by estimating the kappa coefficient.
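For reference, McNemar's test depends only on the discordant pairs: cases a reader got wrong unassisted but right assisted, and the reverse. The sketch below uses the exact binomial form of the test with hypothetical function names; it treats the pairs as independent, so it does not account for reader clustering.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_counts(unassisted: Sequence[bool],
                      assisted: Sequence[bool]) -> Tuple[int, int]:
    """Count pairs that changed wrong->right (b) and right->wrong (c)."""
    if len(unassisted) != len(assisted):
        raise ValueError("observations must be paired")
    b = sum(1 for u, a in zip(unassisted, assisted) if not u and a)
    c = sum(1 for u, a in zip(unassisted, assisted) if u and not a)
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

For example, 10 wrong-to-right changes against 2 right-to-wrong changes gives a two-sided p-value of about 0.039.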

Device-alone per-category analysis (Pillar 2)

For each ICD-11 code present in the case set, device-alone sensitivity, specificity, PPV, NPV, Top-1 / Top-3 / Top-5 accuracy, and (where applicable) malignancy-detection AUC are computed from the pre-reader device API outputs against the atlas-labelled reference standard. Results are tabled per autoimmune code and per genodermatosis code in the CIR and are pre-specified as descriptive (per-code sample sizes are typically 3–4 images per condition). The group-level aggregates (autoimmune combined, genodermatoses combined) are reported with 95 % confidence intervals and constitute the study's per-group Pillar 2 output. Rule-of-three upper bounds are applied where an ICD-11 code has zero observed errors within the MRMC case set.
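As a reminder of what the rule-of-three bound represents: when zero events are observed in n cases, 3/n approximates the one-sided 95% upper bound on the true event rate. The sketch below shows the approximation next to the exact zero-event bound; the function names are illustrative. The approximation is usually quoted for larger n, so for the 3–4 images per code mentioned above the exact value is the more informative of the two.

```python
def rule_of_three_upper(n: int) -> float:
    """Approximate one-sided 95% upper bound when 0 events occur in n cases."""
    if n <= 0:
        raise ValueError("n must be positive")
    return min(1.0, 3.0 / n)


def exact_zero_event_upper(n: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound for 0 events in n cases.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)
```

With n = 30 the two bounds nearly coincide (0.100 versus about 0.095), whereas with n = 4 they differ substantially (0.75 versus about 0.53).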

Per-specialty breakdowns (dermatology / primary care / nursing) are performed as pre-specified secondary analyses to support the Secondary-3 acceptance criterion (no specialty shows significant deterioration).

Subgroup analyses by condition category (autoimmune vs. genodermatoses), and per-condition descriptive results, are performed as exploratory analyses.

Analyses will be performed using appropriate statistical software. P-values below 0.05 will be considered statistically significant.

Data management

All study data will be collected and managed through a centralized, secure web-based platform that serves as the electronic Case Report Form (eCRF). Each participating HCP will receive individual login credentials (username and password) to access the platform. The platform will present the standardized image set to each practitioner and capture their diagnostic assessments and clinical decisions in response to structured questions.

Data collection

Diagnostic assessments, referral decisions, and clinical-utility evaluations will be electronically captured directly in the platform when practitioners provide their responses. All data entries are time-stamped and automatically stored in a secure central database.

Access control and traceability

Access to the platform is restricted to authorized practitioners through individual user credentials. Complete access logs are maintained for each practitioner, documenting login times, data-entry activities, and timestamps of all interactions. This audit trail ensures full traceability of who accessed what data and when.

Data anonymity

Study images are completely anonymized and bear no patient identifiers. Each practitioner receives a unique study identification code that is not linked to their personal identity in the analysis dataset. The Principal Investigator maintains a confidential list linking practitioner identification codes to their identities, stored separately in a secure area with restricted access.

Data export and analysis

Once all data collection is completed and the database is closed and locked by the Principal Investigator, data are exported to CSV format for statistical analysis. The exported dataset contains only anonymized image data and practitioner responses, with no personal identifying information.
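One way to enforce that the export carries no identifying fields is to write only an explicit allow-list of columns. The sketch below is illustrative: the column names and record layout are hypothetical and are not the platform's actual schema.

```python
import csv
from typing import Iterable, Mapping, TextIO

# Hypothetical allow-list: only pseudonymous codes and responses are exported.
EXPORT_COLUMNS = ["reader_code", "case_id", "stage", "top1_icd11", "referral_decision"]


def export_rows(records: Iterable[Mapping[str, object]], out: TextIO) -> None:
    """Write records as CSV, silently dropping any field not on the allow-list."""
    writer = csv.DictWriter(out, fieldnames=EXPORT_COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
```

Any field that is not on the allow-list, such as a reader's name, is omitted from the output even if it is present in the source records.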

Data security

All data transmitted to and stored on the platform are encrypted using industry-standard security protocols. Access to the central database is restricted to authorized personnel only, and the system implements appropriate technical and organizational measures to prevent unauthorized access, alteration, or loss of data.

The study will comply with current data-protection legislation: Regulation (EU) 2016/679 of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and Spanish Organic Law 3/2018 of 5 December on the Protection of Personal Data and guarantee of digital rights.

CIP Modification

Any substantial modification to this CIP will be reviewed and approved by the Principal Investigator and the sponsor before implementation. Modifications will be documented with version number, date, and description of changes. Since this study does not require Ethics Committee or competent authority approval, modified protocols do not require resubmission to these bodies.

CIP Deviations

Any deviations from the CIP will be documented and assessed by the Principal Investigator for their impact on the integrity of the study data and the validity of the results. Deviations will be classified as minor or major and will be reported in the Clinical Investigation Report (CIR).

Start, follow-up and end reports

The Principal Investigator and all participating investigators will be notified of the start of the study.

Upon obtaining the study conclusions, a final report (T-015-006 Clinical Investigation Report (CIR)) will be prepared and submitted to the sponsor of the study.

Statements of compliance

The sponsor and the Principal Investigator commit to conducting this study in accordance with the applicable standards and regulations described in the "Compliance Statement" section. The study will be conducted in accordance with the principles of Good Clinical Practice to the extent applicable to this non-interventional, observational study design.

For patients/image subjects

Informed consent from patients whose images might appear in the atlases or public sources is not required for this study because: (1) the study utilizes completely anonymized images from public dermatological atlases and publicly available sources where individuals cannot be identified; (2) there is no active patient recruitment or direct involvement of patients in this study; (3) the images used constitute non-personal data under GDPR as they contain no information allowing identification of data subjects; and (4) the study does not involve any intervention, modification of care, or processing of sensitive personal health data.

For healthcare practitioners

While formal Informed Consent Forms are not required for practitioners (as this study is observational and non-interventional with respect to their clinical practice), all participating HCPs will receive comprehensive written and oral information about the study before commencing their participation. The information provided will include:

Study objectives and design.
Study procedures, including the three-stage per-case assessment process and expected time commitment.
The voluntary nature of participation and the right to withdraw at any time without justification.
Data-handling practices, including anonymization of practitioner responses and confidentiality protections.
Sponsor contact information for questions or concerns.

Each practitioner will sign a participation contract with the sponsor that formalizes their involvement, outlines their obligations, and confirms their understanding and voluntary agreement to participate.

Adverse events, adverse product reactions and product deficiencies

Adverse Events (AE) and Adverse Events to the Product (AEP)

An AE is any unintended medical event, unanticipated illness or injury, or unintended clinical sign (including abnormal laboratory findings) in subjects, users, or other persons, whether or not related to the investigational product and whether anticipated or unanticipated.

An AEP is an adverse event related to the use of an investigational medical device.

Given these definitions, potential AEPs or AEs are documented in the product's IFU.

Product deficiencies

Possible inadequacies of a medical device may relate to its identity, quality, durability, reliability, safety, or performance.

Product deficiencies in the investigation will be managed by the sponsor according to non-conforming product control procedures. When appropriate, corrective and/or preventive actions will be taken to protect the safety of subjects, users, and other individuals.

Serious Adverse Events, serious adverse events to the product and serious and unexpected adverse events to the product

According to UNE-EN ISO 14155:2021:

A Serious Adverse Event (SAE) is an AE that results in any of the following: death; serious deterioration in the health of the subject, users, or other persons; fetal distress; fetal death; or a congenital anomaly or birth defect.
A Serious Adverse Event to the Product (SAEP) is an AEP that has resulted in any of the consequences characteristic of an SAE.
A Serious Unexpected Adverse Event to the Product (SUAEP) is an SAEP that, by its nature, incidence, intensity, or consequences, has not been identified in the current risk assessment.

Taking these definitions into account, there are no SAEs, SAEPs, or SUAEPs related to the use of the product.
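The definitions above form a small decision tree, which the following minimal sketch illustrates. It is not part of the sponsor's procedures. The `Event` record, its field names, the consequence labels, and the `classify` function are all hypothetical. The sketch also assumes that an SAEP is a product-related SAE and that an SUAEP is an SAEP absent from the current risk assessment.

```python
from dataclasses import dataclass

# Consequences that make an adverse event "serious", as listed in this section.
# The string labels are hypothetical identifiers chosen for this illustration.
SERIOUS_CONSEQUENCES = frozenset({
    "death",
    "serious_deterioration_of_health",
    "fetal_distress",
    "fetal_death",
    "congenital_anomaly_or_birth_defect",
})


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; this layout is not defined in the CIP."""
    consequences: frozenset      # consequence labels observed for the event
    related_to_product: bool     # True if related to use of the product
    in_risk_assessment: bool     # True if identified in the risk assessment


def classify(event: Event) -> str:
    """Return the most specific label (AE, AEP, SAE, SAEP, SUAEP)."""
    serious = bool(event.consequences & SERIOUS_CONSEQUENCES)
    if not serious:
        return "AEP" if event.related_to_product else "AE"
    if not event.related_to_product:
        return "SAE"
    # Serious and product-related: expected (SAEP) or unexpected (SUAEP).
    return "SAEP" if event.in_risk_assessment else "SUAEP"
```

Under these assumptions, a serious, product-related event that does not appear in the risk assessment is labelled SUAEP, while the same event already covered by the risk assessment is labelled SAEP.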

Foreseeable adverse events and adverse events to the product

The foreseeable adverse events and expected adverse reactions to the product, as well as their incidence, mitigation and treatment are documented in the T-013-002 Risk management record of the product under study.

Data Monitoring Committee (DMC)

A Data Monitoring Committee will not be established for this study. This decision is justified because: (1) the study is non-interventional and does not affect patient care or safety; (2) no patients are involved — the participants are healthcare professionals providing diagnostic assessments; (3) there are no safety endpoints to monitor; and (4) the study duration and design do not warrant interim analyses for safety or futility.

Suspension or early termination of clinical research

The sponsor may suspend or terminate the study at any time if: (1) the study objectives are unattainable; (2) significant protocol deviations compromise data integrity; or (3) unforeseen circumstances prevent the study from being conducted in accordance with this CIP. In the event of early termination, the sponsor will notify the Principal Investigator and all participating investigators, document the reasons for termination, and prepare a CIR covering all data collected up to the point of termination.

Signature meaning

The approval signatures for this document are recorded as verified commits in the QMS repository. For reference, the team members expected to participate in this document and their roles in the approval process, as defined in Annex I (Responsibility Matrix) of GP-001, are:

Author: Team members involved
Reviewer: JD-003 Design & Development Manager, JD-004 Quality Manager & PRRC
Approver: JD-001 General Manager
