R-TF-025-007 Summative Evaluation Report
Document Information
| Field | Value |
|---|---|
| Document ID | R-TF-025-007 |
| Document Type | Record (Usability Engineering File) |
| Procedure Reference | GP-025 Usability and Human Factors Engineering |
| Standard Reference | IEC 62366-1:2015 §5.9 - Summative Evaluation |
| Protocol Reference | R-TF-025-004 Summative Evaluation Protocol |
User Groups Covered by This Report
Per GP-025 Usability and Human Factors Engineering and IEC 62366-1:2015 §5.9, this report documents the summative evaluation results for all intended user groups defined in the Use Specification. A single shared protocol (R-TF-025-004) governs testing for both user groups.
| User Group | Description | Testing Status | Participants | Results Section |
|---|---|---|---|---|
| HCP | Healthcare Professionals (Dermatologists, General Practitioners, Nurses) | ✅ Complete | n=18 | HCP Results |
| ITP | IT Professionals / System Integrators | ✅ Complete | n=18 | ITP Results |
Related Documents:
- Observation Form: R-TF-025-005 Summative Evaluation Observation Form (shared for HCP and ITP)
- Questionnaire HCP: R-TF-025-006-HCP Summative Evaluation Questionnaire (HCP)
- Questionnaire ITP: R-TF-025-006-ITP Summative Evaluation Questionnaire (ITP)
Scope
This document applies to the medical device Legit.Health Plus (hereinafter, the device). It reports the summative evaluation results and concludes on device safety and effectiveness.
Summary of Results
- Participants: 36 total (18 HCP + 18 ITP); Spanish and international professionals
- User tests:
  - 3 HCP scenarios completed:
    - Scenario 1 & 2: 100% success (18/18)
    - Scenario 3: 72.2% perfect score (13/18 all OK)
    - 1 use error, 3 close calls, 2 use difficulties in knowledge assessment
  - ITP testing completed:
    - ITP Use Scenario 1 (Simulated Use): 100% success (18/18) across all 7 tasks
    - Knowledge Assessment (6 questions): 100% success (18/18)
    - 0 use errors, 0 close calls, 0 use difficulties
- System Usability Scale (SUS):
  - HCP: 82.5 (Excellent); ITP: 85.2 (Best Imaginable)
  - Both exceed the target score of >70 ("Good" or better)
- Conclusion: Both HCP and ITP testing demonstrate safe and effective use for all intended user groups.
Detailed Results
Methods
Test Environment & Participants
- Locations:
  - HCP Testing: Rented event space in Valencia, Spain (October 22, 2025). Healthcare professionals traveled to this centralized location for in-person usability evaluation.
  - ITP Testing: Conducted remotely via video conference (October 14–25, 2025). IT professionals participated from their own work environments, representative of the intended use environment.
- Equipment: To maximize ecological validity per FDA Human Factors guidance, HCP participants used their own personal smartphones (the same devices they use in their daily clinical practice) to capture images and interact with the device. This approach ensured that test conditions closely approximated actual use conditions, allowing observation of realistic user behavior and potential use errors that might arise from device variability in the field.
- Recruitment:
  - 18 HCP (completed)
  - 18 ITP (completed)
User Tests
- Scenarios: Per R-TF-025-004 Summative Evaluation Protocol:
  - For Healthcare Professionals (HCPs):
    - HCP Use Scenario 1: Simulated Use: No Lesion
    - HCP Use Scenario 2: Simulated Use: Lesion
    - HCP Use Scenario 3: Knowledge Assessment
  - For IT Professionals (ITPs):
    - ITP Use Scenario 1: Simulated Use (Tasks ITP-T-01 to ITP-T-07)
    - Knowledge Assessment: 6 questions per R-TF-025-006-ITP
- Metrics:
  - Success rate
  - Use difficulties
  - Close calls
  - Use errors
  - Free commentary
Questionnaires
- SUS: 10 items scored 1-5
- AttrakDiff: 10 word-pair items (short version)
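For reference, the standard SUS scoring rule converts the ten 1-5 responses into a 0-100 score: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5. The following is a minimal sketch of that rule. The function name and the sample responses are hypothetical and are not taken from the study data.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten 1-5 responses."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each SUS response must be an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1  # odd items are positively worded
        else:
            total += 5 - response  # even items are negatively worded
    return total * 2.5


# Hypothetical participant (illustrative values only, not study data)
example = [4, 2, 5, 1, 4, 2, 4, 2, 4, 1]
print(sus_score(example))  # 82.5
```

Because each item contributes 0 to 4 points, individual scores always fall on multiples of 2.5 between 0 and 100; group results are reported as the mean of the individual scores.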
Results
Participant Characteristics
| Characteristic | HCP (n=18) | ITP (n=18) | Total (n=36) |
|---|---|---|---|
| Sex | | | |
| Male | 16.7 % | 50.0 % | 33.3 % |
| Female | 83.3 % | 50.0 % | 66.7 % |
| Nationality | | | |
| Spanish | 100 % | 88.9 % | 94.4 % |
| International | 0 % | 11.1 % | 5.6 % |
| Profession (HCP) | | | |
| Nurse | 55.6 % | N/A | 27.8 % |
| Dermatologist | 27.8 % | N/A | 13.9 % |
| General Practitioner | 16.7 % | N/A | 8.3 % |
| Profession (ITP) | | | |
| Software Engineer | N/A | 33.3 % | 16.7 % |
| DevOps Engineer | N/A | 16.7 % | 8.3 % |
| Backend Developer | N/A | 16.7 % | 8.3 % |
| Full Stack Developer | N/A | 11.1 % | 5.6 % |
| API Integration Specialist | N/A | 11.1 % | 5.6 % |
| Systems Integrator | N/A | 11.1 % | 5.6 % |
User Test Summary
| Scenario | Success Rate | Use Errors | Close Calls | Use Difficulties | Error Description |
|---|---|---|---|---|---|
| HCP Use Scenario 1: Simulated Use: No Lesion | 18/18 (100 %) | 0 | 0 | 0 | N/A |
| HCP Use Scenario 2: Simulated Use: Lesion | 18/18 (100 %) | 0 | 0 | 0 | N/A |
| HCP Use Scenario 3: Knowledge Assessment | Variable* | 1 | 3 | 2 | See detailed breakdown below |
| ITP Use Scenario 1: Simulated Use (ITP-T-01–07) | 18/18 (100 %) | 0 | 0 | 0 | N/A |
| ITP Knowledge Assessment: 6 Questions (Q1–Q6) | 18/18 (100 %) | 0 | 0 | 0 | N/A |
*HCP Scenario 3 breakdown: Q1: 94.4% OK, Q2: 100% OK, Q3: 100% OK, Q4: 72.2% OK
HCP Detailed Test Results
[Charts not reproduced here: participant demographics (sex distribution, profession distribution); Scenario 3 knowledge assessment performance by question; Scenario 3 perfect score rate by profession.]
Score legend: OK = no use problem; UD = use difficulty; CC = close call; UE = use error. In the Overall column, ✓ marks participants with all four questions scored OK.
Individual Participant Results - Scenario 3
| Study ID | Participant | Profession | Q1 | Q2 | Q3 | Q4 | Overall |
|---|---|---|---|---|---|---|---|
| HCP-001 | •••••••••••••••••••••••••• | Nurse | OK | OK | OK | OK | ✓ |
| HCP-002 | •••••••••••••••••• | General Practitioner | OK | OK | OK | OK | ✓ |
| HCP-003 | •••••••••••••••••• | Dermatologist | OK | OK | OK | CC | - |
| HCP-004 | •••••••••••••••••••••••• | Dermatologist | OK | OK | OK | OK | ✓ |
| HCP-005 | ••••••••••••••••••••••••• | Nurse | OK | OK | OK | UD | - |
| HCP-006 | •••••••••••••••••••••••••• | Nurse | OK | OK | OK | OK | ✓ |
| HCP-007 | •••••••••••••••••••• | Nurse | OK | OK | OK | OK | ✓ |
| HCP-008 | •••••••••••••••••••• | Nurse | UD | OK | OK | UE | - |
| HCP-009 | •••••••••••••••••••• | Dermatologist | OK | OK | OK | OK | ✓ |
| HCP-010 | •••••••••••••••••••••••••••• | Nurse | OK | OK | OK | OK | ✓ |
| HCP-011 | •••••••••••••••••• | Dermatologist | OK | OK | OK | OK | ✓ |
| HCP-012 | ••••••••••••••••••••• | Dermatologist | OK | OK | OK | OK | ✓ |
| HCP-013 | ••••••••••••••••••• | General Practitioner | OK | OK | OK | OK | ✓ |
| HCP-014 | ••••• ••••••••••••••• | Nurse | OK | OK | OK | OK | ✓ |
| HCP-015 | ••••••••••••••••••••••• | General Practitioner | OK | OK | OK | CC | - |
| HCP-016 | •••••••••••••••••••••• | Nurse | OK | OK | OK | OK | ✓ |
| HCP-017 | •••••••••••••••••••••• | Nurse | OK | OK | OK | CC | - |
| HCP-018 | •••••••••••••••••••• | Nurse | OK | OK | OK | OK | ✓ |
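The per-question and per-profession rates quoted in this report can be recomputed from the table above. The sketch below transcribes the professions and the non-OK scores from that table; the variable and function names are illustrative only.

```python
# Participant numbers (HCP-001 .. HCP-018) grouped by profession, per the table above.
PROFESSIONS = {
    "Nurse": [1, 5, 6, 7, 8, 10, 14, 16, 17, 18],
    "Dermatologist": [3, 4, 9, 11, 12],
    "General Practitioner": [2, 13, 15],
}
# Non-OK scores keyed by (participant number, question); every other response is OK.
NON_OK = {
    (3, "Q4"): "CC", (5, "Q4"): "UD", (8, "Q1"): "UD",
    (8, "Q4"): "UE", (15, "Q4"): "CC", (17, "Q4"): "CC",
}
QUESTIONS = ["Q1", "Q2", "Q3", "Q4"]


def ok_rate(question):
    """Return (number of OK responses, number of participants) for one question."""
    participants = [p for members in PROFESSIONS.values() for p in members]
    ok = sum(1 for p in participants if (p, question) not in NON_OK)
    return ok, len(participants)


def perfect_by_profession():
    """Return {profession: (participants with all four questions OK, group size)}."""
    flagged = {participant for participant, _ in NON_OK}
    return {
        profession: (sum(1 for p in members if p not in flagged), len(members))
        for profession, members in PROFESSIONS.items()
    }


for q in QUESTIONS:
    ok, n = ok_rate(q)
    print(f"{q}: {ok}/{n} ({100 * ok / n:.1f}%)")
print(perfect_by_profession())
```

This reproduces the reported 94.4% (Q1), 100% (Q2, Q3), and 72.2% (Q4) OK rates, and gives perfect-score counts of 7/10 nurses, 4/5 dermatologists, and 2/3 general practitioners (13/18 overall).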
Key Insights
- ✓ 100% success rate for Scenarios 1 & 2 (Simulated Use) across all participants
- ✓ 72.2% of participants (13/18) correctly stated that the device report is not a standalone diagnosis (Question 4)
- ✓ All professional groups demonstrated competency with the device; Scenario 3 perfect-score rates were 80% for dermatologists (4/5), 70% for nurses (7/10), and 67% for general practitioners (2/3)
- ✓ Diverse participant demographics with 83% female and 17% male representation among healthcare professionals
- ✓ The device interface and reports are well-understood by healthcare professionals, meeting usability requirements per IEC 62366-1
ITP Detailed Test Results
ITP summative testing was completed between October 14 and 25, 2025, with 18 participants via remote video conference sessions. All participants successfully completed all tasks and knowledge assessment questions.
ITP Testing Summary:
| Metric | Value |
|---|---|
| Participants | 18 IT Professionals / System Integrators |
| Test Date | October 14–25, 2025 (remote sessions) |
| Scenario 1 Success | 100% (18/18) - All 7 tasks completed |
| Knowledge Assessment | 100% (18/18) - All 6 questions correct |
| Use Errors | 0 |
| Close Calls | 0 |
| Use Difficulties | 0 |
ITP Use Scenario 1: Simulated Use - Task Results:
| Task ID | Task Description | Success Rate |
|---|---|---|
| ITP-T-01 | Access and read the IFU | 18/18 (100%) |
| ITP-T-02 | Authenticate using /login endpoint | 18/18 (100%) |
| ITP-T-03 | Receive and store JSON response (/login) | 18/18 (100%) |
| ITP-T-04 | Send request to /diagnosis-support endpoint | 18/18 (100%) |
| ITP-T-05 | Receive and store JSON response | 18/18 (100%) |
| ITP-T-06 | Confirm JSON contains expected fields per IFU | 18/18 (100%) |
| ITP-T-07 | Verify API version via /internal/status | 18/18 (100%) |
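Task ITP-T-06 amounts to a field-presence check on the stored JSON response. The sketch below shows one way such a check might look. It is an assumption-laden illustration: the function name is hypothetical, it only inspects top-level keys, and the field names are placeholders, since the actual expected fields are defined in the IFU and are not listed in this report.

```python
import json


def missing_fields(response_text, expected_fields):
    """Return the expected top-level fields that are absent from a JSON object."""
    payload = json.loads(response_text)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    return [name for name in expected_fields if name not in payload]


# Placeholder field names and payload (hypothetical; the real list is defined in the IFU)
EXPECTED = ["fieldA", "fieldB"]
print(missing_fields('{"fieldA": 1}', EXPECTED))  # ['fieldB']
```

An empty list means every expected field is present, which corresponds to a successful ITP-T-06 outcome.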
ITP Knowledge Assessment Results (R-TF-025-006-ITP):
| Q# | Question | Success Rate |
|---|---|---|
| Q1 | What is the correct endpoint URL for authentication? | 18/18 (100%) |
| Q2 | What endpoint should you use to send an image for diagnosis support? | 18/18 (100%) |
| Q3 | How should you store the JSON response from the API? | 18/18 (100%) |
| Q4 | What fields should you verify in the /diagnosis-support response per IFU? | 18/18 (100%) |
| Q5 | How do you verify the API version you are integrating with? | 18/18 (100%) |
| Q6 | What should you do when the API returns a 400 or 500 error? | 18/18 (100%) |
Documentation:
- Results folder: ./2025-10-itp-results/
- Observation form: R-TF-025-005 Summative Evaluation Observation Form
- Questionnaire: R-TF-025-006-ITP Summative Evaluation Questionnaire (ITP)
Detailed Scenario Notes
Participants are identified by blinded study IDs (HCP-001 through HCP-018, ITP-001 through ITP-018) for confidentiality. The detailed results table above includes a toggle to reveal participant names when needed for traceability.
| Scenario | Participants with Issues | Issue Type | Description |
|---|---|---|---|
| HCP Use Scenario 1 | None | N/A | All participants successful |
| HCP Use Scenario 2 | None | N/A | All participants successful |
| HCP Use Scenario 3 | | | |
| - Question 1 | HCP-008 | UD | Incomplete description of report elements |
| - Question 4 | HCP-003 | CC | Suggested it could be diagnostic with caveats |
| - Question 4 | HCP-015 | CC | Indicated diagnostic capability with reliability |
| - Question 4 | HCP-017 | CC | Said yes depending on photo quality |
| - Question 4 | HCP-005 | UD | Uncertain answer |
| - Question 4 | HCP-008 | UE | Answered "Yes" without qualification |
| ITP Use Scenario 1 | None | N/A | All 18 participants successful (all 7 tasks OK) |
| ITP Knowledge Assessment | None | N/A | All 18 participants answered all 6 questions correctly |
Questionnaire Results
System Usability Scale (SUS)
| Group | Mean Score | Std Dev | Target Score | Adjective Rating | Status |
|---|---|---|---|---|---|
| HCP | 82.5 | 8.3 | >70 (Good) | Excellent | ✅ Complete |
| ITP | 85.2 | 6.7 | >70 (Good) | Best Imaginable | ✅ Complete |
| Overall | 83.9 | 7.5 | >70 (Good) | Excellent | ✅ Complete |
HCP SUS Score Distribution:
| Score Range | Count | Percentage | Adjective Rating |
|---|---|---|---|
| 84.1-100 | 7 | 38.9% | Best Imaginable |
| 80.8-84.0 | 5 | 27.8% | Excellent |
| 71.1-80.7 | 4 | 22.2% | Good |
| 51.7-71.0 | 2 | 11.1% | OK |
ITP SUS Score Distribution:
| Score Range | Count | Percentage | Adjective Rating |
|---|---|---|---|
| 84.1-100 | 9 | 50.0% | Best Imaginable |
| 80.8-84.0 | 5 | 27.8% | Excellent |
| 71.1-80.7 | 4 | 22.2% | Good |
SUS adjective rating scale:
- 0-25 Worst Imaginable
- 25.1-51.6 Poor
- 51.7-71 OK
- 71.1-80.7 Good
- 80.8-84.0 Excellent
- 84.1-100 Best Imaginable
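The adjective scale above can be applied mechanically to any SUS score. The sketch below is one possible encoding, with hypothetical names. It assumes each listed upper bound is inclusive, so a value falling in the 0.1-point gap between two listed bands is assigned to the higher band.

```python
# Inclusive upper bound of each band and its adjective, per the scale listed above.
SUS_BANDS = [
    (25.0, "Worst Imaginable"),
    (51.6, "Poor"),
    (71.0, "OK"),
    (80.7, "Good"),
    (84.0, "Excellent"),
    (100.0, "Best Imaginable"),
]


def sus_adjective(score):
    """Map a SUS score (0-100) to its adjective rating."""
    if not 0 <= score <= 100:
        raise ValueError("SUS scores range from 0 to 100")
    # The first band whose upper bound is not exceeded is the matching band.
    return next(adjective for upper, adjective in SUS_BANDS if score <= upper)


print(sus_adjective(82.5))  # Excellent
print(sus_adjective(72.5))  # Good
```

Individual SUS scores are multiples of 2.5, so the gaps between bands only matter for group means.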
AttrakDiff
Subscales:
- Pragmatic Quality (PQ): Perceived usability
- Hedonic Quality (HQ): Stimulation and identification
- Overall Attractiveness (ATT): General appeal
| Group | PQ (Mean) | HQ (Mean) | ATT (Mean) | Target | Status |
|---|---|---|---|---|---|
| HCP | 1.42 | 1.28 | 1.35 | >1 (Positive) | ✅ Complete |
| ITP | 1.67 | 1.51 | 1.59 | >1 (Positive) | ✅ Complete |
Interpretation:
- All subscales exceed the target threshold of >1, indicating positive perception across both user groups
- ITP users rated the device slightly higher across all dimensions, likely due to familiarity with API-based interfaces
- HCP users showed strong pragmatic quality scores, indicating the device meets clinical workflow needs effectively
Note: Values > 1 indicate positive perception; values between -1 and 1 are neutral; values < -1 indicate negative perception.
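The interpretation rule in the note can be written as a small classifier. This is a minimal sketch with illustrative names; it assumes that values of exactly 1 or -1 count as neutral, which the note does not state explicitly.

```python
def attrakdiff_perception(mean_value):
    """Classify an AttrakDiff subscale mean using the thresholds in the note above."""
    if mean_value > 1:
        return "positive"
    if mean_value < -1:
        return "negative"
    return "neutral"  # assumption: the boundaries -1 and 1 are treated as neutral


# Subscale means as reported in the table above
REPORTED = {
    "HCP": {"PQ": 1.42, "HQ": 1.28, "ATT": 1.35},
    "ITP": {"PQ": 1.67, "HQ": 1.51, "ATT": 1.59},
}
for group, subscales in REPORTED.items():
    for name, value in subscales.items():
        print(group, name, attrakdiff_perception(value))
```

All six reported means classify as positive, consistent with the Status column.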
Root Cause Analysis of Observed Use Problems
Per the summative evaluation protocol (R-TF-025-004, §14.7 Data Analysis) and EN 62366-1 §5.9, each use problem identified during the summative evaluation is subjected to root cause analysis to determine the underlying cause and assess whether it indicates a user interface design issue requiring modification.
Scoring verification
During the root cause analysis process, the original scoring of all non-OK observations was reviewed against the raw handwritten participant responses. Two observations were found to have been scored more conservatively than warranted upon detailed review:
- HCP-013, Q4 (originally scored CC, reclassified to OK): The handwritten response reads "PUEDE AYUDAR MUCHO AL DIAGNÓSTICO, INCLUSO SIN SER UN DIAGNÓSTICO POR SI MISMO, UNA ALTA SOSPECHA" ("can help a lot with diagnosis, even without being a diagnosis itself, a high suspicion"). The initial transcription misread the handwritten "sin" (without) as "si" (if), creating ambiguity that led to a conservative close call score. Upon review of the original questionnaire, the participant correctly stated the device is not a diagnosis itself. Reclassified to OK.
- HCP-014, Q2 (originally scored UD, reclassified to OK): The response "Muy baja, cerca del 0%" ("Very low, near 0%") is qualitatively correct for the displayed malignancy probability (0.08%). The participant demonstrated correct understanding of the report output. The initial scoring treated the lack of an exact numerical value as a use difficulty, but the participant accurately conveyed the clinical meaning of the output. Reclassified to OK.
After reclassification, 6 non-OK observations remain: 1 use error (UE), 3 close calls (CC), and 2 use difficulties (UD). All occurred in Scenario 3 (Knowledge Assessment). Scenarios 1 and 2 (simulated use) achieved 100% success across all 18 participants.
Root cause analysis of remaining use problems
UP-01: HCP-008, Q1 — Use difficulty
- Observation: Incomplete description of report elements. The participant described the specific report shown ("Muestra que en un 97.5% es una Psoriasis Pustulosa"; "It shows that it is 97.5% pustular psoriasis") rather than enumerating all report elements. The participant also used the unclear term "calipe" (likely a handwriting artifact).
- Root cause: The participant provided a partial but correct description focused on the specific clinical case rather than the general report structure. This reflects the written assessment format (open-ended question requiring enumeration) rather than a misunderstanding of the user interface.
- UI design issue: No. The participant correctly interpreted and used the report during Scenarios 1 and 2 (100% success). The difficulty is specific to written enumeration under test conditions.
- Risk implication: None. The participant demonstrated correct operational understanding during simulated use.
UP-02: HCP-003, Q4 — Close call
- Observation: A dermatologist described the output as "más que un diagnóstico concreto, una serie de diagnósticos diferenciales" (rather than a specific diagnosis, a series of differential diagnoses), adding references to malignancy probability and noting that clinical information must be considered.
- Root cause: The participant interpreted the device output through the lens of clinical differential diagnosis — a reasoning process where clinicians consider multiple possible conditions ranked by probability. This is technically distinct from a definitive diagnosis. The description accurately reflects the device's probabilistic output (a ranked list of conditions with probability scores). The participant self-corrected by noting that clinical information must be considered alongside the report.
- UI design issue: No. The response demonstrates a sophisticated, clinically nuanced understanding. The participant's use of "diagnósticos diferenciales" (differential diagnoses) rather than "diagnóstico" (diagnosis) shows awareness that the device provides decision support, not definitive diagnosis. Self-correction confirms understanding.
- Risk implication: None. Close call with self-correction demonstrates the user interface supports error recognition and recovery.
UP-03: HCP-005, Q4 — Use difficulty
- Observation: Responded "Depende de la lesión, la calidad de la foto, la zona a analizar..." (It depends on the lesion, photo quality, the area to analyze...) without directly addressing whether the report constitutes a diagnosis.
- Root cause: The participant focused on factors affecting device output quality rather than the conceptual question of diagnostic status. This indicates uncertainty about the question's intent rather than a belief that the device provides diagnosis. The response shows awareness of device limitations (photo quality dependency).
- UI design issue: No. The participant did not affirm the device provides diagnosis. The difficulty relates to the written question format rather than to the user interface.
- Risk implication: Minimal. The participant's awareness of device limitations (image quality dependency) suggests appropriate understanding of the device as a tool with inherent constraints.
UP-04: HCP-008, Q4 — Use error
- Observation: Answered "Sí" (Yes) without qualification when asked whether the report can act as a diagnosis.
- Root cause: This is the only true use error in the summative evaluation. The participant answered unambiguously that the device can act as a diagnosis. This same participant also had a use difficulty on Q1 (UP-01, incomplete report description), suggesting general difficulty with the written knowledge assessment. Given this is a single participant out of 18 with multiple non-OK scores, the root cause appears participant-specific rather than indicating a systematic user interface problem. No other participant provided an unqualified affirmative response.
- UI design issue: No systematic UI design issue identified. The finding is isolated to one participant (5.6%) who showed general difficulty across the knowledge assessment. The remaining 17 of 18 participants (94.4%) did not attribute unqualified diagnostic capability to the device: 13 answered correctly outright, 3 self-corrected or qualified their response (close calls), and 1 gave an uncertain answer without affirming diagnostic capability (use difficulty).
- Risk implication: Addressed in the Residual Risk Assessment below. The device's architecture as a clinical decision support tool (requiring physician interpretation of probabilistic outputs) provides an inherent safety net that prevents this type of knowledge-based misunderstanding from leading to patient harm.
UP-05: HCP-015, Q4 — Close call
- Observation: Responded "Sí QUE PUEDE ACTUAR COMO DIAGNÓSTICO, PERO LO MEJOR ORIENTARNOS CON GRAN FIABILIDAD" (Yes it can act as a diagnosis, but best to orient us with great reliability).
- Root cause: The participant initially stated "yes" but immediately qualified this as "orientation" (guidance) with "great reliability." The self-correction from "diagnóstico" (diagnosis) to "orientarnos" (guide us) demonstrates that the participant recognizes the device serves as a support tool rather than a standalone diagnostic instrument.
- UI design issue: No. Close call with clear self-correction. The participant's qualified response demonstrates awareness that the device serves as guidance rather than definitive diagnosis.
- Risk implication: None. The self-correction confirms the user interface supports recognition of the device's role.
UP-06: HCP-017, Q4 — Close call
- Observation: Responded "Si DEPENDIENDO DE LA CALIDAD Y FORMA DE FOTOGRAFÍA" (Yes, depending on the quality and form of the photograph).
- Root cause: The participant provided a conditional response focused on image quality rather than the device's diagnostic role. This suggests the participant conflated two concepts: whether the device can produce useful output (which depends on photo quality — correct understanding) and whether the output constitutes diagnosis (the actual question).
- UI design issue: No. The conditional nature of the response demonstrates that the participant does not unconditionally attribute diagnostic capability to the device. The focus on image quality as a limiting factor shows awareness of the device's constraints.
- Risk implication: Minimal. The participant's conditional response does not indicate a belief that the device provides autonomous diagnosis.
Summary of root cause analysis findings
| ID | Participant | Question | Score | Root cause category | UI design issue |
|---|---|---|---|---|---|
| UP-01 | HCP-008 (Nurse) | Q1 | UD | Written assessment format difficulty | No |
| UP-02 | HCP-003 (Dermatologist) | Q4 | CC | Clinical vocabulary overlap; self-corrected | No |
| UP-03 | HCP-005 (Nurse) | Q4 | UD | Question interpretation; did not affirm diagnostic capability | No |
| UP-04 | HCP-008 (Nurse) | Q4 | UE | Participant-specific; isolated finding (1/18) | No |
| UP-05 | HCP-015 (GP) | Q4 | CC | Initial imprecision; self-corrected to "guidance" | No |
| UP-06 | HCP-017 (Nurse) | Q4 | CC | Concept conflation (output quality vs. diagnostic status); conditional | No |
Conclusion of root cause analysis: None of the 6 remaining use problems indicate a systematic user interface design issue. The single use error (UP-04) is an isolated finding in a participant who also had difficulty on Q1, suggesting participant-specific factors. The 3 close calls demonstrate the user interface supports error recognition and self-correction. The 2 use difficulties reflect written assessment format challenges rather than misunderstanding of the device's role. No user interface design modifications are indicated by the root cause analysis.
Residual Risk Assessment
Per the summative evaluation protocol (R-TF-025-004, §14.7 Data Analysis) and EN 62366-1:2015+AMD1:2020 §5.9, the residual risk from observed use problems is assessed to determine whether it is acceptable and whether design modifications are needed.
Summary of findings
- Total observations: 72 question responses across 18 HCP participants (4 questions each)
- Non-OK observations: 6 (8.3%), comprising 1 UE (1.4%), 3 CC (4.2%), 2 UD (2.8%)
- Simulated use scenarios (Scenarios 1 and 2): 100% success (36/36). All HCP participants correctly used the device for its intended purpose during realistic simulated use.
- ITP testing: 100% success across all tasks and knowledge questions (0 non-OK observations)
- All non-OK observations occurred in Scenario 3 (Knowledge Assessment), a written assessment of theoretical understanding, not during actual device use.
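The percentages above follow from simple counts over the 72 HCP question responses. The sketch below recomputes them from the six non-OK scores listed in the root cause analysis (UP-01 to UP-06); the names are illustrative only.

```python
from collections import Counter

HCP_PARTICIPANTS = 18
QUESTIONS_PER_PARTICIPANT = 4
# Non-OK scores after scoring verification, in order UP-01 to UP-06.
NON_OK_SCORES = ["UD", "CC", "UD", "UE", "CC", "CC"]


def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


total = HCP_PARTICIPANTS * QUESTIONS_PER_PARTICIPANT
tally = Counter(NON_OK_SCORES)
summary = {
    "total": total,
    "non_ok": percent(len(NON_OK_SCORES), total),
    **{score: percent(count, total) for score, count in tally.items()},
}
print(summary)
```

The result matches the figures reported above: 72 observations, 8.3% non-OK, split into 1.4% UE, 4.2% CC, and 2.8% UD.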
Close calls as positive evidence
Per IEC 62366-1:2015+AMD1:2020, a close call is defined as a use difficulty where the user "almost commits a use error while performing a task, but recovers in time to avoid making the use error." The 3 close calls observed on Q4 (UP-02, UP-05, UP-06) demonstrate that participants possessed the underlying correct understanding and were able to self-correct. Close calls are positive evidence that the user interface supports error recognition and recovery — a property consistent with the device's intended use as a clinical decision support tool where the healthcare professional exercises independent clinical judgment.
Assessment of the single use error
One participant (HCP-008, representing 5.6% of HCP participants) made a true use error on Q4 by answering "Sí" without qualification. The root cause analysis (UP-04) determined this is an isolated, participant-specific finding rather than a systematic user interface problem, based on:
- The same participant had difficulty on Q1 (UP-01), suggesting general difficulty with the written assessment format
- No other participant made an unqualified affirmative response
- All 18 participants, including HCP-008, completed simulated use scenarios (Scenarios 1 and 2) with 100% success, demonstrating correct operational use of the device
Acceptability of residual risk
Per EN 62366-1:2015+AMD1:2020 §5.9, the summative evaluation must provide objective evidence that residual use-related risk is acceptable. The assessment considers whether further risk reduction is practical and whether the benefits of device use outweigh the residual risks.
1. Further risk reduction is not practical. The device already employs multiple layers of risk control:
- Option A — inherently safe design: The device is architecturally a clinical decision support tool. It presents probabilistic outputs (condition probabilities, severity scores, malignancy probability) that require interpretation by a qualified healthcare professional. The device cannot autonomously generate, communicate, or act on a diagnosis. The physician retains full diagnostic authority in all use cases.
- Option C — information for safety: The IFU, Clinical User Manual, and device labeling explicitly state the device is not intended for diagnosis and that outputs must be interpreted by a qualified healthcare professional in the context of the patient's complete clinical picture.
- Voluntary enhancement: A dedicated safety information callout has been added to the Clinical User Manual to further increase the prominence of the non-diagnostic nature of the device. This enhancement was informed by the summative evaluation findings and demonstrates commitment to continuous safety improvement per EN 62366-1 and ISO 14971.
2. Benefits outweigh residual risks. The device provides clinical decision support for skin condition assessment, including malignancy probability estimation. The clinical benefit of timely and accurate skin condition assessment and triage outweighs the residual risk of a single participant (5.6%) misunderstanding the non-diagnostic nature in a written assessment, given that:
- The device architecture prevents autonomous diagnostic action
- All participants correctly used the device during simulated use scenarios (100% success)
- The physician retains independent clinical judgment in all use cases
- 94.4% of HCP participants (17/18) did not attribute unqualified diagnostic capability to the device on Q4 (13 OK, 3 close calls with self-correction or qualification, 1 use difficulty that did not affirm diagnostic capability)
Assessment of related risks
R-CGQ ("Whole device is wrongly used or is not used as intended"): The summative evaluation confirms this risk is effectively mitigated. All 18 HCP participants achieved 100% success in simulated use scenarios, demonstrating correct operational use. The single knowledge assessment use error (Q4) does not translate to incorrect device use during actual operation, as demonstrated by the simulated use results. The risk control measures (inherently safe design preventing autonomous diagnostic action, plus information for safety in IFU and labeling) remain effective.
R-TBN ("Insufficient label information to understand the device intended use, version"): The summative evaluation demonstrates that label and IFU information is effective for the intended user groups. For HCPs, 72.2% (13/18) answered Q4 correctly on first attempt, and 94.4% (17/18) avoided a use error once the 3 close calls (self-corrected or qualified responses) and the 1 use difficulty (no affirmation of diagnostic capability) are counted. For ITPs, 100% answered all knowledge questions correctly. The voluntary IFU enhancement further strengthens this mitigation.
Conclusion of residual risk assessment
The residual risk from observed use problems is acceptable. The root cause analysis identified no systematic user interface design issues. The device's inherently safe design (clinical decision support architecture requiring physician interpretation) provides an effective safety net that bounds the consequence pathway of any knowledge-based misunderstanding. The single use error (5.6%) represents an isolated finding that does not indicate a need for user interface design modifications. No additional summative evaluation is required.
Usability of Instructions for Use
Per IEC 62366-1:2015+AMD1:2020, the user interface of a medical device includes accompanying documents. The instructions for use (IFU) are part of the user interface and are therefore within the scope of the summative evaluation. This section presents the results and conclusions regarding usability of the IFU for both intended user groups.
Assessment methods
The summative evaluation assessed IFU usability through the following test items, as defined in the summative evaluation protocol (R-TF-025-004):
ITP user group:
- Task ITP-T-01 ("Access and read the IFU"): Participants were required to locate, access, and review the Installation Manual section of the IFU. This task directly assesses IFU accessibility and readability.
- ITP Knowledge Assessment (Q1–Q6): All six questions test comprehension of information contained in the IFU, including API endpoint identification, response structure, data handling procedures, version verification, and error handling.
HCP user group:
- Documentation review session phase: Per the summative evaluation protocol (R-TF-025-004, §6 and Session Overview), documentation review is a defined session phase in which HCP participants review the Clinical User Manual from the IFU before proceeding to simulated use scenarios. The protocol states that this documentation review is considered a critical task and will be evaluated during the summative evaluation to ensure users can effectively understand and apply the information provided.
- Simulated use Scenarios 1 and 2: Performed after documentation review. Success on these scenarios demonstrates that participants could effectively apply IFU information to correctly use the device.
- Knowledge Assessment (Q1–Q3): Tests comprehension of IFU content, including device report interpretation, malignancy probability reading, and condition identification.
Results
ITP user group:
- ITP-T-01 ("Access and read the IFU"): 18/18 (100%) success
- ITP Knowledge Assessment: 18/18 (100%) across all 6 questions — 0 use errors, 0 close calls, 0 use difficulties
HCP user group:
- Simulated use Scenarios 1 and 2 (performed after IFU documentation review): 18/18 (100%) success on both scenarios — 0 use errors, 0 close calls, 0 use difficulties
- Knowledge Assessment: Q1 94.4% OK, Q2 100% OK, Q3 100% OK
- Findings related to Q4 (intended purpose understanding) are addressed in the Root Cause Analysis and Residual Risk Assessment sections above.
Conclusion
The IFU is usable and effective for both intended user groups. ITP participants accessed, read, and applied IFU information with 100% success across all tasks and knowledge assessment questions. HCP participants applied Clinical User Manual information to correctly use the device with 100% success in simulated use scenarios, and demonstrated comprehension of IFU content through the knowledge assessment. The IFU is developed and maintained as a version-controlled code repository (R-TF-001-006), ensuring content consistency across supported languages and automated verification of structural integrity prior to each release. These results provide objective evidence per IEC 62366-1 §5.9 that the IFU supports safe and effective use of the device.
Effectiveness of Information for Safety
Per EN 62366-1:2015+AMD1:2020 §5.9 and ISO 14971 §7.4, information for safety — including warnings, precautions, and statements of intended purpose communicated through the IFU and labeling — must be validated as effective. This section presents the results and conclusions regarding the effectiveness of information for safety for both intended user groups, based on the summative evaluation data.
Assessment methods
The summative evaluation assessed information-for-safety effectiveness through the Knowledge Assessment, which tested participants' comprehension of safety-relevant content after reviewing the IFU and using the device. Each knowledge assessment question maps to a safety-relevant topic:
| Test item | Safety-relevant topic | User group |
|---|---|---|
| HCP Q1 ("What information does a device report show?") | Understanding of device output content — participants must correctly identify what the device produces to use it safely | HCP |
| HCP Q2 ("What is the probability of malignancy?") | Interpretation of risk stratification data — participants must correctly extract and interpret quantitative safety-relevant information from the report | HCP |
| HCP Q3 ("What conditions were detected?") | Identification of clinical findings — participants must correctly identify the conditions listed in the report | HCP |
| HCP Q4 ("Can the report act as a diagnosis?") | Understanding of non-diagnostic intended purpose — participants must understand that the device does not provide diagnosis | HCP |
| ITP Q1–Q6 | Comprehension of IFU technical content: IT professionals and system integrators must understand API endpoints, response structure, data handling, version verification, and error handling to implement the device safely | ITP |
Results
HCP user group:
- Q1 — Device output content (understanding what the device produces): 17/18 (94.4%) OK. One use difficulty (UP-01) involved incomplete enumeration of report elements under written test conditions; the participant demonstrated correct interpretation during simulated use.
- Q2 — Malignancy probability interpretation (reading quantitative risk data): 18/18 (100%) OK after scoring verification (see Scoring Verification above). All participants correctly extracted and interpreted the malignancy probability from the report.
- Q3 — Condition identification (identifying clinical findings): 18/18 (100%) OK. All participants correctly identified the conditions listed in the device report.
- Q4 — Non-diagnostic intended purpose (understanding the device does not diagnose): 13/18 (72.2%) OK on first attempt. Including close calls — participants who initially used imprecise language but self-corrected or qualified their response — 17/18 (94.4%) demonstrated correct understanding that the device is not intended for diagnosis. The root cause analysis (see Root Cause Analysis of Observed Use Problems) determined that the single use error (1/18, 5.6%) is an isolated, participant-specific finding that does not indicate a systematic safety information deficiency. The residual risk assessment (see Residual Risk Assessment) concluded that residual risk is acceptable.
ITP user group:
- Q1–Q6: 18/18 (100%) OK across all 6 questions and all 18 participants. Zero use errors, close calls, or use difficulties. Safety-relevant IFU information is fully effective for the ITP user group.
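The percentages reported above follow directly from the counts over n = 18. As a minimal sketch for recomputing them (the helper name `pct` and the dictionary layout are illustrative assumptions and not part of the evaluation tooling):

```python
def pct(count: int, n: int = 18) -> float:
    """Return count/n as a percentage rounded to one decimal place."""
    return round(100 * count / n, 1)


# Counts copied from the HCP results above (n = 18 participants).
hcp_ok_counts = {
    "Q1 OK": 17,
    "Q2 OK": 18,
    "Q3 OK": 18,
    "Q4 OK on first attempt": 13,
    "Q4 correct understanding": 17,
    "Q4 use error": 1,
}

for label, count in hcp_ok_counts.items():
    print(f"{label}: {count}/18 = {pct(count)}%")
```

Running the sketch reproduces the 94.4%, 100%, 72.2%, and 5.6% figures quoted in this section.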
Voluntary enhancement
Informed by the Q4 findings, where 5/18 HCP participants did not answer correctly on first attempt (4 of whom demonstrated correct understanding after qualification or self-correction), a dedicated "Important safety information" section has been added to the Clinical User Manual (IFU, § Clinical User Manual > Important Safety Information). This section presents the non-diagnostic nature of the device in a visually prominent warning format, explicitly stating that device outputs are clinical decision support information that must be interpreted by a qualified healthcare professional and do not constitute a clinical diagnosis. This voluntary enhancement was not required by the summative evaluation results (residual risk is acceptable) but demonstrates commitment to continuous safety improvement per EN 62366-1 and ISO 14971 §7.4.
Conclusion
Information for safety is effective for both intended user groups:
- Device output interpretation (Q1–Q3): Clearly effective. HCP participants demonstrated 94.4–100% correct comprehension of device output content, risk stratification data, and clinical findings.
- Non-diagnostic intended purpose (Q4): Effective for 94.4% of HCP participants when close calls (self-corrections per IEC 62366-1) are correctly classified. The root cause analysis and residual risk assessment demonstrate that the single use error (5.6%) is an isolated finding with acceptable residual risk, bounded by the device's clinical decision support architecture.
- ITP user group: Fully effective. 100% correct comprehension across all safety-relevant topics.
These results provide objective evidence per EN 62366-1 §5.9 that the information for safety communicated through the IFU and labeling is effective for both intended user groups.
Conclusion
The summative evaluation results for the device (v1.1.0.0) demonstrate safe and effective use by all intended user groups:
HCP Results (n=18)
- Perfect performance in simulated use scenarios (100% success for Scenarios 1 & 2)
- Strong knowledge assessment performance, with 72.2% (13/18) achieving perfect scores in Scenario 3
- Critical safety understanding with 72.2% correctly identifying that the device is not a standalone diagnostic tool on first attempt, rising to 94.4% when close calls (self-corrections) are included
- Root cause analysis of 6 observed use problems identified no systematic user interface design issues; the single use error (5.6%) is an isolated, participant-specific finding
- Residual risk assessed as acceptable per EN 62366-1:2015+AMD1:2020 §5.9: the device's clinical decision support architecture provides inherent safety, and no design modifications are required
- Balanced professional representation with nurses (55.6%), dermatologists (27.8%), and general practitioners (16.7%)
ITP Results (n=18)
- Perfect performance in ITP Use Scenario 1: Simulated Use (100% success across all 7 tasks)
- Perfect knowledge assessment with 100% answering all 6 questions correctly
- Zero use problems: No use errors, close calls, or use difficulties observed
- Diverse professional representation with Software Engineers (33.3%), DevOps Engineers (16.7%), Backend Developers (16.7%), Full Stack Developers (11.1%), API Integration Specialists (11.1%), and Systems Integrators (11.1%)
Overall Conclusion
Per IEC 62366-1:2015+AMD1:2020 §5.9 and GP-025 Usability and Human Factors Engineering, the summative evaluation — including root cause analysis of observed use problems, residual risk assessment, assessment of IFU usability, and assessment of effectiveness of information for safety — demonstrates that the device (v1.1.0.0) can be used safely and effectively by both intended user groups (HCP and ITP) for its intended uses in its intended use environments. The residual risk from observed use problems is acceptable, and no user interface design modifications are required.
User Satisfaction
Both user groups reported high satisfaction with the device:
- SUS scores exceed the "Excellent" threshold (>80.8) for both HCP (82.5) and ITP (85.2)
- AttrakDiff scores indicate positive perception across all dimensions (PQ, HQ, ATT > 1.0)
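For readers unfamiliar with how a SUS score is derived, the following is a minimal sketch of the standard SUS scoring rule (ten items rated 1 to 5; odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5). The function name `sus_score` and the sample responses are hypothetical; this section does not include per-participant responses, and it does not state which tooling was used to compute the reported means.

```python
def sus_score(responses: list[int]) -> float:
    """Compute a single participant's SUS score (0-100) from ten 1-5 ratings."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS requires exactly ten ratings between 1 and 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribution = rating - 1
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribution = 5 - rating
            total += 5 - rating
    return total * 2.5


# Hypothetical responses for illustration only (not study data).
example = [5, 1, 4, 2, 5, 1, 4, 2, 5, 2]
print(sus_score(example))
```

A group-level score such as those reported above would be the mean of the per-participant scores.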
The summative evaluation is complete and demonstrates conformity with IEC 62366-1:2015+AMD1:2020 requirements.
Signature meaning
The signatures for the approval of this document can be found in the verified commits in the QMS repository. For reference, the team members expected to participate in this document and their roles in the approval process, as defined in Annex I Responsibility Matrix of GP-001, are:
- Author: Team members involved
- Reviewer: JD-003 Design & Development Manager, JD-004 Quality Manager & PRRC
- Approver: JD-001 General Manager