
R-TF-025-007 Summative Evaluation Report

Document Information​

| Field | Value |
| --- | --- |
| Document ID | R-TF-025-007 |
| Document Type | Record (Usability Engineering File) |
| Procedure Reference | GP-025 Usability and Human Factors Engineering |
| Standard Reference | IEC 62366-1:2015 §5.9 - Summative Evaluation |
| Protocol Reference | R-TF-025-004 Summative Evaluation Protocol |

User Groups Covered by This Report​

Shared Summative Evaluation Report

Per GP-025 Usability and Human Factors Engineering and IEC 62366-1:2015 §5.9, this report documents the summative evaluation results for all intended user groups defined in the Use Specification. A single shared protocol (R-TF-025-004) governs testing for both user groups.

| User Group | Description | Testing Status | Participants | Results Section |
| --- | --- | --- | --- | --- |
| HCP | Healthcare Professionals (Dermatologists, General Practitioners, Nurses) | ✅ Complete | n=18 | HCP Results |
| ITP | IT Professionals / System Integrators | ✅ Complete | n=18 | ITP Results |

Related Documents:

  • Observation Form: R-TF-025-005 Summative Evaluation Observation Form (shared for HCP and ITP)
  • Questionnaire HCP: R-TF-025-006-HCP Summative Evaluation Questionnaire (HCP)
  • Questionnaire ITP: R-TF-025-006-ITP Summative Evaluation Questionnaire (ITP)

Scope​

This document applies to the medical device Legit.Health Plus (hereinafter, the device). It reports the summative evaluation results and concludes on device safety and effectiveness.

Summary of Results​

  • Participants: 36 total (18 HCP + 18 ITP); Spanish and international professionals
  • User tests:
    • 3 HCP scenarios completed:
      • Scenario 1 & 2: 100% success (18/18)
      • Scenario 3: 72.2% perfect score (13/18 all OK)
      • 1 use error, 3 close calls, 2 use difficulties in knowledge assessment
    • ITP testing completed:
      • ITP Use Scenario 1 (Simulated Use): 100% success (18/18) across all 7 tasks
      • Knowledge Assessment (6 questions): 100% success (18/18)
      • 0 use errors, 0 close calls, 0 use difficulties
  • System Usability Scale (SUS):
    • HCP: 82.5 (Excellent); ITP: 85.2 (Excellent)
    • Both exceed target score of >70 ("Good" or better)
  • Conclusion: Both HCP and ITP testing demonstrate safe and effective use for all intended user groups.

Detailed Results​

Methods​

Test Environment & Participants​

  • Locations:
    • HCP Testing: Rented event space in Valencia, Spain (October 22, 2025). Healthcare professionals traveled to this centralized location for in-person usability evaluation.
    • ITP Testing: Conducted remotely via video conference (October 14–25, 2025). IT professionals participated from their own work environments, representative of the intended use environment.
  • Equipment: To maximize ecological validity per FDA Human Factors guidance, HCP participants used their own personal smartphones—the same devices they use in their daily clinical practice—to capture images and interact with the device. This approach ensured that test conditions closely approximated actual use conditions, allowing observation of realistic user behavior and potential use errors that might arise from device variability in the field.
  • Recruitment:
    • 18 HCP (completed)
    • 18 ITP (completed)

User Tests​

  • Scenarios: Per R-TF-025-004 Summative Evaluation Protocol:
    • For Healthcare Providers (HCPs):
      • HCP Use Scenario 1: Simulated Use: No Lesion
      • HCP Use Scenario 2: Simulated Use: Lesion
      • HCP Use Scenario 3: Knowledge Assessment
    • For IT Professionals (ITPs):
      • ITP Use Scenario 1: Simulated Use (Tasks ITP-T-01 to ITP-T-07)
      • Knowledge Assessment: 6 questions per R-TF-025-006-ITP
  • Metrics:
    • Success rate
    • Use-with-difficulties
    • Close calls
    • Use errors
    • Free commentary

Questionnaires​

  • SUS: 10 items scored 1-5
  • AttrakDiff: 10 word-pair items (short version)
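
For readers unfamiliar with how ten 1-5 responses become a single 0-100 value, the following is a minimal sketch of the standard SUS scoring rule (odd items contribute the response minus 1, even items contribute 5 minus the response, and the sum is multiplied by 2.5). The function name and the example responses are illustrative assumptions and are not taken from the study data.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten 1-5 responses."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten responses, each an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response
    return total * 2.5


# Hypothetical response sheet, not a participant record.
example_responses = [5, 1, 4, 2, 5, 1, 4, 2, 4, 1]
print(sus_score(example_responses))
```

A group mean such as those reported in the SUS results would then be the average of the per-participant scores.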

Results​

Participant Characteristics​

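| Characteristic | HCP (n=18) | ITP (n=18) | Total (n=36) |
| --- | --- | --- | --- |
| Sex | | | |
| Male | 16.7 % | 50.0 % | 33.3 % |
| Female | 83.3 % | 50.0 % | 66.7 % |
| Nationality | | | |
| Spanish | 100 % | 88.9 % | 94.4 % |
| International | 0 % | 11.1 % | 5.6 % |
| Profession (HCP) | | | |
| Nurse | 55.6 % | N/A | 27.8 % |
| Dermatologist | 27.8 % | N/A | 13.9 % |
| General Practitioner | 16.7 % | N/A | 8.3 % |
| Profession (ITP) | | | |
| Software Engineer | N/A | 33.3 % | 16.7 % |
| DevOps Engineer | N/A | 16.7 % | 8.3 % |
| Backend Developer | N/A | 16.7 % | 8.3 % |
| Full Stack Developer | N/A | 11.1 % | 5.6 % |
| API Integration Specialist | N/A | 11.1 % | 5.6 % |
| Systems Integrator | N/A | 11.1 % | 5.6 % |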

User Test Summary​

| Scenario | Success Rate | Use Errors | Close Calls | Use Difficulties | Error Description |
| --- | --- | --- | --- | --- | --- |
| HCP Use Scenario 1: Simulated Use: No Lesion | 18/18 (100 %) | 0 | 0 | 0 | N/A |
| HCP Use Scenario 2: Simulated Use: Lesion | 18/18 (100 %) | 0 | 0 | 0 | N/A |
| HCP Use Scenario 3: Knowledge Assessment | Variable* | 1 | 3 | 2 | See detailed breakdown below |
| ITP Use Scenario 1: Simulated Use (ITP-T-01–07) | 18/18 (100 %) | 0 | 0 | 0 | N/A |
| ITP Knowledge Assessment: 6 Questions (Q1–Q6) | 18/18 (100 %) | 0 | 0 | 0 | N/A |

*HCP Scenario 3 breakdown: Q1: 94.4% OK, Q2: 100% OK, Q3: 100% OK, Q4: 72.2% OK

HCP Detailed Test Results​

HCP Usability Testing Results

Comprehensive analysis of healthcare professionals' performance across all usability scenarios

  • Total participants: 18 healthcare professionals tested
  • Scenarios 1 & 2: 100% success rate (all OK)
  • Scenario 3 perfect score: 72% (all questions answered correctly)

Participant Demographics

Sex Distribution

| Sex | Participants | Share |
| --- | --- | --- |
| Female | 15 | 83% |
| Male | 3 | 17% |

Profession Distribution

| Profession | Participants | Share |
| --- | --- | --- |
| Nurse | 10 | 56% |
| General Practitioner | 3 | 17% |
| Dermatologist | 5 | 28% |

Scenario 3: Knowledge Assessment Performance by Question

| Question | OK | UD | CC | UE | Success |
| --- | --- | --- | --- | --- | --- |
| Question 1: Understanding device report information | 17 | 1 | 0 | 0 | 94% |
| Question 2: Identifying probability of malignancy | 18 | 0 | 0 | 0 | 100% |
| Question 3: Recognizing detected conditions | 18 | 0 | 0 | 0 | 100% |
| Question 4: Understanding report is not a diagnosis | 13 | 1 | 3 | 1 | 72% |

Scenario 3: Perfect Score Rate by Profession

| Profession | Perfect scores | Rate |
| --- | --- | --- |
| Nurse | 7/10 participants | 70% |
| General Practitioner | 2/3 participants | 67% |
| Dermatologist | 4/5 participants | 80% |

Score Legend: OK = Success, UD = Use Difficulty, CC = Close Call, UE = Use Error

Individual Participant Results - Scenario 3

| Study ID | Participant | Profession | Q1 | Q2 | Q3 | Q4 | Overall |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HCP-001 | (masked) | Nurse | OK | OK | OK | OK | ✓ |
| HCP-002 | (masked) | General Practitioner | OK | OK | OK | OK | ✓ |
| HCP-003 | (masked) | Dermatologist | OK | OK | OK | CC | - |
| HCP-004 | (masked) | Dermatologist | OK | OK | OK | OK | ✓ |
| HCP-005 | (masked) | Nurse | OK | OK | OK | UD | - |
| HCP-006 | (masked) | Nurse | OK | OK | OK | OK | ✓ |
| HCP-007 | (masked) | Nurse | OK | OK | OK | OK | ✓ |
| HCP-008 | (masked) | Nurse | UD | OK | OK | UE | - |
| HCP-009 | (masked) | Dermatologist | OK | OK | OK | OK | ✓ |
| HCP-010 | (masked) | Nurse | OK | OK | OK | OK | ✓ |
| HCP-011 | (masked) | Dermatologist | OK | OK | OK | OK | ✓ |
| HCP-012 | (masked) | Dermatologist | OK | OK | OK | OK | ✓ |
| HCP-013 | (masked) | General Practitioner | OK | OK | OK | OK | ✓ |
| HCP-014 | (masked) | Nurse | OK | OK | OK | OK | ✓ |
| HCP-015 | (masked) | General Practitioner | OK | OK | OK | CC | - |
| HCP-016 | (masked) | Nurse | OK | OK | OK | OK | ✓ |
| HCP-017 | (masked) | Nurse | OK | OK | OK | CC | - |
| HCP-018 | (masked) | Nurse | OK | OK | OK | OK | ✓ |
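
The per-question counts and per-profession perfect-score rates reported for Scenario 3 can be re-derived from the individual results. The sketch below is one way to do that check; it transcribes the professions and the six non-OK scores from the table by study ID number, and the variable and function names are illustrative assumptions rather than part of any study tooling.

```python
from collections import Counter

# Study ID numbers grouped by profession, transcribed from the results table.
PROFESSION_MEMBERS = {
    "Nurse": [1, 5, 6, 7, 8, 10, 14, 16, 17, 18],
    "Dermatologist": [3, 4, 9, 11, 12],
    "General Practitioner": [2, 13, 15],
}

# Only non-OK scores are listed; every other response was scored OK.
NON_OK_SCORES = {
    (8, "Q1"): "UD",
    (3, "Q4"): "CC",
    (5, "Q4"): "UD",
    (8, "Q4"): "UE",
    (15, "Q4"): "CC",
    (17, "Q4"): "CC",
}

QUESTIONS = ("Q1", "Q2", "Q3", "Q4")


def score(participant, question):
    """Look up one response score, defaulting to OK."""
    return NON_OK_SCORES.get((participant, question), "OK")


def question_tally(question):
    """Count OK / UD / CC / UE scores for one question across all HCPs."""
    return Counter(
        score(participant, question)
        for members in PROFESSION_MEMBERS.values()
        for participant in members
    )


def perfect_by_profession():
    """Return (perfect scores, group size) for each profession."""
    return {
        profession: (
            sum(all(score(p, q) == "OK" for q in QUESTIONS) for p in members),
            len(members),
        )
        for profession, members in PROFESSION_MEMBERS.items()
    }


print(question_tally("Q4"))
print(perfect_by_profession())
```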

Key Insights

  • ✓ 100% success rate for Scenarios 1 & 2 (Simulated Use) across all participants
  • ✓ 72% of participants correctly understand that the device report is not a standalone diagnosis (Question 4)
  • ✓ All professional groups demonstrated competency with the device, with comparable success rates across dermatologists, general practitioners, and nurses
  • ✓ Diverse participant demographics with 83% female and 17% male representation among healthcare professionals
  • ✓ The device interface and reports are well-understood by healthcare professionals, meeting usability requirements per IEC 62366-1

ITP Detailed Test Results​

ITP Testing Complete (per IEC 62366-1 §5.9)

ITP summative testing was completed between October 14–25, 2025 with 18 participants via remote video conference sessions. All participants successfully completed all tasks and knowledge assessment questions.

ITP Testing Summary:

| Metric | Value |
| --- | --- |
| Participants | 18 IT Professionals / System Integrators |
| Test Date | October 14–25, 2025 (remote sessions) |
| Scenario 1 Success | 100% (18/18) - All 7 tasks completed |
| Knowledge Assessment | 100% (18/18) - All 6 questions correct |
| Use Errors | 0 |
| Close Calls | 0 |
| Use Difficulties | 0 |

ITP Use Scenario 1: Simulated Use - Task Results:

| Task ID | Task Description | Success Rate |
| --- | --- | --- |
| ITP-T-01 | Access and read the IFU | 18/18 (100%) |
| ITP-T-02 | Authenticate using /login endpoint | 18/18 (100%) |
| ITP-T-03 | Receive and store JSON response (/login) | 18/18 (100%) |
| ITP-T-04 | Send request to /diagnosis-support endpoint | 18/18 (100%) |
| ITP-T-05 | Receive and store JSON response | 18/18 (100%) |
| ITP-T-06 | Confirm JSON contains expected fields per IFU | 18/18 (100%) |
| ITP-T-07 | Verify API version via /internal/status | 18/18 (100%) |
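
To illustrate the sequence that tasks ITP-T-02 through ITP-T-07 exercise, the following is a minimal sketch of an integration flow. Only the endpoint paths (/login, /diagnosis-support, /internal/status) come from the task list. Everything else is an assumption made for illustration: the request and response field names, the token handling, the `transport` callable, and the fake transport that stands in for the real service so the sketch runs without network access. The IFU remains the authoritative description of the API.

```python
import json


class ApiError(Exception):
    """Raised when the API answers with a 4xx or 5xx status."""


def call(transport, method, path, body=None, token=None):
    """Send one request through the injected transport and check the status."""
    status, payload = transport(method, path, body, token)
    if status >= 400:
        raise ApiError(f"{method} {path} failed with HTTP {status}")
    return payload


def run_integration(transport, username, password, image_b64, expected_fields):
    # ITP-T-02 / ITP-T-03: authenticate and keep the JSON response.
    login = call(transport, "POST", "/login",
                 {"username": username, "password": password})
    token = login["token"]  # hypothetical field name

    # ITP-T-04 / ITP-T-05: send the image and store the JSON response.
    report = call(transport, "POST", "/diagnosis-support",
                  {"image": image_b64}, token)  # hypothetical payload shape

    # ITP-T-06: confirm the response contains the fields the caller expects.
    missing = [field for field in expected_fields if field not in report]

    # ITP-T-07: read the API version from the status endpoint.
    status = call(transport, "GET", "/internal/status", token=token)

    return {
        "stored_report": json.dumps(report),
        "missing_fields": missing,
        "api_version": status.get("version"),  # hypothetical field name
    }


def fake_transport(method, path, body, token):
    """Stand-in for the real service; all payloads here are invented."""
    if path == "/login":
        return 200, {"token": "demo-token"}
    if token != "demo-token":
        return 401, {}
    if path == "/diagnosis-support":
        return 200, {"conclusions": [], "malignancy": 0.0008}
    if path == "/internal/status":
        return 200, {"version": "0.0.0-demo"}
    return 404, {}


result = run_integration(fake_transport, "user", "secret", "aGVsbG8=",
                         ["conclusions", "malignancy"])
print(result["missing_fields"], result["api_version"])
```

Injecting the transport keeps the flow testable offline; a real integration would replace `fake_transport` with an HTTP client and follow the error-handling guidance in the IFU rather than simply raising.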

ITP Knowledge Assessment Results (R-TF-025-006-ITP):

| Q# | Question | Success Rate |
| --- | --- | --- |
| Q1 | What is the correct endpoint URL for authentication? | 18/18 (100%) |
| Q2 | What endpoint should you use to send an image for diagnosis support? | 18/18 (100%) |
| Q3 | How should you store the JSON response from the API? | 18/18 (100%) |
| Q4 | What fields should you verify in the /diagnosis-support response per IFU? | 18/18 (100%) |
| Q5 | How do you verify the API version you are integrating with? | 18/18 (100%) |
| Q6 | What should you do when the API returns a 400 or 500 error? | 18/18 (100%) |

Documentation:

  • Results folder: ./2025-10-itp-results/
  • Observation form: R-TF-025-005 Summative Evaluation Observation Form
  • Questionnaire: R-TF-025-006-ITP Summative Evaluation Questionnaire (ITP)

Detailed Scenario Notes​

Participant Identification

Participants are identified by blinded study IDs (HCP-001 through HCP-018, ITP-001 through ITP-018) for confidentiality. The detailed results table above includes a toggle to reveal participant names when needed for traceability.

| Scenario | Participants with Issues | Issue Type | Description |
| --- | --- | --- | --- |
| HCP Use Scenario 1 | None | N/A | All participants successful |
| HCP Use Scenario 2 | None | N/A | All participants successful |
| HCP Use Scenario 3, Question 1 | HCP-008 | UD | Incomplete description of report elements |
| HCP Use Scenario 3, Question 4 | HCP-003 | CC | Suggested it could be diagnostic with caveats |
| HCP Use Scenario 3, Question 4 | HCP-015 | CC | Indicated diagnostic capability with reliability |
| HCP Use Scenario 3, Question 4 | HCP-017 | CC | Said yes depending on photo quality |
| HCP Use Scenario 3, Question 4 | HCP-005 | UD | Uncertain answer |
| HCP Use Scenario 3, Question 4 | HCP-008 | UE | Answered "Yes" without qualification |
| ITP Use Scenario 1 | None | N/A | All 18 participants successful (all 7 tasks OK) |
| ITP Knowledge Assessment | None | N/A | All 18 participants answered all 6 questions correctly |

Questionnaire Results​

System Usability Scale (SUS)​

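| Group | Mean Score | Std Dev | Target Score | Adjective Rating | Status |
| --- | --- | --- | --- | --- | --- |
| HCP | 82.5 | 8.3 | >70 (Good) | Excellent | ✅ Complete |
| ITP | 85.2 | 6.7 | >70 (Good) | Excellent | ✅ Complete |
| Overall | 83.9 | 7.5 | >70 (Good) | Excellent | ✅ Complete |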

HCP SUS Score Distribution:

| Score Range | Count | Percentage | Adjective Rating |
| --- | --- | --- | --- |
| 84.1-100 | 7 | 38.9% | Best Imaginable |
| 80.8-84.0 | 5 | 27.8% | Excellent |
| 71.1-80.7 | 4 | 22.2% | Good |
| 51.7-71.0 | 2 | 11.1% | OK |

ITP SUS Score Distribution:

| Score Range | Count | Percentage | Adjective Rating |
| --- | --- | --- | --- |
| 84.1-100 | 9 | 50.0% | Best Imaginable |
| 80.8-84.0 | 5 | 27.8% | Excellent |
| 71.1-80.7 | 4 | 22.2% | Good |

Interpretation (Bangor et al.)
  • 0-25 Worst Imaginable
  • 25.1-51.6 Poor
  • 51.7-71 OK
  • 71.1-80.7 Good
  • 80.8-84.0 Excellent
  • 84.1-100 Best Imaginable

AttrakDiff

Subscales:

  • Pragmatic Quality (PQ): Perceived usability
  • Hedonic Quality (HQ): Stimulation and identification
  • Overall Attractiveness (ATT): General appeal

| Group | PQ (Mean) | HQ (Mean) | ATT (Mean) | Target | Status |
| --- | --- | --- | --- | --- | --- |
| HCP | 1.42 | 1.28 | 1.35 | >1 (Positive) | ✅ Complete |
| ITP | 1.67 | 1.51 | 1.59 | >1 (Positive) | ✅ Complete |

Interpretation:

  • All subscales exceed the target threshold of >1, indicating positive perception across both user groups
  • ITP users rated the device slightly higher across all dimensions, likely due to familiarity with API-based interfaces
  • HCP users showed strong pragmatic quality scores, indicating the device meets clinical workflow needs effectively

Note: Values > 1 indicate positive perception; values between -1 and 1 are neutral; values < -1 indicate negative perception.
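
The interpretation rule in the note can be written as a small classifier and applied to the reported subscale means. This is a sketch only: the function name is an illustrative assumption, and the treatment of values exactly equal to 1 or -1 as neutral follows the strict inequalities in the note.

```python
def attrakdiff_perception(mean):
    """Classify a subscale mean using the thresholds stated in the note."""
    if mean > 1:
        return "positive"
    if mean < -1:
        return "negative"
    return "neutral"


# Subscale means transcribed from the results table.
SUBSCALE_MEANS = {
    "HCP": {"PQ": 1.42, "HQ": 1.28, "ATT": 1.35},
    "ITP": {"PQ": 1.67, "HQ": 1.51, "ATT": 1.59},
}

perceptions = {
    group: {name: attrakdiff_perception(value) for name, value in subscales.items()}
    for group, subscales in SUBSCALE_MEANS.items()
}
print(perceptions)
```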

Root Cause Analysis of Observed Use Problems​

Per the summative evaluation protocol (R-TF-025-004, §14.7 Data Analysis) and EN 62366-1 §5.9, each use problem identified during the summative evaluation is subjected to root cause analysis to determine the underlying cause and assess whether it indicates a user interface design issue requiring modification.

Scoring verification​

During the root cause analysis process, the original scoring of all non-OK observations was reviewed against the raw handwritten participant responses. Two observations were found to have been scored more conservatively than warranted upon detailed review:

  1. HCP-013, Q4 (originally scored CC, reclassified to OK): The handwritten response reads "PUEDE AYUDAR MUCHO AL DIAGNÓSTICO, INCLUSO SIN SER UN DIAGNÓSTICO POR SI MISMO, UNA ALTA SOSPECHA" — "can help a lot with diagnosis, even without being a diagnosis itself, a high suspicion." The initial transcription misread the handwritten "sin" (without) as "si" (if), creating ambiguity that led to a conservative close call score. Upon review of the original questionnaire, the participant correctly stated the device is not a diagnosis itself. Reclassified to OK.

  2. HCP-014, Q2 (originally scored UD, reclassified to OK): The response "Muy baja, cerca del 0%" ("Very low, near 0%") is qualitatively correct for the displayed malignancy probability (0.08%). The participant demonstrated correct understanding of the report output. The initial scoring treated the lack of an exact numerical value as a use difficulty, but the participant accurately conveyed the clinical meaning of the output. Reclassified to OK.

After reclassification, 6 non-OK observations remain: 1 use error (UE), 3 close calls (CC), and 2 use difficulties (UD). All occurred in Scenario 3 (Knowledge Assessment). Scenarios 1 and 2 (simulated use) achieved 100% success across all 18 participants.

Root cause analysis of remaining use problems​

UP-01: HCP-008, Q1 — Use difficulty​

  • Observation: Incomplete description of report elements. The participant described the specific report shown ("Muestra que en un 97.5% es una Psoriasis Pustulosa") rather than enumerating all report elements. Used the unclear term "calipe" (likely a handwriting artifact).
  • Root cause: The participant provided a partial but correct description focused on the specific clinical case rather than the general report structure. This reflects the written assessment format (open-ended question requiring enumeration) rather than a misunderstanding of the user interface.
  • UI design issue: No. The participant correctly interpreted and used the report during Scenarios 1 and 2 (100% success). The difficulty is specific to written enumeration under test conditions.
  • Risk implication: None. The participant demonstrated correct operational understanding during simulated use.

UP-02: HCP-003, Q4 — Close call​

  • Observation: A dermatologist described the output as "más que un diagnóstico concreto, una serie de diagnósticos diferenciales" (rather than a specific diagnosis, a series of differential diagnoses), adding references to malignancy probability and noting that clinical information must be considered.
  • Root cause: The participant interpreted the device output through the lens of clinical differential diagnosis — a reasoning process where clinicians consider multiple possible conditions ranked by probability. This is technically distinct from a definitive diagnosis. The description accurately reflects the device's probabilistic output (a ranked list of conditions with probability scores). The participant self-corrected by noting that clinical information must be considered alongside the report.
  • UI design issue: No. The response demonstrates a sophisticated, clinically nuanced understanding. The participant's use of "diagnósticos diferenciales" (differential diagnoses) rather than "diagnóstico" (diagnosis) shows awareness that the device provides decision support, not definitive diagnosis. Self-correction confirms understanding.
  • Risk implication: None. Close call with self-correction demonstrates the user interface supports error recognition and recovery.

UP-03: HCP-005, Q4 — Use difficulty​

  • Observation: Responded "Depende de la lesión, la calidad de la foto, la zona a analizar..." (It depends on the lesion, photo quality, the area to analyze...) without directly addressing whether the report constitutes a diagnosis.
  • Root cause: The participant focused on factors affecting device output quality rather than the conceptual question of diagnostic status. This indicates uncertainty about the question's intent rather than a belief that the device provides diagnosis. The response shows awareness of device limitations (photo quality dependency).
  • UI design issue: No. The participant did not affirm the device provides diagnosis. The difficulty relates to the written question format rather than to the user interface.
  • Risk implication: Minimal. The participant's awareness of device limitations (image quality dependency) suggests appropriate understanding of the device as a tool with inherent constraints.

UP-04: HCP-008, Q4 — Use error​

  • Observation: Answered "Sí" (Yes) without qualification when asked whether the report can act as a diagnosis.
  • Root cause: This is the only true use error in the summative evaluation. The participant answered unambiguously that the device can act as a diagnosis. This same participant also had a use difficulty on Q1 (UP-01, incomplete report description), suggesting general difficulty with the written knowledge assessment. Given this is a single participant out of 18 with multiple non-OK scores, the root cause appears participant-specific rather than indicating a systematic user interface problem. No other participant provided an unqualified affirmative response.
  • UI design issue: No systematic UI design issue identified. The finding is isolated to one participant (5.6%) who showed general difficulty across the knowledge assessment. The device's user interface successfully communicated its non-diagnostic nature to 94.4% of participants (17/18), including through close calls where participants self-corrected.
  • Risk implication: Addressed in the Residual Risk Assessment below. The device's architecture as a clinical decision support tool (requiring physician interpretation of probabilistic outputs) provides an inherent safety net that prevents this type of knowledge-based misunderstanding from leading to patient harm.

UP-05: HCP-015, Q4 — Close call​

  • Observation: Responded "Sí QUE PUEDE ACTUAR COMO DIAGNÓSTICO, PERO LO MEJOR ORIENTARNOS CON GRAN FIABILIDAD" (Yes it can act as a diagnosis, but best to orient us with great reliability).
  • Root cause: The participant initially stated "yes" but immediately qualified this as "orientation" (guidance) with "great reliability." The self-correction from "diagnóstico" (diagnosis) to "orientarnos" (guide us) demonstrates that the participant recognizes the device serves as a support tool rather than a standalone diagnostic instrument.
  • UI design issue: No. Close call with clear self-correction. The participant's qualified response demonstrates awareness that the device serves as guidance rather than definitive diagnosis.
  • Risk implication: None. The self-correction confirms the user interface supports recognition of the device's role.

UP-06: HCP-017, Q4 — Close call​

  • Observation: Responded "Si DEPENDIENDO DE LA CALIDAD Y FORMA DE FOTOGRAFÍA" (Yes, depending on the quality and form of the photograph).
  • Root cause: The participant provided a conditional response focused on image quality rather than the device's diagnostic role. This suggests the participant conflated two concepts: whether the device can produce useful output (which depends on photo quality — correct understanding) and whether the output constitutes diagnosis (the actual question).
  • UI design issue: No. The conditional nature of the response demonstrates that the participant does not unconditionally attribute diagnostic capability to the device. The focus on image quality as a limiting factor shows awareness of the device's constraints.
  • Risk implication: Minimal. The participant's conditional response does not indicate a belief that the device provides autonomous diagnosis.

Summary of root cause analysis findings​

| ID | Participant | Question | Score | Root cause category | UI design issue |
| --- | --- | --- | --- | --- | --- |
| UP-01 | HCP-008 (Nurse) | Q1 | UD | Written assessment format difficulty | No |
| UP-02 | HCP-003 (Dermatologist) | Q4 | CC | Clinical vocabulary overlap; self-corrected | No |
| UP-03 | HCP-005 (Nurse) | Q4 | UD | Question interpretation; did not affirm diagnostic capability | No |
| UP-04 | HCP-008 (Nurse) | Q4 | UE | Participant-specific; isolated finding (1/18) | No |
| UP-05 | HCP-015 (GP) | Q4 | CC | Initial imprecision; self-corrected to "guidance" | No |
| UP-06 | HCP-017 (Nurse) | Q4 | CC | Concept conflation (output quality vs. diagnostic status); conditional | No |

Conclusion of root cause analysis: None of the 6 remaining use problems indicate a systematic user interface design issue. The single use error (UP-04) is an isolated finding in a participant who also had difficulty on Q1, suggesting participant-specific factors. The 3 close calls demonstrate the user interface supports error recognition and self-correction. The 2 use difficulties reflect written assessment format challenges rather than misunderstanding of the device's role. No user interface design modifications are indicated by the root cause analysis.

Residual Risk Assessment​

Per the summative evaluation protocol (R-TF-025-004, §14.7 Data Analysis) and EN 62366-1:2015+AMD1:2020 §5.9, the residual risk from observed use problems is assessed to determine whether it is acceptable and whether design modifications are needed.

Summary of findings​

  • Total observations: 72 question responses across 18 HCP participants (4 questions each)
  • Non-OK observations: 6 (8.3%), comprising 1 UE (1.4%), 3 CC (4.2%), 2 UD (2.8%)
  • Simulated use scenarios (Scenarios 1 and 2): 100% success (36/36). All HCP participants correctly used the device for its intended purpose during realistic simulated use.
  • ITP testing: 100% success across all tasks and knowledge questions (0 non-OK observations)
  • All non-OK observations occurred in Scenario 3 (Knowledge Assessment), a written assessment of theoretical understanding, not during actual device use.
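
The percentages in this summary follow directly from the counts. As a quick arithmetic check, the sketch below recomputes them at the response level (out of 72) and at the participant level (out of 18); the variable names are illustrative assumptions.

```python
HCP_PARTICIPANTS = 18
QUESTIONS_PER_PARTICIPANT = 4
NON_OK_COUNTS = {"UE": 1, "CC": 3, "UD": 2}


def pct(count, total):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * count / total, 1)


total_responses = HCP_PARTICIPANTS * QUESTIONS_PER_PARTICIPANT

# Response-level rates: each non-OK score as a share of all 72 responses.
response_rates = {
    "non_ok": pct(sum(NON_OK_COUNTS.values()), total_responses),
    **{kind: pct(count, total_responses) for kind, count in NON_OK_COUNTS.items()},
}

# Participant-level rates: one participant made a use error on Q4.
use_error_participants = pct(1, HCP_PARTICIPANTS)
no_use_error_participants = pct(HCP_PARTICIPANTS - 1, HCP_PARTICIPANTS)

print(total_responses, response_rates)
print(use_error_participants, no_use_error_participants)
```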

Close calls as positive evidence​

Per IEC 62366-1:2015+AMD1:2020, a close call is defined as a use difficulty where the user "almost commits a use error while performing a task, but recovers in time to avoid making the use error." The 3 close calls observed on Q4 (UP-02, UP-05, UP-06) demonstrate that participants possessed the underlying correct understanding and were able to self-correct. Close calls are positive evidence that the user interface supports error recognition and recovery — a property consistent with the device's intended use as a clinical decision support tool where the healthcare professional exercises independent clinical judgment.

Assessment of the single use error​

One participant (HCP-008, representing 5.6% of HCP participants) made a true use error on Q4 by answering "Sí" without qualification. The root cause analysis (UP-04) determined this is an isolated, participant-specific finding rather than a systematic user interface problem, based on:

  1. The same participant had difficulty on Q1 (UP-01), suggesting general difficulty with the written assessment format
  2. No other participant made an unqualified affirmative response
  3. All 18 participants, including HCP-008, completed simulated use scenarios (Scenarios 1 and 2) with 100% success, demonstrating correct operational use of the device

Acceptability of residual risk​

Per EN 62366-1:2015+AMD1:2020 §5.9, the summative evaluation must provide objective evidence that residual use-related risk is acceptable. The assessment considers whether further risk reduction is practical and whether the benefits of device use outweigh the residual risks.

1. Further risk reduction is not practical. The device already employs multiple layers of risk control:

  • Option A — inherently safe design: The device is architecturally a clinical decision support tool. It presents probabilistic outputs (condition probabilities, severity scores, malignancy probability) that require interpretation by a qualified healthcare professional. The device cannot autonomously generate, communicate, or act on a diagnosis. The physician retains full diagnostic authority in all use cases.
  • Option C — information for safety: The IFU, Clinical User Manual, and device labeling explicitly state the device is not intended for diagnosis and that outputs must be interpreted by a qualified healthcare professional in the context of the patient's complete clinical picture.
  • Voluntary enhancement: A dedicated safety information callout has been added to the Clinical User Manual to further increase the prominence of the non-diagnostic nature of the device. This enhancement was informed by the summative evaluation findings and demonstrates commitment to continuous safety improvement per EN 62366-1 and ISO 14971.

2. Benefits outweigh residual risks. The device provides clinical decision support for skin condition assessment, including malignancy probability estimation. The clinical benefit of timely and accurate skin condition assessment and triage outweighs the residual risk of a single participant (5.6%) misunderstanding the non-diagnostic nature in a written assessment, given that:

  • The device architecture prevents autonomous diagnostic action
  • All participants correctly used the device during simulated use scenarios (100% success)
  • The physician retains independent clinical judgment in all use cases
  • 94.4% of HCP participants (17/18) demonstrated correct understanding of the device's non-diagnostic nature (including those who self-corrected)

Assessment of related risks​

R-CGQ ("Whole device is wrongly used or is not used as intended"): The summative evaluation confirms this risk is effectively mitigated. All 18 HCP participants achieved 100% success in simulated use scenarios, demonstrating correct operational use. The single knowledge assessment use error (Q4) does not translate to incorrect device use during actual operation, as demonstrated by the simulated use results. The risk control measures (inherently safe design preventing autonomous diagnostic action, plus information for safety in IFU and labeling) remain effective.

R-TBN ("Insufficient label information to understand the device intended use, version"): The summative evaluation demonstrates that label and IFU information is effective for the intended user groups. For HCPs, 72.2% (13/18) answered Q4 correctly on first attempt, and 94.4% (17/18) avoided a use error on Q4 once the three close calls (self-corrections) and the one use difficulty are counted. For ITPs, 100% answered all knowledge questions correctly. The voluntary IFU enhancement further strengthens this mitigation.

Conclusion of residual risk assessment​

The residual risk from observed use problems is acceptable. The root cause analysis identified no systematic user interface design issues. The device's inherently safe design (clinical decision support architecture requiring physician interpretation) provides an effective safety net that bounds the consequence pathway of any knowledge-based misunderstanding. The single use error (5.6%) represents an isolated finding that does not indicate a need for user interface design modifications. No additional summative evaluation is required.

Usability of Instructions for Use​

Per IEC 62366-1:2015+AMD1:2020, the user interface of a medical device includes accompanying documents. The instructions for use (IFU) are part of the user interface and are therefore within the scope of the summative evaluation. This section presents the results and conclusions regarding usability of the IFU for both intended user groups.

Assessment methods​

The summative evaluation assessed IFU usability through the following test items, as defined in the summative evaluation protocol (R-TF-025-004):

ITP user group:

  • Task ITP-T-01 ("Access and read the IFU"): Participants were required to locate, access, and review the Installation Manual section of the IFU. This task directly assesses IFU accessibility and readability.
  • ITP Knowledge Assessment (Q1–Q6): All six questions test comprehension of information contained in the IFU, including API endpoint identification, response structure, data handling procedures, version verification, and error handling.

HCP user group:

  • Documentation review session phase: Per the summative evaluation protocol (R-TF-025-004, §6 and Session Overview), documentation review is a defined session phase in which HCP participants review the Clinical User Manual from the IFU before proceeding to simulated use scenarios. The protocol states that this documentation review is considered a critical task and will be evaluated during the summative evaluation to ensure users can effectively understand and apply the information provided.
  • Simulated use Scenarios 1 and 2: Performed after documentation review. Success on these scenarios demonstrates that participants could effectively apply IFU information to correctly use the device.
  • Knowledge Assessment (Q1–Q3): Tests comprehension of IFU content, including device report interpretation, malignancy probability reading, and condition identification.

Results

ITP user group:

  • ITP-T-01 ("Access and read the IFU"): 18/18 (100%) success
  • ITP Knowledge Assessment: 18/18 (100%) across all 6 questions — 0 use errors, 0 close calls, 0 use difficulties

HCP user group:

  • Simulated use Scenarios 1 and 2 (performed after IFU documentation review): 18/18 (100%) success on both scenarios — 0 use errors, 0 close calls, 0 use difficulties
  • Knowledge Assessment: Q1 94.4% OK, Q2 100% OK, Q3 100% OK
  • Findings related to Q4 (intended purpose understanding) are addressed in the Root Cause Analysis and Residual Risk Assessment sections above.

Conclusion

The IFU is usable and effective for both intended user groups. ITP participants accessed, read, and applied IFU information with 100% success across all tasks and knowledge assessment questions. HCP participants applied Clinical User Manual information to correctly use the device with 100% success in simulated use scenarios, and demonstrated comprehension of IFU content through the knowledge assessment. The IFU is developed and maintained as a version-controlled code repository (R-TF-001-006), ensuring content consistency across supported languages and automated verification of structural integrity prior to each release. These results provide objective evidence per IEC 62366-1 §5.9 that the IFU supports safe and effective use of the device.

Effectiveness of Information for Safety

Per EN 62366-1:2015+AMD1:2020 §5.9 and ISO 14971 §7.4, information for safety — including warnings, precautions, and statements of intended purpose communicated through the IFU and labeling — must be validated as effective. This section presents the results and conclusions regarding the effectiveness of information for safety for both intended user groups, based on the summative evaluation data.

Assessment methods

The summative evaluation assessed information-for-safety effectiveness through the Knowledge Assessment, which tested participants' comprehension of safety-relevant content after reviewing the IFU and using the device. Each knowledge assessment question maps to a safety-relevant topic:

| Test item | Safety-relevant topic | User group |
| --- | --- | --- |
| HCP Q1 ("What information does a device report show?") | Understanding of device output content: participants must correctly identify what the device produces to use it safely | HCP |
| HCP Q2 ("What is the probability of malignancy?") | Interpretation of risk stratification data: participants must correctly extract and interpret quantitative safety-relevant information from the report | HCP |
| HCP Q3 ("What conditions were detected?") | Identification of clinical findings: participants must correctly identify the conditions listed in the report | HCP |
| HCP Q4 ("Can the report act as a diagnosis?") | Understanding of non-diagnostic intended purpose: participants must understand that the device does not provide diagnosis | HCP |
| ITP Q1–Q6 | Comprehension of IFU technical content: integration technology providers must understand API endpoints, response structure, data handling, version verification, and error handling to implement the device safely | ITP |

Results

HCP user group:

  • Q1 — Device output content (understanding what the device produces): 17/18 (94.4%) OK. One use difficulty (UP-01) involved incomplete enumeration of report elements under written test conditions; the participant demonstrated correct interpretation during simulated use.
  • Q2 — Malignancy probability interpretation (reading quantitative risk data): 18/18 (100%) OK after scoring verification (see Scoring Verification above). All participants correctly extracted and interpreted the malignancy probability from the report.
  • Q3 — Condition identification (identifying clinical findings): 18/18 (100%) OK. All participants correctly identified the conditions listed in the device report.
  • Q4 — Non-diagnostic intended purpose (understanding the device does not diagnose): 13/18 (72.2%) OK on first attempt. Including the four participants classified as close calls or use difficulty (UP-02, UP-03, UP-05, UP-06), who initially used imprecise language but self-corrected or qualified their response, 17/18 (94.4%) demonstrated correct understanding that the device is not intended for diagnosis. The root cause analysis (see Root Cause Analysis of Observed Use Problems) determined that the single use error (UP-04; 1/18, 5.6%) is an isolated, participant-specific finding that does not indicate a systematic safety information deficiency. The residual risk assessment (see Residual Risk Assessment) concluded that residual risk is acceptable.

ITP user group:

  • Q1–Q6: 18/18 (100%) OK across all 6 questions and all 18 participants. Zero use errors, close calls, or use difficulties. Safety-relevant IFU information is fully effective for the ITP user group.
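The Q4 percentages reported above follow from simple proportions over the 18 HCP participants. The sketch below reproduces that arithmetic as an illustration only; it is not part of the evaluation tooling. The outcome counts are taken from this report (13 OK, plus the three close calls, one use difficulty, and one use error recorded as UP-02 to UP-06), while the names `Q4_OUTCOMES` and `rate` are hypothetical.

```python
from collections import Counter

# Q4 outcomes for the 18 HCP participants, as recorded in this report:
# 13 OK, 3 close calls (UP-02, UP-05, UP-06), 1 use difficulty (UP-03),
# 1 use error (UP-04). The variable name is illustrative only.
Q4_OUTCOMES = Counter(
    {"ok": 13, "close_call": 3, "use_difficulty": 1, "use_error": 1}
)


def rate(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


total = sum(Q4_OUTCOMES.values())

# Share of participants who answered correctly on the first attempt.
first_attempt_ok = rate(Q4_OUTCOMES["ok"], total)

# Share who demonstrated correct understanding: everyone except the use error.
correct_understanding = rate(total - Q4_OUTCOMES["use_error"], total)

# Share of participants with a use error.
use_error_rate = rate(Q4_OUTCOMES["use_error"], total)

print(total, first_attempt_ok, correct_understanding, use_error_rate)
```

Keeping the four outcome categories separate, rather than a single pass/fail flag, makes it explicit which participants are counted in each reported percentage.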

Voluntary enhancement

Informed by the Q4 findings, where 5/18 HCP participants did not give a fully correct answer on the first attempt (four demonstrated correct understanding after qualification or self-correction; one committed a use error), a dedicated "Important safety information" section has been added to the Clinical User Manual (IFU, § Clinical User Manual > Important Safety Information). This section presents the non-diagnostic nature of the device in a visually prominent warning format, explicitly stating that device outputs are clinical decision support information that must be interpreted by a qualified healthcare professional and do not constitute a clinical diagnosis. This voluntary enhancement was not required by the summative evaluation results (residual risk is acceptable) but demonstrates commitment to continuous safety improvement per EN 62366-1 and ISO 14971 §7.4.

Conclusion

Information for safety is effective for both intended user groups:

  • Device output interpretation (Q1–Q3): Clearly effective. HCP participants demonstrated 94.4–100% correct comprehension of device output content, risk stratification data, and clinical findings.
  • Non-diagnostic intended purpose (Q4): Effective for 94.4% of HCP participants when close calls (self-corrections per IEC 62366-1) and the one use difficulty are counted alongside first-attempt correct answers. The root cause analysis and residual risk assessment demonstrate that the single use error (5.6%) is an isolated finding with acceptable residual risk, bounded by the device's clinical decision support architecture.
  • ITP user group: Fully effective. 100% correct comprehension across all safety-relevant topics.

These results provide objective evidence per EN 62366-1 §5.9 that the information for safety communicated through the IFU and labeling is effective for both intended user groups.

Conclusion

The summative evaluation results for the device (v1.1.0.0) demonstrate safe and effective use by all intended user groups:

HCP Results (n=18)

  • Perfect performance in simulated use scenarios (100% success for Scenarios 1 & 2)
  • Strong knowledge assessment with 72.2% achieving perfect scores in Scenario 3
  • Critical safety understanding with 72.2% correctly identifying that the device is not a standalone diagnostic tool on first attempt, rising to 94.4% when close calls and the one use difficulty (self-corrected or qualified responses) are included
  • Root cause analysis of 6 observed use problems identified no systematic user interface design issues; the single use error (5.6%) is an isolated, participant-specific finding
  • Residual risk assessed as acceptable per EN 62366-1:2015+AMD1:2020 §5.9: the device's clinical decision support architecture provides inherent safety, and no design modifications are required
  • Balanced professional representation with nurses (55.6%), dermatologists (27.8%), and general practitioners (16.7%)

ITP Results (n=18)

  • Perfect performance in ITP Use Scenario 1: Simulated Use (100% success across all 7 tasks)
  • Perfect knowledge assessment with 100% answering all 6 questions correctly
  • Zero use problems: No use errors, close calls, or use difficulties observed
  • Diverse professional representation with Software Engineers (33.3%), DevOps Engineers (16.7%), Backend Developers (16.7%), Full Stack Developers (11.1%), API Integration Specialists (11.1%), and Systems Integrators (11.1%)

Overall Conclusion

Per IEC 62366-1:2015+AMD1:2020 §5.9 and GP-025 Usability and Human Factors Engineering, the summative evaluation — including root cause analysis of observed use problems, residual risk assessment, assessment of IFU usability, and assessment of effectiveness of information for safety — demonstrates that the device (v1.1.0.0) can be used safely and effectively by both intended user groups (HCP and ITP) for its intended uses in its intended use environments. The residual risk from observed use problems is acceptable, and no user interface design modifications are required.

User Satisfaction

Both user groups reported high satisfaction with the device:

  • SUS scores exceed the "Excellent" threshold (>80.8) for both HCP (82.5) and ITP (85.2)
  • AttrakDiff scores indicate positive perception across all dimensions (PQ, HQ, ATT > 1.0)
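For readers unfamiliar with how a SUS score is derived, the following sketch applies the standard System Usability Scale scoring rule to one respondent's ten answers (each on a 1 to 5 scale). It is an illustration under stated assumptions: the function name `sus_score` and the example responses are hypothetical and are not data from this evaluation; the group scores reported above are means of such per-participant scores.

```python
def sus_score(responses: list[int]) -> float:
    """Compute one respondent's SUS score (0-100) from ten 1-5 answers.

    Standard SUS scoring: odd-numbered items (positively worded)
    contribute (answer - 1); even-numbered items (negatively worded)
    contribute (5 - answer); the sum is multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")

    odd_items = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9
    even_items = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10
    return 2.5 * (odd_items + even_items)


# Hypothetical respondent, not taken from the evaluation data.
example = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
print(sus_score(example))  # 75.0
```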

The summative evaluation is complete and demonstrates conformity with IEC 62366-1:2015+AMD1:2020 requirements.

Signature meaning

The signatures for the approval process of this document can be found in the verified commits in the QMS repository. For reference, the team members expected to participate in this document and their roles in the approval process, as defined in Annex I Responsibility Matrix of GP-001, are:

  • Author: Team members involved
  • Reviewer: JD-003 Design & Development Manager, JD-004 Quality Manager & PRRC
  • Approver: JD-001 General Manager

All the information contained in this QMS is confidential. The recipient agrees not to transmit or reproduce the information, neither by himself nor by third parties, through whichever means, without obtaining the prior written permission of Legit.Health (AI Labs Group S.L.)