R-TF-012-007 Formative evaluation plan_2023_001

Purpose

The formative evaluation plan, together with the summative evaluation plan documented in R-TF-012-014, outlines the structured approach for the usability evaluation of our medical device software, focusing on the user interface. It forms an integral part of the Usability Engineering Process per EN 62366-1, ensuring a comprehensive evaluation of usability throughout the design and development phases.

The goals are:

To systematically integrate usability considerations into the software design process.
To ensure that the software user interface is understandable and operable for the intended users.
To align with and document the usability engineering activities as per EN 62366-1 standards.

This process runs in parallel with the design and development process managed in the Design History File, but it focuses on the user interface. Its objective is to achieve, during the design and development stages, a software user interface that the intended users can understand and operate.

Usability engineering process integration

This plan is part of the Usability engineering process, which is summarised in the following chart.

The process begins with the initiation of the usability engineering process and progresses through critical steps such as the identification of user interface elements, the creation of the usability engineering file, and both formative and summative evaluation planning and execution. It incorporates various standards like ISO 2859-4:2020 and ISO 3951-2:2013 for evaluation justification. The process concludes with the finalization of key usability documents, ensuring a comprehensive approach to usability in our medical device software development.

This Plan is a key component of our Usability Engineering Process, guiding the evaluation activities and ensuring compliance with relevant standards.
It complements the Usability Report, providing a roadmap for achieving usability objectives and documenting the process to ensure a user-friendly and effective software interface.

Responsibilities

JD-003

To coordinate the entire design and development process.

JD-005

To ensure that the entire process of usability evaluation is carried out according to the methodology established in the present plan.

JD-004

To ensure that the usability evaluation is performed following this plan and that all the records required are properly generated, reviewed, approved and archived accordingly.

Terms and Definitions

DHF: Design History File
Formative evaluation: user interface evaluation conducted with the intent to explore user interface design strengths, weaknesses, and unanticipated use errors.
Hazard-related use scenario: use scenario that could lead to a hazardous situation or harm.
Primary operating function: the function that involves user interaction that is related to the safety of the medical device.
QMS: Quality Management System.
Summative evaluation: user interface evaluation conducted at the end of the user interface development with the intent to obtain objective evidence that the user interface can be used safely.
Usability: characteristic of the user interface that facilitates use and thereby establishes effectiveness, efficiency and user satisfaction in the intended use environment.
Use environment: actual conditions and settings in which users interact with the medical device.
Use error: user action or lack of user action while using the medical device that leads to a different result than that intended by the manufacturer or expected by the user.
Use scenario: a specific sequence of tasks performed by a specific user in a specific use environment and any resulting response of the medical device.
Use specification: summary of the important characteristics related to the context of use of the medical device.
User interface: means by which the user and the medical device interact.
User interface evaluation: the process by which the manufacturer explores or assesses the user interactions with the user interface.
User interface specification: a collection of specifications that comprehensively and prospectively describe the user interface of a medical device.

General information

The actions aimed at demonstrating the usability of the device are carried out following the standard UNE-EN 62366-1:2015/A1:2020 and the results are registered in the corresponding usability reports R-TF-012-008 Formative evaluation report and R-TF-012-015 Summative evaluation report.

The associated reports are included in the corresponding technical file and they must be updated when the main version of the medical device software changes during the product lifecycle leading to a new usability test.

Inputs

Risk management (according to procedure GP-013 Risk management). If we identify any new risks related to usability in the R-TF-013-002 Risk management record, we must create a new version of the R-TF-012-015 Summative evaluation report, collecting all this data, and generate a new strategy to comply with usability following the standard UNE-EN 62366-1:2015/A1:2020.
Feedback and complaints (according to procedure GP-014 Feedback and complaints) to look for new requirements that may require a product redesign related to usability.
The information generated by the post-market activities performed according to the GP-007 Post-market surveillance procedure will be considered to identify possible issues related to the usability of the medical device.
Any communication detected during the vigilance activities described in the GP-004 Vigilance system that can affect the usability of the product.
The clinical evaluation, which was performed according to the GP-015 Clinical evaluation procedure.
When the results require actions to prevent the usability of the device from being affected, the guidelines in the general procedure GP-006 "Non-conformity. Corrective and preventive actions" will be used to determine, design, and document those actions.

Outputs

To register the results of the formative evaluation, we create a report using the template R-TF-012-008 Formative evaluation report.
To register the results of the summative evaluation, we create a report using the template R-TF-012-015 Summative evaluation report.
A new entry will be generated in the R-TF-002-004 Annual management review report by the corresponding management referring to these changes in the QMS documentation.

Formative evaluation plan

This section details our approach to the formative evaluation of the software, focusing on the identification and rectification of potential usability problems in early development stages, in line with EN 62366-1, 5.7.2.

Purpose

The purpose of the formative evaluation plan is to identify potential usability problems and rectify them before the summative evaluation phase.

Characterization of the medical device

Intended purpose: Its principal function is to provide a wide range of clinical data from the analyzed images to assist healthcare practitioners in their clinical evaluations and allow healthcare provider organisations to gather data and improve their workflows.
Intended user: The intended users are healthcare organisations. More concretely, health care professionals (HCP) and, for the formative evaluation, the technical team or IT department of healthcare organisations.
Patient population: The device is intended for use on images of skin from patients presenting visible skin structure abnormalities, across all age groups, skin types, and demographics.
Operating principle: The device is a computational medical tool leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures.
Body structure: epidermis, its appendages (hair, hair follicle, sebaceous glands, apocrine sweat gland apparatus, eccrine sweat gland apparatus and nails) and associated mucous membranes (conjunctival, oral and genital), the dermis, the cutaneous vasculature and the subcutaneous tissue (subcutis).
Indications: the device is intended for assessing skin structures (dermatoses).

Use specification

Target user group

This section describes the intended user interaction with our medical device and the intended user's profile. It also justifies focusing our formative evaluation exclusively on IT technicians, as opposed to healthcare professionals or patients.

Given the technical nature of the device and its mode of operation as an API, the formative evaluation is appropriately focused on IT technicians. This decision aligns with the intended use of the device and ensures that our usability efforts are concentrated on the most relevant user group, thereby maximizing the impact and relevance of our usability findings.

Nature of user interaction

Device usage: Our device functions as an API that is accessed programmatically. It is not designed for direct interaction by healthcare professionals (HCPs) or patients.
HCP interaction: Healthcare professionals benefit indirectly from the device's functionalities. However, they do not interact directly with the API. Instead, their interaction occurs through integrated systems like EHR or EMR, which are managed by IT professionals.
Patient interaction: Patients are not direct users of the device. Their care may be influenced by the device's outputs, but they have no direct engagement with the API.

Justification for target user group

IT technicians as primary interface users: the users of our device who interact with the interface are IT professionals who possess the technical and programming skills necessary to integrate and manage the API within healthcare systems.
Advanced technical requirements: Direct interaction with the API requires advanced technical and programming knowledge, which is beyond the typical skill set of HCPs.
Role of HCPs and patients: While HCPs and patients are important stakeholders, their interaction with the device is mediated through systems managed by IT professionals, not through direct use of the API itself.
Focus of usability testing: Consequently, our formative evaluation focuses on IT technicians who are responsible for integrating and managing the device within healthcare systems. This approach ensures that the device is usable and effective for its primary user group, contributing to the overall efficiency and effectiveness of healthcare delivery.

User profile

Intended users are software programmers who work in the healthcare field and have the following technical proficiencies:

Programming Languages: The API follows REST principles, so knowledge of a web development language such as JavaScript, Python, or Java is expected. These languages are commonly used to interact with REST APIs due to their extensive support for HTTP request methods.
Technologies:
- RESTful API Standards: Users should be familiar with RESTful design principles and how to consume APIs using standard HTTP methods (GET, POST, PUT, DELETE).
- JSON: Knowledge of data formats like JSON for handling API responses is crucial since REST APIs typically communicate in this format.
Tools:
- Development Environments: Familiarity with integrated development environments (IDEs) like Visual Studio Code, Eclipse, or others that facilitate API development and testing.
- Version Control Systems: Understanding of version control systems like Git, which are essential for managing changes to the API's codebase, especially in team environments.
- API Testing Tools: Proficiency with tools like Postman or Swagger could be beneficial. These tools are used for testing API endpoints, understanding the API structure, and generating documentation.
Security Practices: Knowledge of basic security practices for API integration, such as handling authentication (e.g., API keys), ensuring data encryption, and safeguarding against common vulnerabilities (SQL injection, cross-site scripting).
Compliance and Standards: Familiarity with healthcare-related IT standards and regulations (e.g., FHIR, HIPAA) that might impact how the API is implemented and used, especially in terms of data handling and patient privacy.
Experience: at least 5 years of experience as a programmer in projects that involve consuming HTTP APIs.
Output comprehension: ability to understand the outputs provided by the API.

Use environment

The device is intended to be used in the setting of healthcare organisations and their IT departments, which commonly are situated inside hospitals or other clinical facilities, although not always.

The formative evaluation will be performed in the healthcare organisations' offices where the intended users (IT practitioners) usually execute their work.

It is expected that users will set aside specific time to use the product in an environment with minimal noise and distractions.

Identification of the elements of the user interface

The user interface of the device comprises 3 elements:

API Endpoints: URL structures, expected HTTP methods, and status codes.
API Documentation: Descriptive guides, use cases, sample requests, and responses.
Data Structure: JSON payload formats, including field names and FHIR nomenclature.

These are the testable design and technical requirements of the user interface of the device.

What is the "interface" of an API?

When testing the usability of an API-centric medical device such as ours, the "interface" comprises its endpoints, documentation, error messages, and data formats. Usability for such systems often hinges on the clarity of the documentation, the intuitiveness of the API design, and the robustness of the system in handling various data scenarios. This usability plan will ensure that all these aspects are considered and evaluated.

Methodology

The methodology for the formative evaluation plan is two-fold:

Scenario-based evaluation: utilizing specific scenarios (Integration, Data Transaction, Error Handling) to assess various aspects of usability.
User-centered approach: involving actual or representative users in the evaluation process.

According to the usability standard EN 62366-1, it is recommended to perform formative evaluation iteratively so that we can identify user interaction problems and implement effective solutions before the summative evaluation.

We define the number of iterations of the formative evaluation based on the results of the first formative evaluation. More specifically, if the first formative evaluation yields satisfactory results according to the established criteria, no further iteration will be performed and we will proceed with the summative evaluation.

The formative evaluation will be performed with the legacy device (API integration only) since it is the prototype of the current device.

The formative evaluation will be performed remotely by the users defined above and there will be no recordings of the sessions.

We will send the evaluation questionnaire, which includes instructions on the tasks, to the selected users and we will analyze the results once all the users have completed the tasks and filled in the evaluation questionnaire.

The results of the formative evaluation will be documented, evaluated and discussed in the R-TF-012-008 Formative evaluation report.

Scenarios

We consider three different scenarios for the formative evaluation test, and evaluate accordingly:

Integration: measure the success and the blocking issues faced during a typical third-party integration.
- Success integrating
- Ease of integration
Data transaction: test the process of sending and receiving data, evaluating the accuracy and reliability of the transmission.
- Success in sending data
- Success in receiving data
- Full integrity of the data
Error Handling: intentionally induce errors to assess how well they're communicated and how easily they can be rectified.
- Data Missing error shows relevant error message
- Access denied error shows relevant error message
- Image quality error shows relevant error message
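
As a minimal sketch of what the "Full integrity of the data" check in the data transaction scenario could look like on the tester's side, the following Python example compares a digest of the image before it is serialized with a digest of the image recovered from the request body. The JSON field name "image" and the use of base64 encoding are assumptions made for illustration only; the actual request format is the one described in the IFU.

```python
import base64
import hashlib
import json


def build_payload(image_bytes: bytes) -> str:
    """Serialize an image into a JSON body (hypothetical 'image' field)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})


def payload_preserves_image(image_bytes: bytes, body: str) -> bool:
    """Return True if the image recovered from the body equals the original."""
    recovered = base64.b64decode(json.loads(body)["image"])
    return hashlib.sha256(recovered).digest() == hashlib.sha256(image_bytes).digest()


# Stand-in bytes; a real test would read an actual image file.
original = bytes(range(256)) * 4
body = build_payload(original)
print(payload_preserves_image(original, body))
```

The same comparison can be applied to any data echoed back by the device, which gives the tester an objective basis for the 1 to 5 integrity rating.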

Integration vs. integration

Keep in mind that, in this context, integration testing does not mean the same thing as in the record R-TF-012-006 Lifecycle plan and report. In that record, integration testing refers to checking whether new code contains errors or clashes with existing code.

Here, in the R-TF-012-007 Formative evaluation plan, the word integration refers to the fact that the device is meant to be integrated into other software, such as EHRs and EMRs. As such, the integration test in the context of usability evaluates our customers' effort to integrate the device.

Evaluation questionnaire

This section lays out the content of the questionnaire. As you can see, the questionnaire includes the explanations, questions and tasks that the tester must carry out:

Confirm access to IFU
- Confirm that you have access to the instructions for use and that you have reviewed them. If you don't have access, please reach out to the person who sent you this form. If you have the instructions but have not reviewed them, please review them now.
  - Yes, I confirm that I have access and that I have reviewed the instructions for use
Instructions
- To test the usability of the device, follow these steps:
  1. Create an environment similar to the environment in which you would integrate the device. You can use a virtual machine, a docker container or simply your localhost.
  2. Write a script to send data to the API, and to get back the response. This should be a very minimal and simple function.
  3. Call the function and see if it works.
  4. If you run into an issue, please review your code to make sure that you did not make a mistake.
- This should take you between 30 and 90 minutes
Scenario 1: Integration
- Objective:
  - Evaluate the success and potential issues faced during third-party integration
- Questions:
  - Did you succeed at integrating the device into a system?
    - Yes
    - No
  - Ease of integration: rate ease of integration from 1 to 5 (1 being very difficult and 5 being very easy)
    - 1 (Very difficult)
    - 2
    - 3
    - 4
    - 5 (Very easy)
  - Comments (optional): provide any comments or explanations about Success Integrating.
    - Open text field
  - Was the IFU understandable and complete to support you in this use case?
    - Yes
    - No
Scenario 2: Data transaction
- Objective:
  - Test the sending and receiving of data, and evaluate the accuracy and reliability of the transmission
- Questions:
  - Success Sending Data: Did you succeed at sending data to the device? Rate the success of sending data from 1 to 5 (1 being unsuccessful and 5 being completely successful)
    - 1 (Unsuccessful)
    - 2
    - 3
    - 4
    - 5 (Completely successful)
  - Provide any comments or explanations about Success of sending data
    - Open text field
  - Success Receiving Data: Did you succeed at receiving data from the device? Rate the success of receiving data from 1 to 5 (1 being unsuccessful and 5 being completely successful)
    - 1 (Unsuccessful)
    - 2
    - 3
    - 4
    - 5 (Completely successful)
  - Provide any comments or explanations about Success of receiving data
    - Open text field
  - Integrity of data. By integrity, we mean the lack of data loss or corruption. Was the data correct? Rate integrity of data from 1 to 5 (1 being corrupted or lost and 5 being full data integrity)
    - 1 (All data was corrupted or lost)
    - 2
    - 3
    - 4
    - 5 (Full data integrity)
  - Was the IFU understandable and complete to support you in this use case?
    - Yes
    - No
Scenario 3: Error handling
- Objective:
  - Intentionally induce errors to assess how well they are communicated and how easily they can be rectified.
- Instructions:
  - Now, you must do something that may feel a little bit strange: we want you to break the integration. More specifically, we want you to replicate 3 situations:
    1. Send a request to the device where some basic data is missing. For example, don't send an image in the request.
    2. Send a request to the device without adding the API key.
    3. Send a request to the device with a very bad image. For instance, an image that is 50x50 pixels, or that is all black.
- Questions:
  - Data Missing Error: Did the error message explain that the issue was that some data was missing?
    - Yes
    - No
  - Provide any comments or explanations about Data Missing Error
    - Open text field
  - Access Denied Error: Did the error message explain that the issue was that the access was denied?
    - Yes
    - No
  - Provide any comments or explanations about Access Denied Error
    - Open text field
  - Poor Quality Image Error: Did the error message explain that the issue was that the image was of very poor quality?
    - Yes
    - No
  - Provide any comments or explanations about Poor Quality Image Error
    - Open text field
  - Was the IFU understandable and complete to support you in this use case?
    - Yes
    - No
Overall feedback:
- Please provide any additional comments or suggestions that you think would help improve the usability of our medical device.
  - Open text field
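
To illustrate the kind of "very minimal and simple function" the instructions ask for, and the three situations of the error handling scenario, a tester's script might look like the following Python sketch. The endpoint URL, the "X-Api-Key" header name and the "image" field are hypothetical placeholders, not the device's actual interface; the real values are those given in the IFU. The all-zero byte string is only a stand-in for a tiny or all-black image file.

```python
import base64
import json
import urllib.error
import urllib.request

# Hypothetical values: replace with those given in the instructions for use.
API_URL = "https://api.example.com/v1/analyze"
API_KEY_HEADER = "X-Api-Key"


def build_request(image_bytes, api_key):
    """Describe a POST request; passing None omits the image or the API key."""
    payload = {}
    if image_bytes is not None:
        payload["image"] = base64.b64encode(image_bytes).decode("ascii")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers[API_KEY_HEADER] = api_key
    return {"url": API_URL, "headers": headers,
            "body": json.dumps(payload).encode("utf-8")}


def send(spec):
    """Send a request description and return (status code, response text)."""
    req = urllib.request.Request(spec["url"], data=spec["body"],
                                 headers=spec["headers"], method="POST")
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Error responses carry the message the tester is asked to judge.
        return err.code, err.read().decode("utf-8")


image = b"placeholder for the bytes of a real image file"
cases = {
    "nominal": build_request(image, "my-api-key"),
    "data_missing": build_request(None, "my-api-key"),       # situation 1
    "access_denied": build_request(image, None),             # situation 2
    "poor_quality": build_request(bytes(50 * 50 * 3), "my-api-key"),  # situation 3
}
```

Calling send(cases["data_missing"]) and reading the returned message is what allows the tester to answer whether the error explained that some data was missing; the other two cases work the same way.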

Privacy notice

The questionnaire also includes a privacy notice informing testers that we will process their data to collect information. The notice also explains that they can exercise their rights under GDPR and further review our privacy policy.

Evaluation criteria

The evaluation utilizes a structured form embedded in a web-based platform for ease of data collection and analysis.

Criteria for success are clearly defined for each scenario, with a focus on ease of use, error handling, and data integrity.

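The minimum average scores required for each criterion are listed below. Criteria with a range from 0 to 1 correspond to the Yes/No questions of the questionnaire, scored 1 for Yes and 0 for No.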

#	Scenario	Range from	Range to	Min. Avg. Score
1	Success integrating	0	1	1
2	Ease of integration	1	5	4
3	Success sending data	1	5	4
4	Success receiving data	1	5	4
5	Full integrity of the data	1	5	5
6	'Data Missing' error shows relevant message	0	1	1
7	'Access denied' error shows relevant message	0	1	1
8	'Quality Image' error shows relevant message	0	1	1
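
The comparison against these thresholds can be sketched in Python as follows. The thresholds are copied from the table above; the three sets of answers are invented purely for illustration and are not evaluation results. Yes/No answers are assumed to be coded as 1 and 0.

```python
from statistics import mean

# Minimum average score per criterion, copied from the table above.
MIN_AVG_SCORE = {
    "Success integrating": 1,
    "Ease of integration": 4,
    "Success sending data": 4,
    "Success receiving data": 4,
    "Full integrity of the data": 5,
    "'Data Missing' error shows relevant message": 1,
    "'Access denied' error shows relevant message": 1,
    "'Quality Image' error shows relevant message": 1,
}


def evaluate(responses):
    """Return {criterion: (average score, passed)} for a list of answer dicts."""
    results = {}
    for criterion, threshold in MIN_AVG_SCORE.items():
        average = mean(answers[criterion] for answers in responses)
        results[criterion] = (average, average >= threshold)
    return results


def make_answers(ease, sending, receiving, integrity):
    """Hypothetical tester who answered Yes (1) to every Yes/No question."""
    answers = {criterion: 1 for criterion in MIN_AVG_SCORE}
    answers.update({
        "Ease of integration": ease,
        "Success sending data": sending,
        "Success receiving data": receiving,
        "Full integrity of the data": integrity,
    })
    return answers


# Invented answers from three testers, for illustration only.
responses = [make_answers(4, 5, 5, 5), make_answers(5, 5, 4, 5), make_answers(3, 4, 5, 4)]
results = evaluate(responses)
for criterion, (average, passed) in results.items():
    print(f"{criterion}: {average:.2f} {'PASS' if passed else 'FAIL'}")
```

Note that a minimum average equal to the top of the range (1 for the Yes/No criteria, 5 for data integrity) is only met when every tester gives the top score, so in this invented example a single integrity rating of 4 makes that criterion fail.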

Sampling

The process of determining the sample size for usability testing of our device involves a methodical approach, grounded in statistical principles and tailored to the specific context of our device's usage. This section outlines the steps taken to arrive at the appropriate sample size.

Steps of the sampling process

Understanding the User Base
- Identification of Target Users: Our target user base comprises IT professionals in healthcare organizations with experience in integrating APIs into EHR and EMR systems.
- Skill Level Assessment: Recognizing the high level of technical expertise among potential users, which influences the likelihood of quickly identifying usability issues.
Reference to Standards
- ISO 2859-4:2020 and ISO 3951-2:2013: Adapting principles from both standards to suit the usability testing context of a software API.
Risk Assessment
- Evaluating Impact of Non-Conformities: Considering the potential impact that usability issues could have on the integration and operation of medical systems.
Practical Considerations
- Resource Constraints: Acknowledging limitations in time and budget that influence the feasibility of conducting large-scale usability testing.
- Efficiency vs. Effectiveness: Balancing the need for a comprehensive assessment against the need to conduct the testing process efficiently.
Statistical Rationale
- Diminishing Returns in Usability Testing: Based on usability testing research, understanding that a smaller number of users typically identify the majority of significant issues.
- Qualitative over Quantitative: Focusing on in-depth, qualitative feedback rather than a broad, quantitative approach.
Sampling Methodology
- Purposive Sampling: Selecting participants who match the end-user profile to ensure relevant and effective feedback.
- Representation: Ensuring the sample represents a diverse cross-section of the user base in terms of expertise and organizational context.

Sample size

The resulting minimum sample size for the usability testing of the device is 3 participants, selected following a non-random, purposive sampling method. This is a balanced decision considering the expertise of the user base, the principles of ISO 2859-4:2020, the qualitative nature of the testing, and practical constraints.

Justification based on ISO 2859-4:2020

ISO 2859-4:2020 offers guidance on determining sample sizes for various quality inspections. While this standard is primarily used for product inspections, its principles can be adapted for usability testing. This standard advocates for a risk-based approach where the sample size is chosen considering the potential impact of non-conformities.

In our case, the impact of usability issues can be significant, affecting the integration and operation of medical systems. However, since the users are highly skilled IT professionals, the risk of misunderstanding or misusing the API is lower than it would be with a general user base.

Justification based on ISO 3951-2:2013

When a low AQL is set and the high skill level of the participants is taken into account, the ISO 3951-2:2013 tables indicate that a small sample size is sufficient for each set of functionalities.

When considering the API as a whole, with multiple sets of functionalities, a cumulative sample size of 3 participants emerges as sufficient. This is because each participant tests multiple "lots", and their combined testing covers the entire API.

What are "lots" for us?

Each "lot" in our context represents a distinct set of functionality or API endpoints to be tested. While ISO 3951-2:2013 is typically used for physical product inspections, we adapt its principles for software usability.

This approach considers the high skill level of the users, the qualitative nature of the feedback, the criticality of the device, and the low AQL. The result is a focused and effective testing process that aligns with both the practical considerations of software development and the rigorous standards of quality inspection.

Practicality in Context

The choice of a small sample size, such as 3, is also driven by practical considerations:

Expertise of Users: The test participants are not general users but professionals with technical expertise in API integration. Their high level of skill and experience means that fewer participants are needed to identify the majority of usability issues.
Qualitative Nature of Feedback: Usability testing in this context is qualitative rather than quantitative. A few experienced users can provide in-depth insights that are more valuable than surface-level feedback from a larger group.
Resource Constraints: Considering time and budget limitations, a smaller sample allows for a more focused and manageable testing process.

Statistical Considerations

From a statistical perspective, while a larger sample size can provide more data, diminishing returns are observed in usability testing. Research in usability testing methodologies indicates that the first few users (typically around 3 to 5) tend to identify the majority of significant issues. Beyond this, new users often report the same issues, offering diminishing value.
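
The diminishing returns can be illustrated with a simple model that is commonly cited in the usability literature (often attributed to Nielsen and Landauer), in which the share of problems found by n users is 1 - (1 - p)^n. This plan does not name a specific model, so its use here is an assumption; the value p = 0.31 is a frequently quoted average for the probability that a single user reveals a given problem, and the real value depends on the product and the users.

```python
def share_of_problems_found(n_users: int, p: float = 0.31) -> float:
    """Expected share of usability problems found by n_users testers.

    p is the assumed probability that one tester reveals a given problem.
    """
    return 1 - (1 - p) ** n_users


for n in range(1, 6):
    print(n, round(share_of_problems_found(n), 2))
```

Under this assumption, 3 participants would be expected to reveal roughly two thirds of the problems and 5 participants about 84%, with each additional participant contributing less than the previous one.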

Sampling Design

The sampling of the population for the verification, validation, usability, bench, and clinical performance tests of the device has been designed as follows:

Target Population Identification: The target population consists of IT professionals in healthcare organizations who are experienced in integrating APIs into EHR and EMR systems.
Sampling Method: The selection is purposive, targeting professionals who directly match the end users' profiles. This approach ensures relevance and effectiveness in identifying usability issues.
Representation: The sample, while small, represents a cross-section of the potential user base, including variations in expertise and organizational contexts.

Application of statistical methods

Non-Random Sampling: Given the specialized nature of the user base, a non-random, purposive sampling method is used.
Feedback Analysis: Qualitative analysis is employed to understand the nuances of each piece of feedback in depth, rather than relying purely on quantitative metrics.

Record signature meaning

Author: JD-004
Reviewer: JD-003
Approver: JD-005
