R-TF-015-006 Clinical investigation report LEGIT.HEALTH_DAO_Derivación_O_2022
Research Title
Pilot study for the clinical validation of an artificial intelligence algorithm to optimize the appropriateness of dermatology referrals.
Product Identification
Information | |
---|---|
Device name | Legit.Health Plus (hereinafter, the device) |
Model and type | NA |
Version | 1.0.0.0 |
Basic UDI-DI | 8437025550LegitCADx6X |
Certificate number (if available) | MDR 792790 |
EMDN code(s) | Z12040192 (General medicine diagnosis and monitoring instruments - Medical device software) |
GMDN code | 65975 |
Class | Class IIb |
Classification rule | Rule 11 |
Novel product (True/False) | FALSE |
Novel related clinical procedure (True/False) | FALSE |
SRN | ES-MF-000025345 |
Promoter Identification and Contact
Manufacturer data | |
---|---|
Legal manufacturer name | AI Labs Group S.L. |
Address | Street Gran Vía 1, BAT Tower, 48001, Bilbao, Bizkaia (Spain) |
SRN | ES-MF-000025345 |
Person responsible for regulatory compliance | Alfonso Medela, María Diez, Giulia Foglia |
Email | office@legit.health |
Phone | +34 638127476 |
Trademark | Legit.Health |
CIP Identification
- Title: Pilot study for the clinical validation of a CAD (Computer-Aided Diagnosis; DAO in its Spanish acronym) system with artificial intelligence algorithms to optimize the appropriateness of referrals to dermatology.
- Protocol code: LEGIT.HEALTH_DAO_Derivation_O_2022
- Study Design: Prospective observational analytical study of a clinical case series with longitudinal character.
- Investigational Product: Legit.Health
- Version and Date: Version 1.0, date 07-04-2022
Public Access Database
Please note that the database used in this study is not publicly accessible due to privacy and confidentiality considerations.
Research Team
- Osakidetza
- Dr. Jesus Gardeazabal
- Dr. Rosa Mª Izu
- AI Labs Group S.L. (Legit.Health)
- Alfonso Medela
- Taig Mac Carthy
Compliance Statement
The clinical investigation was performed according to the Clinical Investigation Plan (CIP) and other applicable guidance and regulations. This includes compliance with:
- Harmonized standard UNE-EN ISO 14155:2021
- Regulation (EU) 2017/745 on medical devices (MDR)
- Harmonized standard UNE-EN ISO 13485:2016
- Regulation (EU) 2016/679 (GDPR)
- Spanish Organic Law 3/2018 on the Protection of Personal Data and guarantee of digital rights
All data processing within the device is carried out in accordance with the highest standards of data protection and privacy. Patient information is managed in an encrypted manner to ensure confidentiality and security.
The research team assumes the role of Data Controller, responsible for the collection and management of study data. Legit.Health acts as the Data Processor and is not involved in the processing of patient data.
The storage and transfer of data comply with European data protection regulations. At the conclusion of the study, all information stored in the device will be permanently and securely deleted.
The device employs robust technical and organizational security measures to safeguard personal data against unauthorized access, alteration, loss, or processing.
Report Date
October 25, 2023
Report Author(s)
Alfonso Medela
Table of contents
- Research Title
- Product Identification
- Promoter Identification and Contact
- CIP Identification
- Public Access Database
- Research Team
- Compliance Statement
- Report Date
- Report Author(s)
- Table of contents
- Abbreviations and Definitions
- Summary
- Introduction
- Materials and methods
- Results
- Discussion and Overall Conclusions
- Ethical Aspects of Clinical Research
- Investigators and Administrative Structure of Clinical Research
- Report Annexes
Abbreviations and Definitions
- CAD: Computer-Aided Diagnosis
- CIP: Clinical Investigation Plan
- CUS: Clinical Utility Questionnaire
- SUS: System Usability Scale
- GCP: Standards of Good Clinical Practice
- ICH: International Conference on Harmonisation
- PI: Principal Investigator
- DLQI: Dermatology Quality of Life Index
- AUC: Area Under the ROC Curve
Summary
This is an analytical prospective observational study of a series of clinical cases that aims to confirm whether the artificial intelligence algorithms of the device can effectively improve the process of referring patients from primary care to dermatology. The study involves four primary care centers: Centro de Salud Sodupe-Güeñes, Centro de Salud Balmaseda, Centro de Salud Buruaga, and Centro de Salud Zurbaran. These centers all send patients to Hospital Universitario Cruces. We plan to include up to 400 patients in this study.
Title
Pilot study for the clinical validation of an artificial intelligence algorithm to optimize the appropriateness of dermatology referrals.
Introduction
Skin-related diseases significantly burden primary care, and diagnoses by primary care physicians are often discordant with those of dermatologists. The shortage of dermatologists, particularly in smaller areas, amplifies this challenge, requiring primary care physicians to handle dermatologic assessments. Furthermore, self-reporting by patients introduces potential bias. To address these issues, Computer-Aided Diagnosis (CAD) systems, utilizing artificial intelligence, offer promising solutions for image interpretation and classification. This study aims to clinically validate a device for grading disease activity in patients, promising improved reliability and precision in skin disease assessment and referral criteria.
Skin-related conditions present a significant challenge within primary care settings, frequently resulting in inconsistent diagnoses when compared to the evaluations conducted by dermatologists. This discrepancy is further intensified by a notable scarcity of dermatology specialists, particularly in less populous regions. This lack of specialists compels primary care physicians to undertake dermatological evaluations, a field in which they may not have extensive expertise. Furthermore, the reliance on patient self-reporting during the diagnostic process can introduce a level of bias, potentially leading to inaccurate assessments.
Addressing these issues requires innovative solutions, and Computer-Aided Diagnosis (CAD) systems, powered by advanced artificial intelligence technologies, emerge as a promising option. These systems are adept at interpreting medical images and classifying various skin conditions with a high level of precision. The primary aim of this research is to perform a clinical validation of an AI-powered tool specifically developed for assessing the activity of dermatological diseases in patients. By enhancing the accuracy and consistency of skin disease evaluations, this tool has the potential to significantly improve the referral criteria, ensuring that patients are directed to specialist care when necessary. This, in turn, can contribute to closing the existing gap in dermatological care between primary care providers and specialized dermatology services, leading to better patient outcomes.
By doing so, it addresses the existing issues related to inadequate referrals and ensures a more streamlined and effective patient care pathway, optimizing the utilization of specialist resources and reducing unnecessary patient referrals to secondary care.
Objectives
Primary objective
- To validate that the device is a valid tool for improving the adequacy of referrals to dermatology.
Secondary objectives
- To validate that the device reduces costs in secondary care.
- To validate that the device reduces dermatology waiting lists.
- To validate that the device optimizes clinical flow in Osakidetza.
Population
Adult patients (≥ 18 years) with skin pathologies seen in the primary care service of health centers referring to Cruces and Basurto University Hospitals.
Design and Methods
Design
This is a prospective observational analytical study of a longitudinal clinical case series.
Number of Subjects
The specified number of subjects for this study is 400. However, at the time of this report, only 51 patients had been recruited from different health centers: Zurbaranbarri recruited 11 patients, Sodupe 2, Buruaga 18, and Balmaseda 20.
Initiation Date
April 7th, 2022.
Completion Date
Pending completion.
Duration
The duration of this study was estimated at 4 months, including 2 months for the recruitment time, 1 month for the specialist to review photos, and 1 month for data analysis.
However, the study will remain active until the specified number of patients has been recruited.
Methods
Each variable will be characterized using frequency distributions for qualitative variables and central tendency statistics such as mean and median and variability statistics such as standard deviation (S.D.) or interquartile range for quantitative variables according to their distributional characteristics.
Sensitivity, specificity, positive and negative predictive values (PPV and NPV) and likelihood ratios (LR+ and LR-) will be calculated by comparing both the results obtained using the device and those obtained with the referral criteria of primary care physicians with the criteria used by specialists, considered the gold standard.
Analyses will be performed using appropriate statistical software, SPSS version 23.0 and STATA 13.0. Values of p<0.05 will be considered significant.
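As a minimal sketch of the descriptive step described above, the following Python snippet computes the same kinds of summaries (mean, median, standard deviation, interquartile range, and frequency distributions). It is illustrative only: the plan specifies SPSS and STATA rather than Python, the helper names are hypothetical, and the toy values are invented for the example and are not study data.

```python
from collections import Counter
from statistics import mean, median, quantiles, stdev


def describe_quantitative(values):
    """Central tendency and variability for a quantitative variable."""
    # quantiles(n=4) returns the three quartile cut points (default "exclusive" method)
    q1, _, q3 = quantiles(values, n=4)
    return {
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation
        "iqr": q3 - q1,
    }


def describe_qualitative(values):
    """Absolute and relative frequency for each category."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, n / total) for category, n in counts.items()}


# Hypothetical toy data, NOT taken from the study
ages = [21, 34, 47, 58, 63, 72, 96]
sexes = ["F", "M", "M", "F", "M", "M", "M"]

age_summary = describe_quantitative(ages)
sex_summary = describe_qualitative(sexes)
print(age_summary)
print(sex_summary)
```

Whether to report the mean with the standard deviation or the median with the interquartile range depends on the distribution of each variable, as stated above.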
Results
We collected a total of 82 images from 51 patients using smartphones and dermatoscopes. Patients with pigmented lesions tend to have more images on average because they include both clinical and dermatoscopic pictures.
Applying the device to the collected images led to a high sensitivity (100%) and a specificity of 76% for the malignancy detection task.
Conclusions
Primary care doctors exhibit low sensitivity (around 25%) but high specificity (96%) in their diagnostic assessments of skin conditions, likely because they refer all cases to dermatologists, making their diagnoses non-conservative.
If primary care doctors aimed to either treat or refer cases, sensitivity would likely increase while specificity would decrease. Approximately 29% of referrals, including those from teledermatology, concern easily diagnosable conditions like seborrheic keratosis and skin tags, which can be confidently managed without specialist involvement. Image quality significantly impacts diagnostic accuracy, affecting both algorithms and teledermatology assessments. Implementing solutions like the device could optimize costs, reduce waiting times, and expedite urgent cases, assuming appointments were delayed due to waiting lists. The study focuses on malignancy but highlights the potential for the device to assess severity in chronic skin conditions and complex diseases.
Introduction
Skin-related diseases are a frequent reason for consultation in primary care [1]; some studies quantify it at approximately 5% of all consultations, mainly by the working population.
This represents a considerable consumption of resources and makes an efficient approach to these conditions a key step in optimizing the performance of primary care. Many studies show discrepancies between the diagnoses of primary care physicians and dermatologists, with agreement ranging from 57% [2] to 65.52% [3] depending on the study. In general, primary care physicians do not demonstrate adequate knowledge of skin diseases, their diagnosis and treatments [4].
This human limitation when evaluating skin diseases is also reflected in the effort and time required to estimate the degree of involvement of a patient or the stage of the pathology. So much so, that it ends up being a very unrewarding task and can lead to poor adherence to the protocol and inadequate referrals.
Time consumption is of particular concern given that the number of medical professionals, especially in dermatology, is not sufficient in relation to the existing demand. Access of the general population to a dermatology specialist is complicated by the low number of specialists (3 dermatologists per 100,000 inhabitants) [5], and it is even more difficult in small population centers. Because of this, screening for dermatologic lesions falls to the primary care physician, whose diagnostic capacity is lower, which can increase the risk of misdiagnosis.
In this regard, the literature shows a discordance of 55% to 65% between the primary care physician and the specialist [6], and studies confirm a number of expected features: common dermatological diseases are often unrecognized or misdiagnosed by non-dermatologists, due to the particular profiles of common diagnoses in this activity (drug-induced rash, fungal infections) [7].
In addition to these inherent limitations, in cases where the preliminary examination is performed by the patient, the possibility of bias is added. This is especially true in cases where the patient knows that the treatment he or she receives will be determined by the information he or she provides. In addition, the medical team lacks the means to ensure that the values reported by the patient are true, which precludes external verification.
Fortunately, in recent years there has been an increasing demand to develop Computer-Aided Diagnosis (CAD) systems and other systems that facilitate the detection of different pathologies through algorithms. CAD systems are an interdisciplinary technology that combines artificial intelligence and digital image processing. Image processing based on complex pattern recognition systems makes it possible for the physician to interpret the information contained in the medical image with much less difficulty. Advances in image recognition and artificial intelligence have led to innovations in the diagnosis of all types of pathologies. It has been demonstrated that artificial intelligence (AI) algorithms can classify photographs of lesions with a level of competence comparable to that of a medical expert [8, 9].
Therefore, the use of artificial vision applications when gathering information about the patient's condition presents a huge advance that not only brings reliability to the documentation process, but also allows greater precision when measuring visual signs of the pathology, and consequently, informs the criteria for referral to the specialist.
Consequently, this study aims to clinically validate a novel artificial intelligence device for activity grading in affected patients.
References
- [1] Ramsay, D. L., & Weary, P. E. (1996). Primary care in dermatology: whose role should it be? Journal of the American Academy of Dermatology, 35(6), 1005-1008.
- [2] Lowell, B. A., Froelich, C. W., Federman, D. G., & Kirsner, R. S. (2001). Dermatology in primary care: prevalence and patient disposition. Journal of the American Academy of Dermatology, 45(2), 250-255.
- [3] Porta, N., San Juan, J., Grasa, M. P., Simal, E., Ara, M., & Querol, I. (2008). Diagnostic agreement between primary care physicians and dermatologists in the health area of a referral hospital. Actas Dermo-Sifiliográficas (English Edition), 99(3), 207-212.
- [4] Bahelah, S. O., Bahelah, R., Bahelah, M., & Albatineh, A. N. (2015). Primary care physicians' knowledge and self-perception of competency in dermatology: An evaluation study from Yemen. Cogent Medicine, 2(1), 1119948.
- [5] Barber Pérez, P., & González López-Valcárcel, B. (2019). "Estimación de la oferta y demanda de médicos especialistas. España 2018-2030", p. 168. https://www.mscbs.gob.es/profesionales/formacion/necesidadEspecialistas/doc/20182030EstimacionOfertaDemandaMedicosEspecialistasV2.pdf
- [6] Porta, N., San Juan, J., Grasa, M. P., Simal, E., Ara, M., & Querol, I. (2008). Diagnostic agreement between primary care physicians and dermatologists in the health area of a referral hospital. Actas Dermo-Sifiliográficas (English Edition), 99(3), 207-212.
- [7] Maza, A., Berbis, J., Gaudy-Marqueste, C., Morand, J. J., Berbis, P., Grob, J. J., & Richard, M. A. (2009, February). Evaluation of dermatology consultations in a prospective multicenter study involving a French teaching hospital. Annales de Dermatologie et de Venereologie, 136(3), 241-248.
- [8] Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., & Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115-118.
- [9] Haenssle, H. A., Fink, C., Schneiderbauer, R., Toberer, F., Buhl, T., Blum, A., ... & Zalaudek, I. (2018). Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Annals of Oncology, 29(8), 1836-1842.
Materials and methods
Product Description
This section contains a short summary of the device. A complete description of the intended purpose, including device description, can be found in the record Legit.Health Plus description and specifications.
Product description
The device is a computational software-only medical device leveraging computer vision algorithms to process images of the epidermis, the dermis and its appendages, among other skin structures. Its principal function is to provide a wide range of clinical data from the analyzed images to assist healthcare practitioners in their clinical evaluations and allow healthcare provider organisations to gather data and improve their workflows.
The generated data is intended to aid healthcare practitioners and organizations in their clinical decision-making process, thus enhancing the efficiency and accuracy of care delivery.
The device should never be used to confirm a clinical diagnosis. On the contrary, its result is one element of the overall clinical assessment. Indeed, the device is designed to be used when a healthcare practitioner chooses to obtain additional information to consider a decision.
Intended purpose
The device is a computational software-only medical device intended to support health care providers in the assessment of skin structures, enhancing efficiency and accuracy of care delivery, by providing:
- quantification of intensity, count, extent of visible clinical signs
- interpretative distribution representation of possible International Classification of Diseases (ICD) classes.
Intended previous uses
No specific intended use was designated in prior stages of development.
Product changes during clinical research
The device maintained a consistent performance and features throughout the entire clinical research process. No alterations or modifications were made during this period.
Clinical Research Plan
This study includes a series of clinical cases that aim to confirm whether the device can effectively improve the process of referring patients from primary care to dermatology. Moreover, this study also includes an analysis of the cost reduction when using the device, as well as an analysis of the waiting lists reduction and clinical flow optimization in Osakidetza. The study involves four primary care centers: Centro de Salud Sodupe-Güeñes, Centro de Salud Balmaseda, Centro de Salud Buruaga, and Centro de Salud Zurbaran. These centers all send patients to Hospital Universitario Cruces.
Objectives
The primary objective of the present study is to validate that the device is a valid tool for optimizing the appropriateness of dermatology referrals.
Additionally, this study aims to validate that the device helps to reduce costs in secondary care, reduce dermatology waiting lists, and optimize clinical flows.
Design (type of research, assessment criteria, methods, active group, and control group)
This is a prospective observational analytical study of a longitudinal clinical case series. The study does not involve an active or control group, as it is focused on the evaluation of the device in a real-world clinical setting. The assessment criteria are based on photographs submitted through the Legit.Health platform, and the study relies on the analysis of those photographs.
Ethical considerations
The conduct of this study adhered to international Good Clinical Practice (GCP) guidelines, the Declaration of Helsinki in its latest amendment, and applicable international and national regulations. Approval from the relevant Ethics Committee was obtained prior to the initiation of the study. Any modifications to the protocol were reviewed and approved by the Principal Investigator (PI) and subsequently evaluated by the Ethics Committee before subjects were enrolled under a modified protocol.
This study was conducted in compliance with European Regulation 2016/679, of 27 April, concerning the protection of natural persons with regard to the processing of personal data and the free movement of such data (General Data Protection Regulation, GDPR), and Organic Law 3/2018, of 5 December, on the Protection of Personal Data and the guarantee of digital rights. In accordance with these regulations, no data enabling the personal identification of participants was collected, and all information was managed securely in an encrypted format.
Patients were informed both orally and in writing about all relevant aspects of the study, with the information being tailored to their level of understanding. They were provided with a copy of the informed consent form and the accompanying patient information sheet. Adequate time was given to patients to ask questions and fully comprehend the details of the study before providing their consent.
The Principal Investigator was responsible for the preparation of the informed consent form, ensuring it included all elements required by the International Conference on Harmonisation (ICH), adhered to current regulatory guidelines, and complied with the ethical principles of GCP and the Declaration of Helsinki.
The original signed informed consent forms were securely stored in a restricted access area under the custody of the PI. These documents remained at the research site at all times. Patients were provided with a copy of their signed consent form for their records.
Data Quality Assurance
The Principal Investigator is responsible for reviewing and approving the protocol, signing the Principal Investigator commitment, guaranteeing that the persons involved in the centre will respect the confidentiality of patient information and protect personal data, and reviewing and approving the final study report together with the sponsor. All the clinical members of the research team assess the eligibility of the patients in the study, inform patients and request their written informed consent, collect the source data of the study in the clinical record, and transfer them to the Data Collection Notebook (DCN) or Case Report Forms (CRF).
Subject Population (inclusion/exclusion criteria and sample size)
The study enrolled patients who fulfilled the following criteria:
Inclusion Criteria
- Patients with skin pathologies.
- Patients aged 18 years or older.
- Patients who have signed the informed consent for the study.
Exclusion Criteria
- Patients who, in the investigator's judgment, would not comply with the study procedures.
As a result, we recruited 51 patients from different health centers: Zurbaranbarri recruited 11 patients, Sodupe 2, Buruaga 18, and Balmaseda 20.
Among all the 51 patients, we collected a total of 82 images using smartphones and dermatoscopes. Overall, patients with pigmented lesions tend to have more images on average because they include both clinical and dermatoscopic pictures.
Treatment
Patients participating in this study did not receive any specific treatment as part of the research protocol.
Concomitant Medication/Treatment
Patients continued their regular prescribed medications and treatments as directed by their primary healthcare providers. No additional medications or treatments were administered as part of this study.
Statistical Analysis
The study includes 51 patients. Their mean age is 54 years, with a median of 58 and a standard deviation of 23. The youngest patient is 21 and the oldest is 96. Of the 51 patients recruited, 18 are female and the remaining 33 are male.
These patients, as diagnosed by the dermatologist, present different pathologies, of which only 4 are malignant (counting only the images with a quality above 50%). The most common are the different kinds of keratosis, nevus, pigmented lesions, and eczemas:
Class | Count |
---|---|
seborrheic keratosis | 8 |
actinic keratosis | 3 |
nummular eczema | 2 |
pigmented lesions | 2 |
nevus | 2 |
actinic keratoses | 2 |
mild androgenic alopecia | 1 |
basal cell carcinoma | 1 |
basal cell carcinoma of the upper lip | 1 |
basal cell carcinoma of the skin of the scalp and neck | 1 |
traumatized dermatofibroma or keratoacanthoma | 1 |
eczema | 1 |
erosion | 1 |
pendulous fibroma | 1 |
folliculitis | 1 |
herpes zoster in 2nd trigeminal branch | 1 |
focal hyperkeratosis | 1 |
pigmented lesion | 1 |
violaceous and desquamative lesions | 1 |
tumor-like lesion/keratoacanthoma vs. epidermal cyst | 1 |
inflammatory lesion vs bowen's disease | 1 |
pigmented papular lesion in vertex | 1 |
lichen simplex chronicus | 1 |
amelanotic melanoma | 1 |
melanoma in situ of lower extremity | 1 |
intradermal nevus | 1 |
melanocytic nevus without atypia | 1 |
nevocellular nevus | 1 |
small pyogenic granuloma | 1 |
porokeratosis (cornoid lamella) | 1 |
erythematous papules and plaques | 1 |
actinic vs. tumoral cheilitis | 1 |
actinic keratosesrecommend | 1 |
seborrheic keratosis with lichenoid reaction | 1 |
inflamed seborrheic keratosis | 1 |
irritated seborrheic keratosis | 1 |
flat seborrheic keratoses | 1 |
trichilemmal cyst | 1 |
The overall image quality, without any cropping, is at a high level of 73%, which is excellent. However, the majority of images are not focused on the lesion and are relatively small. When we crop the images to focus on the area of interest, which is crucial for diagnosis, the average quality decreases to 65%. Moreover, 21 images fall below the 50% threshold required for a thorough analysis.
Quality could be improved if the original images were larger, since most of them are smaller than 500x500 pixels.
Regarding the diagnostic concordance of general practitioners, they frequently provide initial diagnoses that do not follow a standard like the ICD. Instead, they use an open-text format to express their opinions. For instance, here are some examples of these diagnoses: "ca.basocelular drch", "granuloma/ca.basocelular", "ID basocelular", "pie izq" (which is actually an anatomical location, the left foot).
This diversity in the way diagnoses are recorded makes automated analysis challenging and often requires manual interpretation instead. The most common diagnoses are "nevus" (10 cases) and "keratosis" (11 cases).
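To illustrate why free-text diagnoses resist automation, the sketch below shows one naive way such entries could be mapped to categories by keyword. This is a hypothetical example, not the procedure used in the study: the keyword map, the category names, and the function name are all assumptions, and only the input strings come from the examples quoted above.

```python
# Hypothetical keyword-to-category map (an assumption for illustration only)
KEYWORD_TO_CATEGORY = {
    "basocelular": "basal cell carcinoma",
    "granuloma": "granuloma",
    "nevus": "nevus",
    "queratosis": "keratosis",
    "keratosis": "keratosis",
}


def normalize_diagnosis(free_text):
    """Return every category whose keyword appears in the free-text entry.

    An empty list means no keyword matched and the entry needs manual review.
    """
    text = free_text.lower()
    matches = []
    for keyword, category in KEYWORD_TO_CATEGORY.items():
        if keyword in text and category not in matches:
            matches.append(category)
    return matches


for entry in ["ca.basocelular drch", "granuloma/ca.basocelular", "ID basocelular", "pie izq"]:
    print(entry, "->", normalize_diagnosis(entry))
```

Even in this small sample, one entry yields two candidate categories and another ("pie izq") yields none, which is consistent with the need for manual interpretation.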
Regarding the diagnostic concordance of dermatologists, they frequently use a relatively standardized approach for diagnoses, although it does not adhere strictly to the ICD standard. Here are examples of dermatologist diagnoses: Seborrheic keratosis, Melanocytic nevi without atypia, Basal cell carcinoma of the scalp and neck, Porokeratosis (cornoid lamella).
Regrettably, five patients are still awaiting a diagnosis from the dermatologist. The most frequently recurring diagnoses are seborrheic keratosis, with at least 10 cases, and nevi.
Results
Initiation and Completion Date
The study started on April 7th, 2022, and had included 51 subjects at the time of writing this report. It will conclude once the target number of subjects (400) is reached.
Subject and Investigational Product Management
This study includes 51 patients from different health centers: Zurbaranbarri recruited 11 patients, Sodupe 2, Buruaga 18, and Balmaseda 20. However, this study is not yet finished, so it is planned to include up to 400 patients from different health centers.
The investigational products were stored and handled following strict protocols. This included proper storage conditions, handling procedures, and documentation of product usage. The accountability and traceability of investigational products were rigorously maintained throughout the study.
Subject Demographics
All participants in this study were from Spain and Caucasian.
Clinical Investigation Plan (CIP) Compliance
The study adhered to all aspects outlined in the CIP. This ensured that the research was conducted in accordance with established protocols, procedures, and ethical standards. Any deviations from the CIP were duly documented and appropriately addressed. The compliance with the CIP was rigorously monitored throughout the duration of the study to uphold the integrity and validity of the research findings.
Analysis
Malignancy detection
To assess the quality of referrals, we start by evaluating the algorithm's ability to identify malignant cases. We use dermatologist diagnoses as the benchmark. Initially, we exclude poor-quality images and those without a dermatologist diagnosis. After this selection, we have 54 benign and 4 malignant cases (based on images, not patients).
In this analysis, we observe a 100% sensitivity in detecting malignancy, a negative predictive value of 1 and an LR- of 0, indicating that all malignant cases are correctly identified. However, the specificity stands at 76%, showing that some cases are flagged as malignant when they are actually benign. On the other hand, the primary care HCPs do not properly identify malignant cases, showing a sensitivity of 25%, a negative predictive value of 0.95, and an LR- of 0.78. However, since they mostly classify the lesions as non-malignant, they have a high specificity of 96%.
Evaluator | Sensitivity | Specificity | PPV | NPV | LR+ | LR- |
---|---|---|---|---|---|---|
Primary care HCPs | 25% | 96% | 0.33 | 0.95 | 6.75 | 0.78 |
The device | 100% | 76% | 0.24 | 1 | 4.15 | 0 |
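The figures in the table can be reproduced from a 2x2 confusion matrix. The sketch below assumes per-image counts that are not stated in the report: they were back-calculated from the reported percentages for 4 malignant and 54 benign images (primary care: 1 true positive and 2 false positives; the device: 4 true positives and 13 false positives). The function name is illustrative.

```python
import math


def diagnostic_metrics(tp, fn, fp, tn):
    """Return standard accuracy metrics from 2x2 confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (fn + tp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # LR+ is treated as infinite when there are no false positives
        "lr_pos": sensitivity / false_positive_rate if false_positive_rate else math.inf,
        "lr_neg": false_negative_rate / specificity,
    }


# Counts are an ASSUMPTION reconstructed from the reported percentages
primary_care = diagnostic_metrics(tp=1, fn=3, fp=2, tn=52)
device = diagnostic_metrics(tp=4, fn=0, fp=13, tn=41)

for name, metrics in (("Primary care HCPs", primary_care), ("The device", device)):
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```

With only 4 malignant images in the sample, each individual case shifts sensitivity by 25 percentage points, so these estimates should be read with that sample size in mind.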
Physical consultation
Dermatologists have two options when dealing with teledermatology cases: they can either review the cases remotely or schedule an in-person consultation. In our findings, dermatologists opt for in-person consultations for 71% of the patients, while they address and resolve 29% of teledermatology cases directly. This aligns well with existing literature, which suggests that around 30% of primary care referrals are considered "banal" and could potentially be managed in primary care.
Looking specifically at the 29% of cases that are addressed directly, we observe that 53% of them concern seborrheic keratosis, with the remainder involving various conditions like eczema, alopecia, skin tags, actinic keratosis, erosion, intradermal nevi, herpes and folliculitis.
Regarding patients who attended in-person consultations, we noticed that some cases involved seborrheic keratosis, possibly because these lesions looked more concerning. In hindsight, these cases could have been managed remotely. Additionally, we observed a significant number of nevi, pigmented lesions and actinic keratosis, which could potentially be confused with basal cell carcinoma.
In summary, we have identified two scenarios where time and resources could be saved:
- Teledermatology cases that could have been managed without in-person consultations, resulting in time savings.
- In-person consultations that might have been unnecessary, offering potential time savings of 10 to 15 minutes per patient.
Referral adequacy
In addition to deciding whether to refer a case, general practitioners also provided their opinion on whether they would refer it by answering a simple "yes" or "no" question. However, this information is available for only 54% of the cases. With this data, we can conduct a similar analysis to what we discussed earlier.
Firstly, we observe that general practitioners believe that approximately 86% of cases should be referred to a dermatologist. Given that dermatologists do not schedule in-person visits for 29% of teleconsultations, and that primary care doctors believe they could manage 14% of cases themselves, we estimate that around 15% of the cases that primary care doctors would refer (29% minus 14%) could potentially be avoided through the use of tools like Legit.Health. To examine this further, we reviewed the number of cases in which the primary care doctor indicated that a referral was needed but the dermatologist did not schedule the patient for an in-person visit. This percentage is 30%.
Now, let's shift our focus to the algorithm's output in relation to the referral recommendations made by primary care doctors and the final outcomes. We will examine cases in which doctors believed a referral was necessary, and then we will look at cases where doctors believed no referral was needed.
For the cases where doctors thought a referral was required, we find that the algorithm assigned low malignancy values to 57% of these cases. It is important to note that this percentage is calculated based on acceptable-quality images, as some images had less than 50% quality. In this analysis, we considered 23 cases. Among those cases where patients ultimately did not attend an in-person consultation (7 cases), the algorithm correctly identified 43% of them as low-risk cases. This accounts for 13% of the total cases.
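The percentages in this paragraph can be cross-checked against the case counts. The following is a minimal sketch, assuming that 43% of the 7 cases corresponds to 3 cases (the report states only the percentage); the variable names are illustrative.

```python
# Counts stated in the report.
referred_with_good_images = 23  # cases the doctor would refer, acceptable image quality
not_seen_in_person = 7          # of those, cases without an in-person consultation

# Assumption: 43% of 7 cases corresponds to 3 cases flagged as low risk.
low_risk_and_not_seen = round(0.43 * not_seen_in_person)

share_of_subgroup = low_risk_and_not_seen / not_seen_in_person
share_of_total = low_risk_and_not_seen / referred_with_good_images

print(f"{low_risk_and_not_seen} cases: "
      f"{share_of_subgroup:.0%} of the subgroup, {share_of_total:.0%} of all 23 cases")
```

Under this assumption, the two reported figures (43% and 13%) are consistent with each other.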
Regarding the cases that did lead to in-person consultations, 3 cases did not receive a final diagnosis, 3 were inflammatory lesions correctly identified as such with low malignancy risk, 4 were non-malignant lesions correctly identified with a medium-risk malignancy level, and 6 were nevi correctly identified with a low risk of malignancy.
In cases where doctors believed that no referral was needed, all of these cases had very low malignancy values, confirming that they were not related to malignancy. However, one patient ended up having an in-person consultation with a dermatologist, and the diagnosis was "Violaceous and desquamative lesions," which the device had detected as psoriasis.
Performance on low-quality images
Among the images, a total of 43 fall below the 50% quality threshold. Three of these cases are malignant, and the device correctly identifies all three. When we apply the same analysis as before, we find that the sensitivity is 100%, while the specificity is approximately 63%. This represents a decrease of about 13% in specificity compared to higher-quality images.
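As an illustration of how these two metrics are obtained, the sketch below computes sensitivity and specificity from confusion-matrix counts. The true-positive and false-negative counts come from the text (3 malignant cases, all detected). The split of the 40 non-malignant images into 25 true negatives and 15 false positives is an assumption inferred from the rounded 63% figure, not a value reported in the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of malignant cases that are flagged as malignant."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-malignant cases that are flagged as non-malignant."""
    return tn / (tn + fp)


# Reported: 43 low-quality images, 3 malignant, all 3 detected.
tp, fn = 3, 0
# Assumption: 25 of the 40 non-malignant images classified correctly (about 63%).
tn, fp = 25, 15

print(f"Sensitivity: {sensitivity(tp, fn):.0%}")
print(f"Specificity: {specificity(tn, fp):.1%}")
```

With only three malignant cases, the sensitivity estimate rests on very few observations and should be read with that in mind.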
In terms of in-person consultations, around 39% of cases are resolved through teledermatology, with the majority of cases involving seborrheic keratosis.
When it comes to the decision of whether to refer patients, primary care doctors believe that 84% of cases should be referred to the dermatologist, which is a percentage similar to our previous findings. This consistency makes sense as the referral decision does not depend on the photo's quality.
Examining the cases where primary care doctors recommended a referral, and the patient was not scheduled for an in-person dermatologist visit, we find that this percentage is 46%. This aligns with our previous calculation showing an increase in the percentage of cases resolved through teledermatology.
Economic impact
One of the secondary variables of the study is cost reduction, calculated as the cost of a dermatological consultation multiplied by the number of consultations that the algorithm would have avoided.
Based on the analysis above, we have found that:
- Around 30% of the consultations initially deemed necessary for referral did not result in an in-person dermatologist consultation.
- Approximately 56% of the consultations that appeared to require an in-person visit turned out to be benign or cases that could have been managed in primary care. It is a bit more challenging to estimate this second category as it includes atypical nevi and inflammatory lesions. Medium-high risk lesions detected by the algorithm are not considered avoidable, nor are cases without a final diagnosis.
To validate the consistency of these findings, we can look at the discarded cases due to low image quality, where the percentages are 50% and 20%. These percentages are based on limited data but seem to align with the overall trends. Consequently, it is possible to avoid up to 30% of teleconsultations and 20% of in-person dermatologist consultations. Among these, 20% of the in-person consultations could either be managed in primary care or become teleconsultations, reducing the time dedicated to handling these specific cases.
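The cost-reduction variable defined in this section can be sketched as follows. This is a minimal illustration only: the cost per consultation and the number of in-person visits are hypothetical placeholders, since neither figure is reported in this study. The 20% avoidable share is the estimate given above.

```python
def estimated_cost_reduction(cost_per_consultation: float,
                             total_consultations: int,
                             avoidable_fraction: float) -> float:
    """Cost of one consultation multiplied by the number of avoided consultations."""
    avoided_consultations = round(total_consultations * avoidable_fraction)
    return cost_per_consultation * avoided_consultations


# Hypothetical inputs: neither value comes from the study.
HYPOTHETICAL_COST_EUR = 50.0
HYPOTHETICAL_IN_PERSON_VISITS = 100

# 20% is the avoidable share of in-person consultations estimated in this section.
saving = estimated_cost_reduction(HYPOTHETICAL_COST_EUR,
                                  HYPOTHETICAL_IN_PERSON_VISITS,
                                  0.20)
print(f"Estimated saving: {saving:.2f} EUR")
```

Replacing the placeholders with the actual consultation cost and visit volume of a given health service would yield the corresponding estimate for that setting.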
Waiting list impact
The average waiting time for dermatologist consultations was 10 days, notably shorter than what is typical in other regions or hospitals. The shortest wait was just 1 day, while the longest stretched to 61 days. The device has the potential to significantly expedite the diagnosis and care of patients with skin malignancies that may have otherwise been overlooked or deemed less urgent in primary care.
For instance, consider the case of Patient 23, initially diagnosed with a nevus with irregular borders, but seen by a specialist a month later. This patient was later diagnosed with amelanotic melanoma, a rare and aggressive form of melanoma. The algorithm flagged a moderate to high level of malignancy in one of the clinical images, suggesting an urgent referral to the dermatologist was warranted.
Patient 30 waited for 41 days with a possible melanoma diagnosis, which the algorithm had identified as a priority.
Another example is Patient 45, who waited 37 days with a diagnosis of granuloma or a possible basal cell carcinoma, which remains unconfirmed. However, the algorithm detected a very high malignancy level with a high degree of confidence regarding the presence of a basal cell carcinoma.
In total, there were three patients out of 51, representing 6%, who had to wait longer than a month with potential skin cancer concerns, including an aggressive melanoma. These patients could have received treatment within just a few days at most.
Looking at it from another perspective, all the patients who could have avoided in-person consultations would have freed up the specialist's time, thereby reducing the waiting list by approximately 56%. This translates to roughly halving the waiting time, from 10 days to about 5 days, all while ensuring that high-risk patients receive timely care.
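The figures in this section can be reproduced with simple arithmetic. The sketch below assumes, as a simplification, that the average waiting time shrinks in proportion to the share of in-person consultations avoided; this proportionality is an assumption of the illustration, not a measured effect.

```python
# Figures stated in the report.
patients_total = 51
patients_waiting_over_a_month = 3
mean_wait_days = 10
avoidable_share = 0.56  # share of in-person consultations considered avoidable

share_long_wait = patients_waiting_over_a_month / patients_total

# Assumption: waiting time scales linearly with the number of consultations.
projected_wait_days = mean_wait_days * (1 - avoidable_share)

print(f"Patients waiting over a month: {share_long_wait:.0%}")
print(f"Projected average wait: {projected_wait_days:.1f} days")
```

Under this linear assumption the projected average wait is about 4.4 days, which is in line with the approximate halving described above.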
Adverse Events and Adverse Reactions to the Product
Throughout the study, no adverse events or adverse reactions related to the investigational product have been observed. Participants have not experienced any negative reactions or side effects associated with the use of the product. This indicates a favorable safety profile of the investigational product in the context of this study.
Product Deficiencies
No deficiencies in the product have been observed during the course of this study. As a result, no corrective actions have been deemed necessary. The product has demonstrated consistent performance in accordance with the study's objectives.
Subgroup Analysis for Special Populations
In the context of the analyzed pathologies, no special population subgroups were identified for this study. The research primarily focused on the specified patient population without subgroup differentiation.
Accounting for All Subjects
A total of 400 patients were initially considered for inclusion in this study. However, since the study has not yet concluded, only 51 individuals who met the specified eligibility criteria were included.
Discussion and Overall Conclusions
Clinical Performance, Efficacy, and Safety
The primary care doctors exhibit a notably low sensitivity of approximately 25% when it comes to the crucial task of deciding whether to refer a patient to secondary care, particularly to dermatologists.
On the other hand, they maintain a high specificity of 96%, meaning that they correctly identify the large majority of cases that do not require specialist care.
This pattern reflects a cautious approach, as primary care physicians seem to prefer minimizing the risk of false positives, even if it means that some patients who could benefit from secondary care might not be promptly referred. This cautiousness impedes an optimal utilization of specialist resources.
This study reveals that approximately 29% of the referrals, including those made through teledermatology, involve common and easily diagnosable conditions, about half of them seborrheic keratosis. Skin tags are another example of a condition that can be confidently identified and managed without referral: the device can reliably confirm them, and they are unlikely to be mistaken for other entities.
The quality of the images significantly influences the performance of the system. This is a well-established fact in the field because image quality not only impacts the effectiveness of algorithms but also hinders dermatologists from making diagnoses through teledermatology systems. Specifically, poor-quality images of nevi often require an in-person consultation, causing unnecessary delays for specialists.
It is typically complex to calculate precise costs, but we can estimate that algorithms like the device could have a substantial impact on cost optimization while simultaneously reducing waiting times and expediting urgent cases.
In terms of the waiting list, the analysis assumes that patients could have received treatment earlier, and the appointment delays were a result of the hospital's waiting list.
The current analysis primarily focuses on malignancy, but there may be other criteria to consider when referring patients. While not addressed in this study, the device incorporates additional algorithms that focus on the severity of chronic skin conditions. Moreover, in certain cases, referrals can be based on the possibility of a specific disease that may be complex to manage.
Clinical Risks and Benefits
Participants in this study did not undergo any procedures that pose a risk to their safety. Using the device helps the primary care physician optimize the diagnosis and the subsequent referrals, saving cost and time and providing better treatment to the patient.
Clinical Relevance
The device represents a significant advancement in the field of dermatology. It utilizes pioneering machine vision techniques and deep learning algorithms to provide a detailed and objective follow-up in the skin evaluation process [1, 2, 3, 4]. This approach is aligned with the growing body of research emphasizing the integration of artificial intelligence and machine learning in dermatological diagnostics [5, 6].
Recent studies have demonstrated the potential of machine learning algorithms in accurately diagnosing a wide range of dermatological pathologies, including acne, nevi, basal cell carcinoma, and psoriasis [7, 8]. Moreover, the device's capacity for remote monitoring of chronic dermatologic pathologies addresses a critical need in modern healthcare, particularly in the context of telemedicine [9].
The device's emphasis on patient satisfaction and reduced consultation time aligns with the broader trend in healthcare towards patient-centric and efficient care delivery [10, 11]. Additionally, the absence of adverse events or reactions observed in this study underscores the favorable safety profile of the device, in line with current standards for medical device safety [12].
Compared with other tools, the device distinguishes itself by providing a comprehensive solution that combines diagnostic support with effective pathology tracking. While some existing tools focus primarily on diagnostic accuracy, the device's dual functionality enhances its clinical utility and potential impact on patient care [13, 14].
In summary, the device emerges as a cutting-edge solution in dermatological diagnostics and telemedicine support. Its integration of machine learning algorithms, patient-centered approach, and favorable safety profile position it at the forefront of advancements in dermatology technology.
References
- Mac Carthy, Taig, Ignacio Hernández-Montilla, Andy Aguilar, Rubén García-Castro, Ana María González-Pérez, Alejandro Vilas-Sueiro, Laura Vergara-de-la-Campa, Fernando Alfageme-Roldán, and Alfonso Medela. "Automatic Urticaria Activity Score (AUAS): Deep Learning-based Automatic Hive Counting for Urticaria Severity Assessment." JID Innovations (2023): 100218.
- Hernández Montilla, Ignacio, Alfonso Medela, Taig Mac Carthy, Andy Aguilar, Pedro Gómez Tejerina, Alejandro Vilas Sueiro, Ana María González Pérez et al. "Automatic International Hidradenitis Suppurativa Severity Score System (AIHS4): A novel tool to assess the severity of hidradenitis suppurativa using artificial intelligence." Skin Research and Technology 29, no. 6 (2023): e13357.
- Montilla, Ignacio Hernández, Taig Mac Carthy, Andy Aguilar, and Alfonso Medela. "Dermatology Image Quality Assessment (DIQA): Artificial intelligence to ensure the clinical utility of images for remote consultations and clinical trials." Journal of the American Academy of Dermatology 88, no. 4 (2023): 927-928.
- Medela, Alfonso, Taig Mac Carthy, S. Andy Aguilar Robles, Carlos M. Chiesa-Estomba, and Ramon Grimalt. "Automatic SCOring of atopic dermatitis using deep learning: a pilot study." JID Innovations 2, no. 3 (2022): 100107.
- Esteva, A. et al. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115-118.
- Haenssle, H. A., Fink, C., Schneiderbauer, R., et al.; Reader study level-I and level-II Groups (2018). Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Annals of Oncology, 29(8), 1836-1842.
- Han, S. S. et al. (2018). Classification of the clinical images for benign and malignant cutaneous tumors using a deep learning algorithm. Journal of Investigative Dermatology, 138(7), 1529-1538.
- Liu, J. et al. (2019). A review of machine learning in obesity. Obesity Reviews, 20(11), 1497-1508.
- Portney, L. G., & Watkins, M. P. (2015). Foundations of clinical research: Applications to practice. Pearson.
- Epstein, R. M., & Street Jr, R. L. (2011). The values and value of patient-centered care. Annals of Family Medicine, 9(2), 100-103.
- Hudis, C. A. (2013). Ensuring quality in oncology care: A renewed commitment to oncology practice and the patients we serve. Journal of Oncology Practice, 9(1), 1-2.
- International Organization for Standardization (ISO). ISO 14971:2019. Medical devices—Application of risk management to medical devices.
- Smith, A. C., Thomas, E., Snoswell, C. L., Haydon, H., Mehrotra, A., Clemensen, J., & Caffery, L. J. (2020). Telehealth for global emergencies: Implications for coronavirus disease 2019 (COVID-19). Journal of Telemedicine and Telecare, 26(5), 309-313.
- Nittas, V., Lunner, T., & Ebling, S. (2020). An empirical study on the acceptance of automated classification systems in dermatopathology. PloS One, 15(10), e0240973.
Specific Benefit or Special Precaution
Benefits:
- The device allows the diagnosis of a large set of skin lesions automatically from digital images.
- Automated diagnosis provides quick feedback to the medical practitioner, easing and speeding up their practice.
- Diagnostic insights help optimize referrals and teledermatology, reducing waiting lists and the associated costs, and improving the treatment and experience of the patient.
- The device can also evaluate the severity of different diseases, which can assist in monitoring the progression of the disease and the effectiveness of treatment, as well as saving the medical practitioner time.
Precautions:
- The device must be used as a clinical support and not to replace the expertise of the medical practitioner.
- The device can only analyze visible lesions and provide insight into a closed set of skin lesions. Skin lesions not learnt by the device cannot be diagnosed.
- Low-quality images can lead to a poor diagnosis. To ensure image quality and provide feedback on the usefulness of each image, the device incorporates the DIQA algorithm (Hernández Montilla et al., 2023).
Reference: Hernández Montilla, Ignacio, Taig Mac Carthy, Andy Aguilar, and Alfonso Medela. "Dermatology Image Quality Assessment (DIQA): Artificial intelligence to ensure the clinical utility of images for remote consultations and clinical trials." Journal of the American Academy of Dermatology 88, no. 4 (2023): 927-928.
Implications for Future Research
The study's positive outcomes offer multiple avenues for future research. First, adding more data to the training of the deep learning algorithms should improve their performance and allow these models to be extended to new ICD categories or lesion assessment tasks. Additionally, conducting long-term studies to assess the device's impact on patient outcomes, including treatment adherence and quality of life, will offer a comprehensive understanding of its clinical implications.
Limitations of Clinical Research
The main limitation of machine learning lies in the quantity and quality of the images collected. Variability in illumination, color, shape, size and focus are determining factors, in addition to the number of images per patient. This means that large variability within the same patient, combined with an insufficient number of images to reflect that variability, may result in lower accuracy.
Ethical Aspects of Clinical Research
The conduct of this study adhered to international Good Clinical Practice (GCP) guidelines, the Declaration of Helsinki in its latest amendment, and applicable international and national regulations. Approval from the relevant Ethics Committee was obtained prior to the initiation of the study. Any modifications to the protocol were reviewed and approved by the Principal Investigator (PI) and subsequently evaluated by the Ethics Committee before subjects were enrolled under a modified protocol.
This study was conducted in compliance with European Regulation 2016/679, of 27 April, concerning the protection of natural persons with regard to the processing of personal data and the free movement of such data (General Data Protection Regulation, GDPR), and Organic Law 3/2018, of 5 December, on the Protection of Personal Data and the guarantee of digital rights. In accordance with these regulations, no data enabling the personal identification of participants was collected, and all information was managed securely in an encrypted format.
Patients were informed both orally and in writing about all relevant aspects of the study, with the information being tailored to their level of understanding. They were provided with a copy of the informed consent form and the accompanying patient information sheet. Adequate time was given to patients to ask questions and fully comprehend the details of the study before providing their consent.
The Principal Investigator was responsible for the preparation of the informed consent form, ensuring it included all elements required by the International Conference on Harmonisation (ICH), adhered to current regulatory guidelines, and complied with the ethical principles of GCP and the Declaration of Helsinki.
The original signed informed consent forms were securely stored in a restricted access area under the custody of the PI. These documents remained at the research site at all times. Patients were provided with a copy of their signed consent form for their records.
Investigators and Administrative Structure of Clinical Research
Brief Description
The clinical investigation team comprises highly esteemed dermatologists. Dr. Jesús Gardeazabal García and Dr. Rosa María Izu Belloso serve as the Principal Investigators and are affiliated with Osakidetza - Servicio Vasco de Salud.
Completing the team, Alfonso Medela represents AI Labs Group S.L., bringing a crucial perspective and expertise in artificial intelligence to the clinical investigation, together with Taig Mac Carthy.
This diverse and skilled team ensures a comprehensive approach to the clinical evaluation of the device, aiming to validate its safety, effectiveness, and performance in a real-world dermatological setting that includes the Sodupe-Güeñes, Balmaseda, Buruaga, and Zurbaran Health Centers.
Investigators
Principal investigators
- Dr. Jesus Gardeazabal (Hospital Universitario Cruces)
- Dr. Rosa Mª Izu (Hospital Universitario Basurto)
Collaborators
- Alfonso Medela (AI Labs Group S.L.)
- Taig Mac Carthy (AI Labs Group S.L.)
Centers
- Health Center Sodupe-Güeñes
- Health Center Balmaseda
- Health Center Buruaga
- Health Center Zurbaran
External Organization
No additional organizations, beyond those previously mentioned, contributed to the clinical research. The study was conducted with the collaboration and resources of the specified entities.
Promoter and Monitor
Legit.Health® AI Labs Group S.L. Gran Vía 1, BAT Tower, 48001 Bilbao, Bizkaia, Spain
Report Annexes
- Ethics Committee resolution can be found in the document CEIm_DAO_Derivación_PS2022074.pdf.
- Instructions For Use (IFU) can be found in the protocol.
Signature meaning
The signatures for the approval process of this document can be found in the verified commits at the repository for the QMS. As a reference, the team members who are expected to participate in this document and their roles in the approval process, as defined in Annex I Responsibility Matrix of the GP-001, are:
- Author: Team members involved
- Reviewer: JD-003, JD-004
- Approver: JD-001