Researchers in Korea have developed a deep learning-based artificial intelligence (AI) algorithm that can accurately classify skin disorders, predict malignancy, suggest primary treatment options, and serve as an ancillary tool to enhance the diagnostic accuracy of clinicians. With the assistance of this system, the diagnostic accuracy of both dermatologists and members of the general public improved significantly. This novel study is reported in the Journal of Investigative Dermatology.
Skin diseases are common, but it is not always easy to visit a dermatologist quickly or distinguish malignant from benign conditions. “Recently, there have been remarkable advances in the use of AI in medicine. For specific problems, such as distinguishing between melanoma and nevi, AI has shown results comparable to those of human dermatologists. However, for these systems to be practically useful, their performance needs to be tested in an environment similar to real practice, which requires not only classifying malignant versus benign lesion, but also distinguishing skin cancer from numerous other skin disorders including inflammatory and infectious conditions,” explained lead investigator Jung-Im Na, MD, Ph.D., Department of Dermatology, Seoul National University, Seoul, Korea.
Using a “convolutional neural network,” a specialized AI algorithm, investigators developed an AI system capable of predicting malignancy, suggesting treatment options, and classifying skin disorders. Investigators collected 220,000 images of Asians and Caucasians with 174 skin diseases and trained neural networks to interpret those images. They found that the algorithm could diagnose 134 skin disorders and suggest primary treatment options, render multi-class classification among disorders, and enhance the performance of medical professionals through Augmented Intelligence. Most prior studies have been limited to specific binary tasks, such as differentiating melanoma from nevi.
The algorithm’s performance was initially compared with that of 21 dermatologists, 26 dermatology residents, and 23 members of the general public. Its performance was similar to that of the dermatology residents but slightly below that of the dermatologists. After the initial test, the participants were informed of the algorithm’s results and allowed to modify their answers. The sensitivity of malignancy diagnosis among the 47 clinicians improved from 77.4 percent to 86.8 percent. Similarly, the sensitivity among the 23 members of the general public improved markedly, from 47.6 percent to 87.5 percent. Notably, based on the initial result, about half of the malignancies would have been missed by the general public and never referred to specialists.
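To make the sensitivity figures above concrete, here is a minimal Python sketch. It uses only the percentages reported in this article; the function names are illustrative and are not taken from the study. Sensitivity is the share of truly malignant cases that are flagged as malignant, so the share of malignancies missed is simply one minus the sensitivity.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly malignant cases that were flagged as malignant."""
    return true_positives / (true_positives + false_negatives)


def missed_fraction(sens: float) -> float:
    """Fraction of malignant cases that were not flagged (1 - sensitivity)."""
    return 1.0 - sens


# Sensitivities reported in the article, as fractions: (before, after)
# seeing the algorithm's output. No other study data is assumed here.
reported = {
    "clinicians": (0.774, 0.868),
    "general public": (0.476, 0.875),
}

for group, (before, after) in reported.items():
    gain_points = (after - before) * 100
    print(
        f"{group}: missed {missed_fraction(before):.1%} before, "
        f"{missed_fraction(after):.1%} after "
        f"(sensitivity up {gain_points:.1f} percentage points)"
    )
```

Read this way, an initial sensitivity of 47.6 percent means that slightly more than half of the malignant lesions went unrecognized by the general public before they saw the algorithm's answer.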
“Our results suggest that our algorithm may serve as an Augmented Intelligence that can empower medical professionals in diagnostic dermatology,” noted Dr. Na. “Rather than AI replacing humans, we expect AI to support humans as Augmented Intelligence to reach diagnoses faster and more accurately.”
The researchers caution that AI cannot definitively interpret images that it has not been trained to interpret, even when the problem presented is straightforward. For example, an algorithm trained only to differentiate between melanoma and nevi cannot recognize an image of a nail hematoma as something other than a melanoma or a nevus. If the shape of the hematoma is irregular, the algorithm may diagnose it as melanoma. They also point out that the algorithm was trained and tested using high-quality images, and its performance is generally suboptimal if the input images are of low quality.
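The nail hematoma example can be illustrated with a small Python sketch. This is not the researchers' model: the two-class setup, the function names, and the raw scores below are all hypothetical and chosen only for illustration. The point is that a classifier whose output is a softmax over a fixed set of labels has to divide all of its probability among those labels, so it has no way to answer "neither".

```python
import math

# The only labels this hypothetical classifier was trained on.
CLASSES = ["melanoma", "nevus"]


def softmax(scores):
    """Turn raw scores into probabilities that always sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable known label and the full probability list."""
    probs = softmax(scores)
    best = max(range(len(CLASSES)), key=lambda i: probs[i])
    return CLASSES[best], probs


# Made-up scores for an image of an irregular nail hematoma, a condition
# outside the label set. The values are an assumption, not measured data.
label, probs = classify([1.2, 0.3])
print(label, [round(p, 3) for p in probs])
```

Whatever scores the network produces, the answer is always one of the labels it was trained on, which is why a broader set of trained disorders matters for use in real practice.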
In addition, a diagnosis made from a single image with optimal composition has inherent limitations compared to a diagnosis made in a clinical setting. In real practice, a dermatological diagnosis is based on a combination of multiple sources of information, including past medical history, symptoms, appearance compared to other lesions on the patient, and the texture of the lesion assessed by physical contact.
“We anticipate that the use of our algorithm with a smartphone could encourage the public to visit specialists for cancerous lesions such as melanoma that might have been neglected otherwise,” commented Dr. Na. “However, there are issues with the quality or composition of photographs taken by the general public that may affect the results of the algorithm. If the algorithm’s performance can be reproduced in the clinical setting, it will be promising for the early detection of skin cancer with a smartphone. We hope that future studies will evaluate the utility and performance of our algorithms in a clinical setting.”
An early demo version of the team’s deep learning approach is available via its website. By analyzing data through the website, the researchers hope to identify possible problems that could still arise if the AI were used via telemedicine, which relies more heavily upon clinical photography to diagnose skin disorders. However, such diagnoses will still need to be verified by dermatologists along with the patient’s medical history and physical examination.
Elsevier