Dermoscopic Algorithms Lack Reliability for Detecting Melanoma

April 13, 2016
Leah Lawrence

A new study has shown that algorithms of dermoscopic criteria used to detect melanoma had only modest levels of accuracy and lacked interobserver agreement among a group of regular dermoscopy users. Among six common algorithms examined, none seemed to be easy to learn or reliable.

“Our results confirm the need to further improve dermoscopic terminology, criteria, and algorithms,” wrote Cristina Carrera, MD, PhD, of Memorial Sloan Kettering Cancer Center, and colleagues in JAMA Dermatology. “To do so, future studies may benefit from crowd-sourcing and collective intelligence approaches, as well as the public image archive being created in the International Skin Imaging Collaboration Melanoma Project, which permits analysis and comparison of the areas within a lesion that users select as having unique dermoscopic structures.”

According to the study, the use of dermoscopy improves diagnostic accuracy compared with naked-eye examination alone; however, trained users often reach a diagnosis without using structured analytical criteria. For new dermoscopy users, several simplified diagnostic algorithms exist, such as the ABCD rule, the Menzies method, and the 3-point checklist. In this study, Carrera and colleagues set out to compare the diagnostic accuracy of these dermoscopic algorithms.

In the study, participants were randomly assigned to evaluate 1 of 12 image sets, each consisting of 39 or 40 images of melanoma and nevi. Participants included physicians, residents, and medical students who were directed to the study through a website. Of the 240 participants who registered, 130 evaluated at least 20 images, and 121 of these were regular dermoscopy users.

“Experts usually do not apply algorithms. In other words, evaluators may assign a diagnosis based on the overall impression of a lesion and then search for criteria to fit their decision,” the researchers explained. “To avoid this potential bias, participants in our study evaluated the presence and absence of dermoscopic features but did not apply an algorithm or make a diagnosis.”

The study revealed several criteria associated with melanoma (P < .001 for all):

Marked architectural disorder (odds ratio [OR], 6.6; 95% CI, 5.6–7.8),

Pattern asymmetry (OR, 4.9; 95% CI, 4.1–5.8),

Nonorganized pattern (OR, 3.3; 95% CI, 2.9–3.7),

Border score of 6 (OR, 3.3; 95% CI, 2.5–3.4), and

Contour asymmetry (OR, 3.2; 95% CI, 2.7–3.7).
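
For readers less familiar with how figures like these are derived, the sketch below shows one textbook way to compute an odds ratio and a 95% confidence interval from a 2x2 table, using the Wald method on the log scale. The article does not describe the statistical model the researchers used, so this is an illustration of the general idea rather than their method. The function name and all counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Hypothetical layout (not from the study):
        a = melanomas showing the feature
        b = nevi showing the feature
        c = melanomas without the feature
        d = nevi without the feature
    z = 1.96 corresponds to a 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts chosen only to illustrate the arithmetic.
or_value, ci_low, ci_high = odds_ratio_with_ci(a=60, b=20, c=40, d=80)
print(f"OR {or_value:.1f}; 95% CI {ci_low:.1f} to {ci_high:.1f}")
```

An interval that sits entirely above 1, as with each criterion listed above, indicates that the feature was seen more often in melanomas than in nevi.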

The researchers identified only a few criteria among the algorithms that met moderate levels of interobserver agreement.

“Criteria with the highest levels of discriminatory power and interobserver agreement included features not always highlighted in existing algorithms, such as comma vessels and absence of vessels, as well as subjective features that quantify the overall organization of a lesion, namely, architectural disorder and symmetry of pattern and contour,” the researchers wrote.
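
The article does not say which statistic was used to measure interobserver agreement. One common choice for two observers is Cohen's kappa, which corrects raw agreement for the agreement expected by chance; values of roughly 0.41 to 0.60 are conventionally described as moderate. The sketch below is a minimal illustration under that assumption. The function name and the ratings are hypothetical, and a study with many raters would typically need a multi-rater statistic instead.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items.

    Each argument is a sequence of labels, for example 1 if the rater
    judged a dermoscopic feature present in an image and 0 if absent.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    # Agreement expected by chance, from each rater's own label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up ratings for ten images (1 = feature present, 0 = absent).
observer_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
observer_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 1]
print(f"kappa = {cohens_kappa(observer_1, observer_2):.2f}")
```

In this invented example the observers agree on 7 of 10 images, but half of that agreement would be expected by chance, so kappa is considerably lower than the raw 70% figure.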

The researchers looked at the diagnostic accuracy of the ABCD rule, the Menzies method, the 7-point checklist, the 3-point checklist, chaos and clues, and CASH. Of the algorithms examined, the Menzies method had significantly higher sensitivity for detecting melanoma (95.1%) than any other method (P < .001), but it also had the lowest specificity (24.8%). In contrast, the ABCD rule had the highest specificity (59.4%).
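
The trade-off described above follows from how the two measures are defined: sensitivity is the share of melanomas an algorithm flags, and specificity is the share of nevi it correctly leaves unflagged. A minimal sketch of the arithmetic follows. The function name and the counts are hypothetical, chosen only to show a high-sensitivity, low-specificity pattern, and are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp = melanomas flagged as melanoma     fn = melanomas missed
    tn = nevi correctly not flagged        fp = nevi flagged as melanoma
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one melanoma and one nevus")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Made-up counts: 100 melanomas and 100 nevi scored by a very cautious rule.
sens, spec = sensitivity_specificity(tp=95, fn=5, tn=25, fp=75)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
```

A rule that flags almost every lesion misses few melanomas but also flags most benign nevi, which is why high sensitivity and low specificity tend to appear together.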

“We hope these efforts will lead to a unified dermoscopy algorithm, automated detection of criteria, and clinical decision support systems that facilitate population-based melanoma screening efforts,” the researchers concluded.