Bottom Line
For Patients: 5 Common Questions
Q1. Are smartphone mole-check apps accurate?
Available apps vary widely, and most lack rigorous prospective clinical validation. The DEXI system used in this study runs on professional dermoscopy images within a commercial AI workflow, which is very different from a casual phone photo. Apps can prompt you to see a dermatologist, but they cannot replace clinical diagnosis.
Q2. Can AI safely rule out melanoma for me?
Not yet. In this 2026 JAAD study, DEXI misclassified 10/114 dermoscopic images; critically, 5 out of 20 melanomas were classified as ordinary nevi (25% miss rate). That is not safe enough for AI to act as a standalone rule-out tool. If a mole changes (growth, asymmetry, irregular border, color variation, bleeding, elevation), see a dermatologist.
Q3. AI looks at the same areas dermatologists do — does that mean AI is reading skin like a human?
A heat map shows regions associated with model output, but it does not prove the model uses the same clinical reasoning. Prior work has shown AI being misled by ruler marks, hair, or illumination artifacts. Heat maps that resemble where dermatologists look are a promising starting point, not a finish line.
Q4. Can I use generative AI images to compare my rash at home?
No. Lipner notes in the same JAAD Reviews issue that generative AI images in dermatology still struggle with skin-tone representativeness and morphologic accuracy. They suit teaching material and data augmentation, not lay self-diagnosis.
Q5. Will future visits be AI-only?
Realistically, AI is heading toward a "second set of eyes" role. AI can flag regions for a closer look or help triage higher-risk lesions, but history, holistic visual assessment, and biopsy when needed remain physician tasks. AI changes the workflow, not the role of the dermatologist.
- Lipner's JAAD Reviews commentary frames dermatology AI along two active fronts: diagnosis / image analysis and medical education / data augmentation.
- Kremer et al compared eye-tracking heat maps from 4 dermatologists with DEXI AI heat maps across 114 dermoscopic images.
- The median dermatologist-DEXI pixel-wise correlation was r = 0.540, approaching inter-dermatologist agreement at r = 0.591, and higher than null comparisons at r = 0.434.
- DEXI misclassified 10/114 images; importantly, 5/20 melanomas were misclassified as nevi. This argues against reading the study as evidence that AI can independently rule out melanoma.
- Generative AI may support medical education and data augmentation, but Lipner emphasizes ongoing problems with bias, representativeness, and domain-specific accuracy.
Three Current Roles for AI in Dermatology
If we reduce dermatology AI to "take a phone photo of a mole and get a melanoma answer," we miss the real landscape. Lipner's 2026 JAAD Reviews commentary links two complementary directions: explainable diagnostic AI and generative models for education and data augmentation.
| Role | Reasonable use today | Main risk |
|---|---|---|
| Diagnostic support | Analyze dermoscopic images as a second-read or risk-stratification aid. | Dataset bias, image-quality variation, underrepresentation of rare patterns, and missed melanoma. |
| Explainability tool | Use heat maps or saliency maps to ask whether the model is attending to clinically meaningful areas. | Heat maps are not causal proof; different methods may produce different maps and may attend to artifacts. |
| Education and data augmentation | Create teaching material, augment rare-disease imagery, and support training sets. | Generated images may perpetuate bias and may not preserve true morphologic or pathologic features. |
This article is about AI roles and research methodology, not drug treatment. It therefore does not include Taiwan NHI drug reimbursement criteria. If a future article discusses melanoma treatment, immune checkpoint inhibitors, or BRAF/MEK targeted therapy, reimbursement criteria should be checked separately against pathology, staging, BRAF testing, ECOG status, imaging, and prior-authorization requirements.
How Was the Kremer 2026 JAAD Study Designed?
Kremer et al asked a focused question: do AI-generated heat maps in dermoscopy highlight the same regions that dermatologists visually inspect? Four dermatologists, blinded to diagnosis, reviewed dermoscopic images while their eye movements were tracked. The same images were analyzed by DEXI to generate class activation maps, and the overlap was measured using pixel-wise rank correlation.
| Design element | Details |
|---|---|
| Images and lesion types | Mainly HAM10000, with a small additional contribution from MSKCC and BCN20000; lesion types were melanoma, BCC, SCC, nevi, benign keratoses, and vascular lesions. |
| Final analysis set | Six images were excluded for technical reasons, leaving 114 images: 60 benign and 54 malignant. |
| Readers | Four dermatologists, including one dermoscopy expert with over 35 years of experience and three younger dermatologists. |
| AI system | DEXI (Dermoscopy EXplainable Intelligence), implemented through Vectra software. |
| Main analysis | Pixel-wise rank correlation between dermatologist gaze maps and DEXI maps; inter-dermatologist correlations served as the upper reference, and non-homologous pairings as the lower reference. |
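To make the main analysis concrete, here is a minimal Python sketch of a pixel-wise rank correlation, assuming it is a Spearman correlation over flattened pixel values. The function names, the toy 2×2 maps, and the choice to pair every gaze map with every other image's AI map as the lower reference are illustrative assumptions, not the authors' code or their exact pairing scheme.

```python
from statistics import median


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / (vx * vy) ** 0.5


def rank_correlation(map_a, map_b):
    """Spearman correlation between two equally sized 2D heat maps."""
    flat_a = [v for row in map_a for v in row]
    flat_b = [v for row in map_b for v in row]
    return pearson(average_ranks(flat_a), average_ranks(flat_b))


def median_correlations(gaze_maps, ai_maps):
    """Median correlation for same-image pairs and for mismatched pairs."""
    n = len(gaze_maps)
    same = [rank_correlation(gaze_maps[i], ai_maps[i]) for i in range(n)]
    # Assumed lower reference: gaze map of image i vs AI map of image j != i.
    mismatched = [
        rank_correlation(gaze_maps[i], ai_maps[j])
        for i in range(n)
        for j in range(n)
        if i != j
    ]
    return median(same), median(mismatched)


# Hypothetical toy data: two images, each a 2x2 grid of attention values.
gaze_maps = [[[0.9, 0.1], [0.2, 0.0]], [[0.0, 0.3], [0.1, 0.8]]]
ai_maps = [[[0.7, 0.2], [0.3, 0.1]], [[0.1, 0.2], [0.0, 0.9]]]

same_r, null_r = median_correlations(gaze_maps, ai_maps)
print(f"same-image median r = {same_r:.2f}, mismatched median r = {null_r:.2f}")
```

The quantity of interest is the gap between the two medians. In the study, the corresponding figures were r = 0.540 for dermatologist-DEXI pairs on the same image and r = 0.434 for the null comparisons.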
How Should We Read the Main Numbers?
The most conservative interpretation is that DEXI heat maps substantially overlap with dermatologist visual attention and approach the agreement seen among dermatologists themselves. This supports potential interpretability: the model is not simply attending to areas obviously unrelated to human diagnostic inspection.
It does not mean that AI diagnoses like a dermatologist. DEXI misclassified 10/114 images, and 5/20 melanomas were misclassified as nevi. That error is clinically consequential and rules out any conclusion that AI can independently exclude melanoma.
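The two error figures can be put on a common footing with a short calculation. The sketch below uses only the counts reported above. The Wilson score interval is added for illustration, to show how much uncertainty a denominator of 20 carries, and is not a statistic taken from the paper.

```python
from math import sqrt


def wilson_interval(events, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is ~95%)."""
    p = events / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Counts as reported in the article.
overall_error = 10 / 114        # all misclassified images
melanoma_to_nevus = 5 / 20      # melanomas labeled as nevi

# Illustrative addition, not from the paper: uncertainty around 5/20.
low, high = wilson_interval(5, 20)

print(f"overall misclassification: {overall_error:.1%}")
print(f"melanoma labeled as nevus: {melanoma_to_nevus:.1%}")
print(f"approx. 95% interval for 5/20: {low:.1%} to {high:.1%}")
```

The overall error rate looks modest, but the melanoma-specific rate is far higher, and its interval is wide because only 20 melanomas were included. Both points support reading the result as a caution rather than a performance estimate.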
Why did gaze maps differ from fixation maps?
Dermatologist-DEXI correlation was higher for gaze maps than fixation maps (about r = 0.53 vs r = 0.46, P < .001). The authors suggest that diagnostically important anchors may be seen briefly during early global visual processing, whereas later fixation patterns may reflect individual confirmation strategies and therefore vary more between readers.
Why was overlap higher in incorrectly diagnosed lesions?
Dermatologist-DEXI heat map correlation was higher for incorrectly diagnosed lesions than for correctly diagnosed lesions (r = 0.568 vs r = 0.521). The likely interpretation is not "higher overlap equals greater accuracy," but rather that difficult lesions may trigger longer and broader visual search, creating more overlap with AI maps.
Limits: Plausible Heat Maps Are Not Causal Explanations
The most important methodological warning is that a heat map is not a recording of model reasoning. It can show regions associated with model output, but it does not prove that the model used the same dermoscopic logic as a human reader. Kremer et al note that different saliency methods can yield different maps, and prior work has shown that models may attend to image artifacts rather than lesion features.
- Small sample size within each lesion subtype.
- Lesion size data were unavailable and may have influenced correlations.
- Heat-map overlap does not prove shared clinical reasoning.
- The study did not test whether AI improved clinician diagnostic accuracy, confidence, or patient outcomes.
- DEXI is a specific commercial AI system and workflow; the results should not be generalized to every dermoscopy app.
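The warning that different saliency methods can yield different maps can be illustrated with a deliberately tiny example. The sketch below is a toy under stated assumptions: a hypothetical three-pixel linear model stands in for a classifier, and "gradient" and "occlusion" saliency stand in for the wider family of explanation methods. It does not reproduce DEXI's class activation maps or any method evaluated by Kremer et al.

```python
def score(weights, pixels):
    """Toy linear model: the output is a weighted sum of pixel values."""
    return sum(w * p for w, p in zip(weights, pixels))


def gradient_saliency(weights, pixels):
    """Sensitivity of the output to each pixel; for a linear model, |weight|.

    `pixels` is unused here because a linear model's gradient does not
    depend on the input; it is kept so both methods share a signature.
    """
    return [abs(w) for w in weights]


def occlusion_saliency(weights, pixels, baseline=0.0):
    """Change in output when each pixel is replaced by a baseline value."""
    full = score(weights, pixels)
    saliency = []
    for i in range(len(pixels)):
        occluded = list(pixels)
        occluded[i] = baseline
        saliency.append(abs(full - score(weights, occluded)))
    return saliency


def top_pixel(saliency):
    """Index of the pixel a method marks as most important."""
    return max(range(len(saliency)), key=lambda i: saliency[i])


# Hypothetical model weights and one hypothetical input.
weights = [0.2, 1.0, 0.5]
pixels = [0.9, 0.1, 0.8]

grad_top = top_pixel(gradient_saliency(weights, pixels))
occl_top = top_pixel(occlusion_saliency(weights, pixels))
print(f"gradient highlights pixel {grad_top}, occlusion highlights pixel {occl_top}")
```

Same model, same input, two different "most important" pixels. Gradient saliency asks how sensitive the output is to a pixel, while occlusion asks how much that pixel actually contributed for this input. Neither answer is a recording of the model's reasoning, which is why a plausible-looking map needs independent validation.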
Resident Takeaways for Journal Club
1. This is an explainability paper, not primarily an accuracy paper
The key question is not "Is DEXI accurate?" but "Does DEXI attend to areas that human dermatologists visually inspect?"
2. Explainability is a clinical adoption threshold
If a model only gives a score, clinicians cannot tell whether it noticed pigment network, asymmetry, border irregularity, color heterogeneity, or whether it was distracted by ruler marks, hair, illumination, or other artifacts.
3. Generative AI may help education, but dermatologists must audit it
Lipner's summary of text-to-image models is appropriately cautious: they may help rare disease imaging, data augmentation, and medical education, but bias and domain-specific accuracy remain unresolved.
Clinical Position: AI as a Second Set of Eyes
At present, AI is best used as a second set of eyes: to flag regions worth review, support teaching, and make model output more auditable. For patients, AI apps cannot replace history, clinical examination, dermoscopy, longitudinal follow-up, and biopsy when indicated. For clinicians, AI should sharpen rather than replace dermoscopy training.
References
- Lipner SR. Highlights from JAAD Reviews: May 2026 - Artificial intelligence: Where are we now? J Am Acad Dermatol. 2026;94(5):1434-1435. doi:10.1016/j.jaad.2026.02.070.
- Kremer N, Polo-Silveira L, Bajaj S, et al. Comparing dermatologists' and artificial intelligence heat maps in dermoscopic image analysis via eye tracking. J Am Acad Dermatol. 2026;94(5):1461-1468. doi:10.1016/j.jaad.2025.12.104.
- Chanda T, Haggenmueller S, Bucher TC, et al. Dermatologist-like explainable AI enhances melanoma diagnosis accuracy: eye-tracking study. Nat Commun. 2025;16(1):4739. doi:10.1038/s41467-025-59532-5.
- Hauser K, Kurz A, Haggenmueller S, et al. Explainable artificial intelligence in skin cancer recognition: a systematic review. Eur J Cancer. 2022;167:54-69. doi:10.1016/j.ejca.2022.02.025.
- Brancaccio G, Balato A, Malvehy J, Puig S, Argenziano G, Kittler H. Artificial intelligence in skin cancer diagnosis: a reality check. J Invest Dermatol. 2024;144(3):492-499. doi:10.1016/j.jid.2023.10.004.
- Giavina-Bianchi M, Vitor WG, Fornasiero de Paiva V, Okita AL, Sousa RM, Machado B. Explainability agreement between dermatologists and five visual explanations techniques in deep neural networks for melanoma AI classification. Front Med (Lausanne). 2023;10:1241484. doi:10.3389/fmed.2023.1241484.