Veronica Rotemberg # 1, Nicholas Kurtansky # 2, Brigid Betz-Stablein 3, Liam Caffery 3, Emmanouil Chousakos 2 4, Noel Codella 5, Marc Combalia 6, Stephen Dusza 2, Pascale Guitera 7, David Gutman 8, Allan Halpern 2, Brian Helba 9, Harald Kittler 10, Kivanc Kose 2, Steve Langer 11, Konstantinos Lioprys 4, Josep Malvehy 6, Shenara Musthaq 2 12, Jabpani Nanda 2 13, Ofer Reiter 2 14, George Shih 15, Alexander Stratigos 4, Philipp Tschandl 10, Jochen Weber 2, H Peter Soyer 3
Abstract
Prior skin image datasets have not addressed patient-level information obtained from multiple skin lesions from the same patient. Although artificial intelligence classification algorithms have achieved expert-level performance in controlled studies examining single images, in practice dermatologists base their judgment holistically on multiple lesions from the same patient. The 2020 SIIM-ISIC Melanoma Classification challenge dataset described herein was constructed to address this discrepancy between prior challenges and clinical practice, providing for each image in the dataset an identifier that allows lesions from the same patient to be mapped to one another. This patient-level contextual information is frequently used by clinicians to diagnose melanoma and is especially useful in ruling out false positives in patients with many atypical nevi. The dataset represents 2,056 patients (20.8% with at least one melanoma, 79.2% with zero melanomas) from three continents with an average of 16 lesions per patient, and consists of 33,126 dermoscopic images, of which 584 (1.8%) are histopathologically confirmed melanomas and the remainder are benign melanoma mimickers.
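
As an illustration of how a per-image patient identifier might be used, the following minimal Python sketch groups images by patient and re-derives the summary figures quoted above. The record layout, the field order, and the sample rows are hypothetical and are not taken from the dataset itself; the per-patient average assumes roughly one image per lesion.

```python
from collections import defaultdict

# Counts quoted in the abstract.
N_PATIENTS = 2056
N_IMAGES = 33126
N_MELANOMAS = 584

# Derived summary figures.
melanoma_rate = N_MELANOMAS / N_IMAGES      # fraction of images that are melanoma
images_per_patient = N_IMAGES / N_PATIENTS  # mean number of images per patient

# Hypothetical records: (image_id, patient_id, is_melanoma).
# These rows are invented for illustration only.
records = [
    ("img_001", "patient_A", 0),
    ("img_002", "patient_A", 0),
    ("img_003", "patient_A", 1),
    ("img_004", "patient_B", 0),
    ("img_005", "patient_B", 0),
]


def group_by_patient(rows):
    """Map each patient identifier to the list of (image_id, label) pairs."""
    groups = defaultdict(list)
    for image_id, patient_id, label in rows:
        groups[patient_id].append((image_id, label))
    return dict(groups)


by_patient = group_by_patient(records)

# Patients with at least one melanoma among their images.
patients_with_melanoma = {
    pid for pid, images in by_patient.items()
    if any(label == 1 for _, label in images)
}

print(f"melanoma rate: {melanoma_rate:.1%}")
print(f"mean images per patient: {images_per_patient:.1f}")
print(f"patients with melanoma (toy data): {sorted(patients_with_melanoma)}")
```

Grouping in this way gives a classifier, or a reader, access to all of a patient's lesions at once, which is the context the dataset was designed to provide.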