25 February 2026
Approximately 1 in 5 Americans will develop some form of skin cancer by the age of 70 [src]; yet when caught early, skin cancer is one of the most preventable and treatable forms of cancer [src].
At the same time, concerns are growing about sensitive data leakage and the need for greater control over where data is processed geographically. Together, these factors highlight the necessity of minimizing the data footprint.
The Dermatolog AI Scan application addresses these challenges with a lesion scanner that assesses tumor risk while preserving user privacy. The application uses the latest MedGemma AI model family, specifically MedSigLIP for image classification, and stores zero user data on the server.
Additionally, it can be deployed in any geographic location, or AI inference can even be executed locally. The solution is also designed as a template for next-generation clinical applications where privacy comes first.
Overall solution
The core of the application is the MedSigLIP model embedded into a stateless single-page application with several privacy-focused characteristics. The application performs classification to identify 11 dermatological conditions (a set that can easily be extended).
MedSigLIP also serves to generate Grad-CAM heatmaps (Saliency Maps) to visually indicate the specific areas of an image the AI focused on when making a prediction, thereby contributing to Explainable AI.
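To illustrate the Grad-CAM mechanics, here is a minimal sketch on a toy convolutional model standing in for the vision encoder; the actual application applies the same idea (gradients of the class score weighting the feature maps) to MedSigLIP's vision tower, and the model and class names below are illustrative only.

```python
import torch
import torch.nn.functional as F

# Toy CNN standing in for the vision encoder (illustration only).
conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
head = torch.nn.Linear(8, 2)  # two classes for the sketch

img = torch.randn(1, 3, 32, 32)

# Forward pass, keeping the intermediate feature maps for Grad-CAM.
feats = conv(img)                     # (1, 8, 32, 32)
feats.retain_grad()
pooled = feats.mean(dim=(2, 3))       # global average pool -> (1, 8)
score = head(pooled)[0, 1]            # score of the hypothetical "malignant" class

# Backward pass: gradient of the class score w.r.t. the feature maps.
score.backward()
weights = feats.grad.mean(dim=(2, 3), keepdim=True)  # per-channel importance
cam = F.relu((weights * feats).sum(dim=1))           # weighted sum + ReLU
cam = cam / (cam.max() + 1e-8)                       # normalize to [0, 1]
print(cam.shape)  # torch.Size([1, 32, 32])
```

The resulting map is upsampled to the input resolution and overlaid on the photo as the heatmap the user sees.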
Results are interpreted to give a quick confidence assessment, with a special case for malignant tumors: the application provides high-confidence malignancy reports even when the model cannot decide on the exact cancer type.
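The malignancy special case can be sketched as follows; the class names, probabilities, and threshold are illustrative assumptions, not the application's exact values. The idea is to pool probability mass across all malignant classes, so the report can flag malignancy even when no single cancer type wins.

```python
# Hypothetical malignant class set (illustrative, not the app's exact list).
MALIGNANT = {"melanoma", "basal cell carcinoma", "squamous cell carcinoma"}

def interpret(probs: dict[str, float], threshold: float = 0.6) -> str:
    """Report malignancy when malignant classes jointly dominate,
    even if no single cancer type is the top prediction."""
    top_label = max(probs, key=probs.get)
    malignant_mass = sum(p for label, p in probs.items() if label in MALIGNANT)
    if malignant_mass >= threshold:
        return f"high-confidence malignancy (joint p={malignant_mass:.2f})"
    return f"most likely: {top_label} (p={probs[top_label]:.2f})"

probs = {"melanoma": 0.35, "basal cell carcinoma": 0.30,
         "squamous cell carcinoma": 0.05, "benign nevus": 0.30}
print(interpret(probs))  # flags malignancy although "melanoma" alone is only 0.35
```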
Privacy-first capabilities
- Images are processed as DataURLs in the browser's memory; the backend processes them entirely in-memory without ever writing image files to the server's disk.
- Session-based isolation: each user is assigned a unique session, ensuring that their images and analysis results are isolated and disappear when the session is cleared or the browser tab is closed.
- No need for authentication.
- Local or server deployment options: designed to run locally or on private cloud instances to maintain patient data sovereignty.
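As a sketch of the session-isolation idea, a purely in-memory, TTL-bounded store might look like the following; this is an assumption about the shape of such a backend, not the application's actual code.

```python
import secrets
import time

class SessionStore:
    """In-memory, per-session storage: nothing is written to disk, and a
    session's images and results vanish when it is cleared or expires."""

    def __init__(self, ttl_seconds: float = 1800.0):
        self.ttl = ttl_seconds
        self._data: dict[str, tuple[float, dict]] = {}

    def create(self) -> str:
        sid = secrets.token_urlsafe(16)       # unguessable session id
        self._data[sid] = (time.monotonic(), {})
        return sid

    def get(self, sid: str):
        entry = self._data.get(sid)
        if entry is None or time.monotonic() - entry[0] > self.ttl:
            self._data.pop(sid, None)         # expired: drop everything
            return None
        return entry[1]

    def clear(self, sid: str) -> None:
        self._data.pop(sid, None)             # explicit wipe, e.g. on tab close

store = SessionStore()
sid = store.create()
store.get(sid)["image"] = "data:image/png;base64,..."  # held in memory only
store.clear(sid)
print(store.get(sid))  # None: the image is gone
```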
A bit more on technical details
- Local inference with google/medsiglip-448 MedGemma family model.
- Automatic lesion detection with YOLOv8-Nano for smart cropping that optimizes the image for preprocessing.
- Single-page, all-device friendly, with timeline navigation.
- Supports drag-and-drop, direct clipboard pasting, and the device camera.
- Debug mode with calibration settings.
Lessons learned from applying MedGemma (MedSigLIP) for Melanoma detection
1. The zero-shot challenge.
MedSigLIP aligns images and text in a shared embedding space, but out-of-the-box results can be prone to false positives. To mitigate this:
Prompt engineering is mandatory.
Don't just provide a label. Attach short descriptions of the image's characteristics, and add a contextual prefix: e.g., instead of "Photo of ..." use "Dermoscopy image revealing ..."
Vocabulary matters.
Use formal medical terminology the model was trained on (as found in PubMed). With "common" language, general-purpose models like SigLIP 2 actually tend to outperform the specialized MedSigLIP.
Add morphological descriptors.
Add morphological descriptors detailing shape, color, border, arrangement, and texture. E.g., when MedSigLIP reads the word "Psoriasis", its text encoder activates much more strongly when it also reads "erythematous plaques with silvery-white scale", because it mathematically maps those specific words to the visual features of redness and scaling it learned during pre-training.
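Putting the prompting advice together, a prompt builder might look like this; the descriptor wording (apart from the psoriasis example from above) and the template function are illustrative assumptions.

```python
# Illustrative prompt templates: contextual prefix + formal terminology +
# morphological descriptors (shape, color, border, texture).
DESCRIPTORS = {
    "melanoma": "an asymmetric pigmented lesion with irregular borders "
                "and variegated brown-black coloration",
    "psoriasis": "erythematous plaques with silvery-white scale",
}

def build_prompt(condition: str) -> str:
    # "Dermoscopy image revealing ..." rather than a bare "Photo of ..."
    return f"Dermoscopy image revealing {condition}, {DESCRIPTORS[condition]}"

prompts = [build_prompt(c) for c in DESCRIPTORS]
print(prompts[1])
# Dermoscopy image revealing psoriasis, erythematous plaques with silvery-white scale
```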
Medical triage.
You may want to experiment with medical triage to rule out false positives (I tried it but eventually got better results without it).
Select classes.
Not all conditions have the same accuracy in classification. Focus your classes on high-accuracy clusters rather than a broad net.
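For context, the zero-shot mechanics underlying all of the above can be sketched with stand-in embeddings; in the real application the vectors come from MedSigLIP's image and text encoders (one text embedding per prompted class), and the embedding size and logit scale below are illustrative.

```python
import torch
import torch.nn.functional as F

# Stand-in embeddings; in practice these come from MedSigLIP's image and
# text towers, with one text embedding per class prompt.
torch.manual_seed(0)
image_emb = F.normalize(torch.randn(1, 64), dim=-1)   # one image
text_embs = F.normalize(torch.randn(11, 64), dim=-1)  # 11 class prompts

# Zero-shot classification = cosine similarity in the shared embedding
# space, scaled and turned into a probability distribution over classes.
logit_scale = 100.0                                   # learned in the real model
logits = logit_scale * image_emb @ text_embs.T        # (1, 11)
probs = logits.softmax(dim=-1)[0]

pred = int(probs.argmax())
print(pred, float(probs[pred]))  # index and confidence of the best class
```

Because the final score is just a similarity between the image and each prompt, everything above (prefixes, terminology, descriptors, class selection) directly moves these numbers.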
2. Hospital vs. real world data.
MedSigLIP was primarily trained on hospital-grade, contrast-enhanced, and DICOM-derived images.
- The Issue: Mobile photos lack the controlled lighting and focus of clinical imaging.
- The Bias: Accuracy varies significantly across different skin tones. Research indicates a historical performance gap in derm-AI; for example, models trained on standard datasets (like HAM10000, which is ~95% Caucasian-centric) can see a 10-15% drop in diagnostic accuracy when transitioning from Fitzpatrick skin types I-II (fair) to types V-VI (darker tones).
3. Image preprocessing.
The model works with 448x448 px inputs. To maximize accuracy:
- Use object detection to center the lesion and crop the image before feeding it to the model.
- Preserve the aspect ratio of the object in the image. Since asymmetry is a primary indicator of melanoma, distortion can destroy the model's ability to "see" the pathology.
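Both points can be combined into a letterboxing step: crop to the detected lesion, then fit it onto a 448x448 canvas with padding instead of stretching. A minimal sketch with Pillow, where the bounding box is a hypothetical output of the lesion detector:

```python
from PIL import Image

def letterbox_448(img: Image.Image, box: tuple[int, int, int, int]) -> Image.Image:
    """Crop to the detected lesion, then resize onto a 448x448 canvas
    while preserving aspect ratio (padding instead of stretching)."""
    crop = img.crop(box)
    scale = 448 / max(crop.width, crop.height)
    resized = crop.resize((max(1, round(crop.width * scale)),
                           max(1, round(crop.height * scale))))
    canvas = Image.new("RGB", (448, 448), (0, 0, 0))      # black padding
    canvas.paste(resized, ((448 - resized.width) // 2,
                           (448 - resized.height) // 2))   # centered
    return canvas

# Hypothetical bounding box from the lesion detector.
img = Image.new("RGB", (1000, 600), (200, 150, 120))
out = letterbox_448(img, (100, 100, 500, 300))             # 400x200 lesion crop
print(out.size)  # (448, 448)
```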
4. Last, but crucial: privacy.
At just ~4GB for inference, we are finally reaching a point where privacy-sensitive medical images can be processed locally. No data needs to leave the user's device to get a high-quality initial screening.
Thank you
Links
- Open source code on GitHub
- Demo via GCP Cloud Run deployment
- Demo on Hugging Face
- The MedGemma Impact Challenge on Kaggle