1–5 Sept 2025
ETH Zurich
Europe/Zurich timezone

Fast Synthetic X-Ray Generation for AI Diagnostics: DALL·E vs. Stable Diffusion

Not scheduled
20m
HIT G floor (gallery)

Speaker

RUKSHAK KAPOOR

Description

Medical imaging is foundational to clinical diagnostics and biomedical research, enabling the identification and monitoring of a wide range of conditions—from pulmonary diseases to cancer. However, the development of high-performance AI diagnostic systems is often hampered by restricted access to large, diverse, and well-annotated imaging datasets. This limitation is particularly acute for rare diseases, where data scarcity is compounded by patient privacy regulations, high acquisition costs, and the need for expert annotation.

To address these challenges, synthetic medical image generation has emerged as a compelling approach. By producing diagnostically relevant artificial samples, generative AI models can supplement real-world datasets, thereby improving model robustness, class balance, and training efficiency—without compromising data privacy.

In this work, we propose a comparative evaluation of two state-of-the-art generative models—DALL·E, a proprietary transformer-based text-to-image model developed by OpenAI, and Stable Diffusion, a leading open-source latent diffusion model. Our focus is on the generation of high-fidelity synthetic chest X-ray images for respiratory disease categories including COVID-19, tuberculosis, and pneumonia. These disease domains are selected due to their clinical significance and the frequent lack of balanced, high-quality data in public repositories.

Our methodology involves generating condition-specific images with both models from curated text prompts or class labels, then integrating these synthetic images into the training pipelines of deep learning classifiers. We will evaluate the diagnostic performance of classifiers trained on real-only, synthetic-only, and hybrid datasets, using metrics such as accuracy, precision, recall, and F1-score. Image quality will additionally be assessed with perceptual metrics (e.g., SSIM, FID) and, where feasible, expert review.
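As a minimal sketch of the evaluation step described above, the following computes accuracy, precision, recall, and F1-score for a diagnostic classifier from its predictions. The label convention (1 = disease, 0 = normal) and the toy predictions are illustrative assumptions, not results or code from the study:

```python
# Sketch: per-class evaluation metrics for a binary diagnostic classifier.
# In the study these would be computed for classifiers trained on
# real-only, synthetic-only, and hybrid datasets and then compared.

def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy example: 1 = pneumonia, 0 = normal (hypothetical predictions).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In practice these metrics would come from a library such as scikit-learn; the hand-rolled version above only makes the definitions explicit for the comparison across training regimes.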

Beyond performance evaluation, we will examine trade-offs between model accessibility, customization capabilities, generation speed, and deployment efficiency, highlighting practical considerations for integrating such tools into medical ML pipelines. Special attention will be given to the feasibility of using these generative models in fast and resource-constrained environments.

By framing synthetic medical image generation through the lens of fast and scalable machine learning, this study contributes to the broader goal of developing privacy-conscious, reproducible, and robust diagnostic systems, especially in data-scarce settings. The resulting code, models, and analysis pipeline will be made publicly available to support future research and collaborative efforts in the medical AI community.
