
Abstract
Mammography is the gold standard for the detection and diagnosis of breast cancer. This procedure can be significantly enhanced with Artificial Intelligence (AI)-based software, which assists radiologists in identifying abnormalities. However, training AI systems requires large and diverse datasets, which are often difficult to obtain due to privacy and ethical constraints. To address this issue, the paper introduces MAMmography ensemBle mOdel (MAMBO), a novel patch-based diffusion approach designed to generate full-resolution mammograms. Diffusion models have shown breakthrough results in realistic image generation, yet few studies have focused on mammograms, and none have successfully generated the high-resolution outputs required to capture fine-grained features of small lesions. To achieve this, MAMBO integrates separate diffusion models to capture both local and global (image-level) contexts. The contextual information is then fed into the final model, significantly aiding the noise removal process. This design enables MAMBO to generate highly realistic mammograms of up to 3840×3840 pixels. Importantly, this approach can be used to enhance the training of classification models and can be extended to anomaly segmentation. Experiments, comprising both numerical evaluation and radiologist validation, assess MAMBO's capabilities in image generation, super-resolution, and anomaly segmentation, highlighting its potential to enhance mammography analysis for more accurate diagnoses and earlier lesion detection.

Synthetic 3840×3840 mammogram generated using MAMBO. Details at different resolutions correspond to the global context (whole image), local context, and individual patch.
Method

In MAMBO's three-stage approach, the first stage generates a novel global context $x_0^G$. The second stage uses it to generate a set of local contexts $x_0^L$, each conditioned on a shifted global context $\vec{x}_0^G$. In the third stage, the synthetic global context and synthetic local contexts serve as conditioning to generate highly detailed patches $x_0^P$, which are finally combined to obtain a high-resolution full mammogram.
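The data flow of the three stages can be illustrated with a minimal sketch. It is not the authors' implementation: the three `sample_*` functions are hypothetical stand-ins that return random arrays instead of running diffusion models, and the sizes are toy values rather than 3840×3840. The non-overlapping patch grid and the modelling of the "shifted" global context as a circular translation that centres the target location are also assumptions made only for illustration.

```python
import numpy as np

PATCH = 64            # hypothetical patch side in pixels
GRID = 4              # hypothetical number of patches per image side
FULL = PATCH * GRID   # toy full-image side (256 here, not 3840)


def sample_global(rng):
    """Stand-in for stage 1: a low-resolution view of the whole image."""
    return rng.random((PATCH, PATCH))


def sample_local(rng, shifted_global):
    """Stand-in for stage 2: a local context conditioned on the shifted global context."""
    return rng.random((PATCH, PATCH))


def sample_patch(rng, shifted_global, local_ctx):
    """Stand-in for stage 3: a detailed patch conditioned on both contexts."""
    return rng.random((PATCH, PATCH))


def shift_to_centre(global_ctx, row, col, grid=GRID):
    """Assumed form of the shift: roll the global image so that grid cell
    (row, col) lands at the centre. np.roll wraps around at the borders."""
    size = global_ctx.shape[0]
    step = size // grid
    dr = size // 2 - (row * step + step // 2)
    dc = size // 2 - (col * step + step // 2)
    return np.roll(global_ctx, (dr, dc), axis=(0, 1))


def generate(rng):
    """Run the three stages in order and stitch the patches into one image."""
    global_ctx = sample_global(rng)                      # stage 1
    full = np.zeros((FULL, FULL))
    for r in range(GRID):
        for c in range(GRID):
            shifted = shift_to_centre(global_ctx, r, c)
            local_ctx = sample_local(rng, shifted)       # stage 2
            patch = sample_patch(rng, shifted, local_ctx)  # stage 3
            full[r * PATCH:(r + 1) * PATCH, c * PATCH:(c + 1) * PATCH] = patch
    return global_ctx, full


if __name__ == "__main__":
    g, image = generate(np.random.default_rng(0))
    print(g.shape, image.shape)
```

Replacing the stand-ins with trained conditional samplers would keep the same control flow: one global sample, then one local context and one patch per grid location, each conditioned on the outputs of the earlier stages.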
Examples


Expert radiologist annotations on MAMBO-generated images: masses and calcifications.


Synthetic full-resolution images (3840×3840 pixels) generated using MAMBO trained on the RSNA dataset.


Synthetic full-resolution images (3840×3840 pixels) generated using MAMBO trained on the VinDr dataset.
BibTeX
@article{vskipina2025mambo,
title={MAMBO: High-Resolution Generative Approach for Mammography Images},
author={{\v{S}}kipina, Milica and Jovi{\v{s}}i{\'c}, Nikola and Dall'Asen, Nicola and {\v{S}}venda, Vanja and Tur, Anil Osman and Ili{\'c}, Slobodan and Ricci, Elisa and {\'C}ulibrk, Dubravko},
journal={arXiv preprint arXiv:2506.08677},
year={2025}
}