This is the official implementation of the SAT-UNET model, as described in our paper:
GhalamClouds: Remote Sensing docs/images for Kazakhstan and Clouds Segmentation Using SAT-UNet
SAT-UNET is a deep learning model for cloud semantic segmentation, designed for remote sensing imagery. This repository provides all necessary code, configuration files, and utilities to train, validate, and infer with SAT-UNET, including multi-GPU support and experiment logging.
This repository has been updated to include methods for improving cross-dataset generalization, specifically targeting the domain shift between satellites (e.g., Landsat-8 vs. KazSTSat).
Key updates include:
-
Sat-SlideMix Augmentation: Implementation of the batch-level normalization technique adapted from Hopkins et al. (2025).
-
Percentile-Based Normalization: A robust 1-99th percentile linear normalization strategy.
Key Findings: In our recent experiments, we observed that while percentile-based normalization significantly speeds up convergence and stability, Sat-SlideMix drives the major improvements in performance metrics, while combination of these provide the best cross-dataset inference.
New SOTA: We achieved a new F1-score record of 97.87% on the Cloud-38 benchmark (surpassing the previously reported 97.69%).
-
checkpoints/
Saved model weights and checkpoints. -
configs/
YAML configuration files for experiments (e.g.,satunet.yml). -
data/
Data loading, augmentation, and dataset definitions. -
docs/
Documentation and requirements. -
models/
Model architectures, attention modules, and custom loss functions. -
utils/
Utility scripts for logging, checkpointing, patch merging, and more. -
train.py
Main training script. -
inference.py
Inference script for running predictions on new data. -
train_multi_gpu.py
Multi-GPU training is envoked by choosingmulti_gpuparameter intraining.typeof YAML configuration file. -
trainer.py
Single-GPU training is envoked by choosingtorchparameter intraining.typeof YAML configuration file.
The dataset should be organized into train, valid, and test (optional) folders.
Each split contains subfolders named after each color band (prefixes train_, valid_, test_). Example structure:
├──Dataset Root Folder
│------------├──train_red
│------------├──train_green
│------------├──train_blue
│------------├──train_nir
│------------├──train_gt
│
│------------├──valid_red
│------------├──valid_green
│------------├──valid_blue
│------------├──valid_nir
│------------├──valid_gt
Here is the visual diagram of our proposed Sat-UNet model.
Here is the visual diagram of our proposed NLP-inspired attention block for Sat-UNet model.
We conducted extensive experiments to validate the impact of our new augmentation and normalization pipeline.
Models are trained on 38-Cloud and tested on GhalamClouds80.
Note: No Norm means simple normalization, which is division by 8-bit maximum range (255).
| NIR | GT | No Norm, No Aug | Norm, No Aug | No Norm, Sat-SlideMix | Norm, Sat-SlideMix |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
All experiment parameters are set in the YAML config file, e.g., configs/satunet.yml.
To replicate experiments or run your own:
-
Model parameters:
xp_config.model_name: Model to use (e.g.,SatUNet)model.in_channels,model.out_channels: Input/output channels
-
Training parameters:
training.type: Training mode (lightning,torch, ormulti_gpu)training.multi_gpu.world_size: Number of GPUs for multi-GPU trainingtrainer.max_epochs: Number of training epochstrainer.precision: Precision (e.g.,bf16-mixed)
-
Data parameters:
data.data_root: Path to datasetdata.train_colors,data.valid_colors: Folder names for each modalitydata.train_batch_size,data.valid_batch_size: Batch sizes
-
Optimization:
model.optim_params: Optimizer settings (type, learning rate, weight decay, etc.)
-
Early Stopping
early_stop.patience: Number of epochs to stop training when validation loss doesn't improve
-
Logging:
logs.type: Logging backend (comet,tb_logs, etc.)logs.comet.api_key: Set your Comet API key (can be set via environment variableCOMET_API_KEY)- NOTE: in
train.pycomment out 35-36 lines to NOT LOG your COMET API KEY.
Change these parameters as needed to match your experimental setup.
- Python 3.12.7
- CUDA 12.1
- CUDNN 8.9.7
- Install dependencies:
pip install -r docs/requirements.txt- Run training
export COMET_API_KEY=YOUR_API_KEY
python3 train.py -p configs/satunet.yml- No thresholding (probability map)
python3 inference.py -d DataFolder -p configs/satunet.yml -c checkpoints/model.ckpt -s results- Thresholding
python3 inference.py -d DataFolder -p configs/satunet.yml -c checkpoints/model.ckpt -t 0.5 -s resultsIf you use GhalamClouds datasets and/or SAT-UNET in your research, please cite our paper:
@article{AIMYSHEV2025,
title = {GhalamClouds: Remote Sensing Images for Kazakhstan and Clouds Segmentation Using SAT-UNet},
journal = {Advances in Space Research},
year = {2025},
issn = {0273-1177},
doi = {https://doi.org/10.1016/j.asr.2025.09.098},
url = {https://www.sciencedirect.com/science/article/pii/S0273117725011299},
author = {Dias Aimyshev and Beket Tulegenov and Berik Smadyarov and Darkhan Nurzhakyp},
keywords = {Attention, Deep Learning, Image Processing, Remote Sensing, UNet},
abstract = {Cloud detection and segmentation represent a fundamental step in the analysis of optical remote sensing data. However, the performance of deep learning models heavily depends on the availability of task-specific datasets, which are often lacking. To address these challenges, we introduce two versions of the GhalamClouds dataset, derived from KazSTSat imagery, containing over 40,000 manually annotated cloud patches across six spectral channels and diverse landscapes of Kazakhstan. We also propose Sat-UNet, a U-Net backbone augmented with a spatial attention block designed to refine skip connections. Extensive experiments demonstrate that proposed model confidently competes with state-of-the-art models with significantly less parameters. Ablation studies further show that the proposed attention mechanism consistently outperforms the baseline UNet as well as SE and CBAM modules, with notable gains in F1 score, AUC, and Accuracy. We expect that GhalamClouds and Sat-UNet will provide valuable resources for advancing cloud segmentation research and improving model robustness in remote sensing applications.}
}






