(ICCV 25) From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning

From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning
ICCV 2025

arXiv Checkpoints

Abstract

Low-level enhancement and high-level visual understanding in low-light vision have traditionally been treated separately. Low-light enhancement improves image quality for downstream tasks but has limited generalization. Low-light visual understanding, constrained by scarce labeled data, primarily relies on task-specific domain adaptation, which lacks scalability. To address these challenges, we build a generalized bridge between low-light enhancement and low-light understanding, which we term Generalized Enhancement For Understanding (GEFU). This paradigm improves both generalization and scalability. To tackle the diverse causes of low-light degradation, we propose Semantically Consistent Unsupervised Fine-tuning (SCUF). Extensive experiments demonstrate that our proposed method outperforms current state-of-the-art approaches in terms of traditional image quality as well as GEFU tasks, including classification, detection, and semantic segmentation.

Preparation

1. Set up the environment

conda create -n scuf python=3.10
conda activate scuf
pip install -r requirements.txt

2. Prepare the unpaired dataset and test data

We train our model on unpaired images from EnlightenGAN and test on LSRW and LOLv2-real. The dataset can be found in SCUF. Put it in data/unpaired_data; the file structure should look like this:

GEFU
├── ...
├── data
│   ├── unpaired_data
│   │   ├── trainA
│   │   ├── trainB
│   │   ├── test_lsrw
│   │   ├── ...
│   │   ├── enlight_high.csv
│   │   ├── enlight_low.csv
│   │   ├── fixed_prompt_a.txt
│   │   ├── fixed_prompt_b.txt
├── gefu_eval
│   ├── classification
│   ├── detection
│   ├── segmentation
├── retinexnet
├── src
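
Before training, it can help to confirm that this layout is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the list of expected paths is copied from the tree above (entries hidden behind `...` are not checked).

```python
from pathlib import Path

# Paths taken from the tree above, relative to the repository root.
# Entries elided with "..." in the tree are intentionally not listed.
EXPECTED = [
    "data/unpaired_data/trainA",
    "data/unpaired_data/trainB",
    "data/unpaired_data/test_lsrw",
    "data/unpaired_data/enlight_high.csv",
    "data/unpaired_data/enlight_low.csv",
    "data/unpaired_data/fixed_prompt_a.txt",
    "data/unpaired_data/fixed_prompt_b.txt",
    "gefu_eval/classification",
    "gefu_eval/detection",
    "gefu_eval/segmentation",
    "retinexnet",
    "src",
]


def missing_entries(root="."):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected entries found.")
```

Run it from the GEFU root; an empty result means every path named in the tree exists.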

3. Generate captions

We have already generated captions for the images. To generate them yourself, run the following command:

python generate_captions.py \
  --dataroot data/unpaired_data/trainA --clip_model_path ./clip_model \
  --save_path './' --output_name trainA_captions.csv --gpu 0

Training and testing

1. Train

accelerate launch --main_process_port 29512 src/train.py \
    --pretrained_diffusion_model="stabilityai/sd-turbo" --pretrained_retinexnet_path "retinexnet/" \
    --dataset_folder "data/unpaired_data" --train_img_prep "randomcrop_hflip" --val_img_prep "no_resize" \
    --learning_rate="1e-5" --max_train_steps=25000 --train_batch_size=1 --gradient_accumulation_steps=1 \
    --output_dir="./output/" --report_to "wandb" --tracker_project_name "scuf" --enable_xformers_memory_efficient_attention
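
To get a rough sense of the training budget these flags imply, the arithmetic can be sketched as below. This is an illustration under stated assumptions, not code from the repository: it assumes that `--max_train_steps` counts optimizer updates and that the effective batch size is the product of the per-process batch size, the gradient accumulation steps, and the number of processes launched by accelerate. The function name `samples_seen` and the `num_processes` parameter are hypothetical.

```python
def samples_seen(max_train_steps, train_batch_size,
                 gradient_accumulation_steps, num_processes=1):
    """Return (effective batch size, total training samples).

    Assumes one optimizer update per `max_train_steps` unit, which is an
    assumption about src/train.py rather than a documented fact.
    """
    effective_batch = (train_batch_size
                       * gradient_accumulation_steps
                       * num_processes)
    return effective_batch, max_train_steps * effective_batch


if __name__ == "__main__":
    # Values from the training command above, on a single process.
    effective, total = samples_seen(25000, 1, 1)
    print(f"effective batch size: {effective}, samples seen: {total}")
```

Under these assumptions, raising `--gradient_accumulation_steps` increases the effective batch size without increasing per-step memory use.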

2. Test

Download our fine-tuned model SCUF. We also provide Our Results on all tasks.

python src/inference.py --model_path "test_model.pkl" \
    --input_file "data/unpaired_data/test_lsrw" \
    --prompt "natural light, bright lighting, soft illumination, high light, evenly lit, clear visibility, daylight, bright atmosphere, well-lit photo." \
    --direction "a2b" --output_dir "output/results" --image_prep "no_resize"
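
The command above processes a single folder. To enhance several test folders in one go, the same invocation can be wrapped in a small script. This is a sketch under assumptions, not part of the repository: the helper names `build_inference_cmd` and `run_inference` are hypothetical, the flags and prompt are copied from the command above, and writing each folder's results to its own subdirectory of output/results is an illustrative choice.

```python
import shlex
import subprocess

# Prompt copied verbatim from the test command above.
PROMPT = ("natural light, bright lighting, soft illumination, high light, "
          "evenly lit, clear visibility, daylight, bright atmosphere, "
          "well-lit photo.")


def build_inference_cmd(input_dir, output_dir, model_path="test_model.pkl"):
    """Build the argument list for one src/inference.py run (hypothetical helper)."""
    return [
        "python", "src/inference.py",
        "--model_path", model_path,
        "--input_file", input_dir,
        "--prompt", PROMPT,
        "--direction", "a2b",
        "--output_dir", output_dir,
        "--image_prep", "no_resize",
    ]


def run_inference(input_dirs, output_root="output/results"):
    """Run inference once per folder, stopping at the first failure."""
    for input_dir in input_dirs:
        # Use the folder name as the output subdirectory (an assumption).
        name = input_dir.rstrip("/").split("/")[-1]
        cmd = build_inference_cmd(input_dir, f"{output_root}/{name}")
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command for the LSRW folder instead of executing it.
    cmd = build_inference_cmd("data/unpaired_data/test_lsrw",
                              "output/results/test_lsrw")
    print(shlex.join(cmd))
```

Calling `run_inference(["data/unpaired_data/test_lsrw"])` from the GEFU root would execute the run; add further folder paths to the list as needed.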

For the IQA test, run python src/iqa_eval.py.

For the GEFU test, see gefu_eval.

Acknowledgement

We would like to thank the authors of img2img-turbo, IP-Adapter and ZeroShotDayNightDA for their open-source release.

Citation

If you find this repo useful for your research, please consider citing our paper:

@article{wang2025enhancement,
  title={From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning},
  author={Wang, Sen and Zeng, Shao and Gu, Tianjun and Zhang, Zhizhong and Zhang, Ruixin and Ding, Shouhong and Zhang, Jingyun and Wang, Jun and Tan, Xin and Xie, Yuan and Ma, Lizhuang},
  journal={arXiv preprint arXiv:2507.08380},
  year={2025}
}
