Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction
Hyungjin Chung, Byeongsu Sim, Jong Chul Ye
Diffusion models are slow (Unconditional)
$x_{i+1} = f(x_i, i) + g(x_i, i)\, z_i$
Diffusion models are slow (Conditional)
$x'_{i+1} = f(x_i, i) + g(x_i, i)\, z_i$
$x_{i+1} = A x'_{i+1} + b$
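In code, one conditional step is just an unconditional reverse update followed by a linear data-consistency mapping. A minimal NumPy sketch, where `f`, `g`, `A`, and `b` are hypothetical placeholders for the learned reverse drift, the noise scale, and the measurement-consistency projection (not the paper's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def reverse_step(x, i, f, g):
    """Unconditional reverse update: x_{i+1} = f(x_i, i) + g(x_i, i) * z_i."""
    z = rng.standard_normal(x.shape)
    return f(x, i) + g(x, i) * z

def conditional_step(x, i, f, g, A, b):
    """Reverse update followed by data consistency: x_{i+1} = A x'_{i+1} + b."""
    x_prime = reverse_step(x, i, f, g)
    return A @ x_prime + b
```

Iterating `conditional_step` from pure noise down to step 0 samples from the conditional distribution, which is exactly the slow full-chain procedure questioned below.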
Why use the whole process?
Is this part necessary?
Forward diffusion initialization
$x_0 \rightarrow x_{N'} = a_{N'} x_0 + b_{N'} z$
• $t_0 = 0.3$, $N' = t_0 N = 300$ (use only this much of the chain)
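The jump to step $N'$ is a single reparameterization rather than $N'$ sequential noising steps. A sketch assuming a DDPM-style schedule with $a_{N'} = \sqrt{\bar{\alpha}_{N'}}$ and $b_{N'} = \sqrt{1-\bar{\alpha}_{N'}}$ (the coefficient choice and the toy schedule in the test are assumptions, not the paper's exact values):

```python
import numpy as np

def forward_init(x0, t0, N, alphas_cumprod, rng=None):
    """Forward-diffuse x0 directly to step N' = t0 * N in one shot:
    x_{N'} = a_{N'} x_0 + b_{N'} z, with a = sqrt(alpha_bar), b = sqrt(1 - alpha_bar)."""
    if rng is None:
        rng = np.random.default_rng()
    n_prime = int(round(t0 * N))          # e.g. t0 = 0.3, N = 1000 -> N' = 300
    a = np.sqrt(alphas_cumprod[n_prime])
    b = np.sqrt(1.0 - alphas_cumprod[n_prime])
    z = rng.standard_normal(x0.shape)
    return a * x0 + b * z, n_prime
```

Because the forward SDE is linear in $x_0$, this one-shot sample has the same distribution as running the forward chain step by step.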
Forward diffusion initialization
(Figure: the forward-diffused measurement $x_{N'}$ and the forward-diffused ground truth $\tilde{x}_{N'}$ look very similar)
Come-Closer-Diffuse-Faster
(Figure: starting from $x_{N'}$, does a shortcut path exist back to the ground truth $\tilde{x}_0$? Possible path?)
Come-Closer-Diffuse-Faster
$\varepsilon_0 := \|x_0 - \tilde{x}_0\|^2$ (error between the measurement and the ground truth)
$\bar{\varepsilon}_{N'} := \mathbb{E}\,\|x_{N'} - \tilde{x}_{N'}\|^2$ (expected error after forward diffusion)
$\bar{\varepsilon}_{0,r} := \mathbb{E}\,\|x_{0,r} - \tilde{x}_{0,r}\|^2$ (expected error after reverse diffusion)
Theoretical Findings
Theorem 1.
$\bar{\varepsilon}_{0,r} \le \dfrac{2C\tau}{1-\lambda^2} + \lambda^{2N'}\,\bar{\varepsilon}_{N'}$
Error decreases exponentially with reverse diffusion!
where $\tau = \mathrm{Tr}(A^{T}A)/n$ reflects the ill-posedness of the problem, and $\lambda$, $C$ are determined by the type of diffusion process
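Numerically, the second term of the bound decays geometrically in the number of reverse steps, so the bound contracts toward a constant floor. A quick check with illustrative constants ($\lambda$, $C$, $\tau$, and $\bar{\varepsilon}_{N'}$ here are made-up values, not from the paper):

```python
def ccdf_bound(eps_Nprime, n_steps, lam, C, tau):
    """Theorem 1: eps_{0,r} <= 2*C*tau / (1 - lam**2) + lam**(2*n_steps) * eps_{N'}."""
    return 2 * C * tau / (1 - lam ** 2) + lam ** (2 * n_steps) * eps_Nprime

# The bound contracts toward the constant floor 2*C*tau / (1 - lam**2):
bounds = [ccdf_bound(10.0, n, lam=0.95, C=0.01, tau=1.0) for n in (10, 100, 300)]
```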
Theoretical Findings
(Figure: error vs. reverse diffusion progress when using full reverse diffusion)
Theorem 2. (shortcut path)
• For any $0 < \mu \le 1$, there exists a minimum $N'$ s.t. $\bar{\varepsilon}_{0,r} \le \mu\, \varepsilon_0$
• The optimal $N'$ decreases as $\varepsilon_0$ gets smaller
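The existence claim can be checked by scanning for the smallest $N'$ whose Theorem-1 bound falls below $\mu\,\varepsilon_0$. All constants below, including the constant stand-in for $\bar{\varepsilon}_{N'}$, are illustrative assumptions; the sketch only shows that a tighter target $\mu$ demands a larger minimum $N'$:

```python
def minimal_Nprime(eps0, mu, lam, C, tau, eps_Nprime, n_max=10_000):
    """Smallest N' with 2*C*tau/(1-lam^2) + lam^(2*N') * eps_{N'} <= mu * eps0,
    or None if no N' up to n_max satisfies the bound."""
    floor = 2 * C * tau / (1 - lam ** 2)
    for n in range(1, n_max + 1):
        if floor + lam ** (2 * n) * eps_Nprime <= mu * eps0:
            return n
    return None

# A looser target (larger mu) is met with fewer reverse steps:
n_loose = minimal_Nprime(eps0=1.0, mu=0.5, lam=0.9, C=0.001, tau=1.0, eps_Nprime=10.0)
n_tight = minimal_Nprime(eps0=1.0, mu=0.1, lam=0.9, C=0.001, tau=1.0, eps_Nprime=10.0)
```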
Theoretical Findings
(Figure: error grows polynomially with forward diffusion and shrinks exponentially with reverse diffusion, yielding an optimal $N'$, i.e. $t_0$)
Theoretical Findings
Feed-forward network correction
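With a task-specific initializer, the pipeline becomes: feed-forward estimate, one-shot forward diffusion, then a short reverse diffusion. A sketch in which all three callables (`forward_net`, `diffuse`, `reverse_sample`) are hypothetical stand-ins for the actual trained components:

```python
def ccdf_with_correction(y, forward_net, diffuse, reverse_sample, t0, N):
    """CCDF with feed-forward correction (sketch):
    1) map the measurement y to an initial estimate with a pretrained network,
    2) forward-diffuse the estimate to step N' = t0 * N in one shot,
    3) run only N' reverse diffusion steps instead of the full N."""
    x0_hat = forward_net(y)              # e.g. a pretrained SR network
    n_prime = int(round(t0 * N))
    x_n = diffuse(x0_hat, n_prime)       # x_{N'} = a_{N'} x0_hat + b_{N'} z
    return reverse_sample(x_n, n_prime)  # conditional reverse diffusion
```

Because the initializer's estimate already has small $\varepsilon_0$, Theorem 2 suggests an even smaller $t_0$ suffices.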
Empirical Results: Super-resolution (SR)
20-step diffusion:
• ILVR, SR3: $N = 20$, $t_0 = 1.0$
• Proposed: $N = 100$, $t_0 = 0.2$
Empirical Results: Super-resolution (SR)
(Figure: FID vs. step budget; CCDF consistently outperforms ILVR under the same budget)
Empirical Results: Inpainting
20-step diffusion:
• Score-SDE: $N = 20$, $t_0 = 1.0$
• Proposed: $N = 100$, $t_0 = 0.2$
Empirical Results: CS-MRI
20-step diffusion:
• Chung & Ye, 2022: $N = 1000$, $t_0 = 1.0$ (full reverse diffusion)
• Proposed: $N = 1000$, $t_0 = 0.02$
Thank you!
Hyungjin Chung Byeongsu Sim Jong Chul Ye
hj.chung@kaist.ac.kr byeongsu.s@kaist.ac.kr jong.ye@kaist.ac.kr
Editor's Notes

  • #2: Hi, I’m Hyungjin Chung, and I will be presenting the work Come-Closer-Diffuse-Faster, which aims to accelerate diffusion models for inverse problems.
  • #3: Diffusion models are slow. They define an iterative chain that starts from pure Gaussian noise and is slowly denoised to form a high-fidelity image. Note that the diffusion process takes a few thousand iterations to perform well. Written as a stochastic difference equation, the update reads as follows, where z represents Gaussian noise.
  • #4: Diffusion models are flexible enough that unconditional models can be used directly to solve inverse problems without any fine-tuning. Just by imposing data-consistency constraints in between the reverse steps, one can sample from a conditional distribution given a measurement. However, again, the process is slow.
  • #5: When we inspect the diffusion process, specifically conditional diffusion, it seems odd that almost half of the reverse diffusion process does not contain any information relevant to the reconstruction. So we ask: is this part necessary?
  • #6: Let us assume that the total number of discretization steps, N, is set to one thousand, and N’ is defined as t_0 times N. For solving inverse problems, recall that we have the corrupted image x_0; due to the linearity of the forward diffusion, one can sample the intermediate step at N’ with a single reparameterization.
  • #7: When we compare this to the forward diffusion of the ground truth image, tilde x_0, the two images look very similar.
  • #8: Our intuition is that starting the reverse diffusion from x_N’ will eventually lead us to tilde x_0, given that our model is properly trained.
  • #9: To study this property more rigorously, let us first define epsilon_0 as the error between the ground truth and the measurement.
  • #10: Define bar epsilon N’ as the expected distance between the two images after the forward diffusion.
  • #11: Then, we finally define bar epsilon 0, reverse, as the expected distance between the two images after the reverse diffusion.
  • #12: Then, our first theorem tells us that the error, bar epsilon 0 reverse, will decrease exponentially with the reverse diffusion, according to the theory of stochastic contraction. Here, the specific values of lambda and C are determined by the type of diffusion process used, and tau by the ill-posedness of the problem.
  • #13: Visually, we can plot the error as the reverse diffusion progresses, as follows. This will be the case when we use full reverse diffusion.
  • #14: Our second theorem, our main result, states that for any mu between 0 and 1, there exists a minimum N’ such that the final error is bounded by mu times epsilon_0. Moreover, this optimal N’ decreases as epsilon_0, the initial error of the problem, gets smaller.
  • #15: Visually speaking, the error increases polynomially with the forward diffusion initialization and decreases exponentially with the reverse diffusion, so that we can find an optimal value of N’, or t_0.
  • #16: Even better, if we have a feed-forward network trained for a specific task, for example super-resolution, we can start directly from that estimate and use an even smaller number of diffusion steps, as in this figure.
  • #17: Using our strategy, we achieve state-of-the-art reconstruction results for various types of imaging problems. First, for super-resolution, we obtain high-quality reconstructions using only 20-step diffusion, whereas other diffusion-based methods such as SR3 or ILVR suffer heavily.
  • #18: Plotting the FID score, we again observe that CCDF consistently performs better than ILVR under the same budget.
  • #19: We see similarly high-quality results with 20-step diffusion for image inpainting, where the strategy adopted in, for example, Score-SDE fails to remove noise properly.
  • #20: Our final application shows that our strategy can be applied directly to compressed-sensing MRI, arguably the most practical setting, since fast reconstruction is of utmost importance in medical imaging. We show that our 20-step counterpart can even outperform the strategy that uses full reverse diffusion.
  • #21: Thank you for your attention.