Understanding the Denoising Diffusion Probabilistic Model (DDPMs), the Socratic Way

A deep dive into the motivation behind the denoising diffusion model and detailed derivations for the loss function

Published in TDS Archive · 69 min read · Feb 25, 2023

Photo by Chaozzy Lin on Unsplash

Denoising Diffusion Probabilistic Models by Jonathan Ho et al. is a great paper, but I had difficulty understanding it. So I decided to dive into the model and work out all the derivations. In this article, I will focus on the two main obstacles to understanding the paper:

Why the denoising diffusion model is designed in terms of the forward process, the forward process posteriors, and the backward process, and how these processes relate to one another. By the way, in this article I call the forward process posteriors “the reverse of the forward process” because I find the word “posteriors” confusing, and perhaps subconsciously I want to avoid a word that frightens me: every time it appears, things become complicated.
How to derive the mysterious loss function. The paper skips many steps in deriving the loss function Lₛᵢₘₚₗₑ. I went through all the derivations to fill in the missing steps. Now I realize that the derivation of the analytical formula for Lₛᵢₘₚₗₑ tells a truly beautiful Bayesian story. And after all the steps are filled in, the whole story…

Written by Wei Yi

I'm leading the Deep Learning team at AstraZeneca. Previously I worked at SecondMind and Microsoft Research, and I was also CTO of the hedge fund EQB.

Responses (5)
Saty Raghavachary (Mar 3, 2023): Excellent step-by-step analysis (derivation). What would rock even more is a Colab notebook that contains the steps in executable form :)

Sergio Gaitan (Jan 24, 2024): Thank you very much! The best material I have found on the topic until now. You are a hero, man!

Shekhawat Kushagra (Mar 7, 2023):
Line (8) splits the integrating variables into 4 parts, corresponding to x₀, xₜ₋₁, xₜ and xₒₜₕₑᵣ, and re-orders them.
Could you please explain the formula in line 8?
1. How are there 2 integrals in line 8, when in line 7 there was only one?
2. Is q(x_t, x_other, x_0) equal to q(x_t | x_0) * q(x_other | x_0) * q(x_0)?
Thank you
