@outliier

Since these videos take an enormous amount of time (this one took about 300 hours), would you like to see, additionally, paper explanations in the style of Yannic Kilcher (https://www.youtube.com/@YannicKilcher) ? I could cover papers very quickly after they are released and also cover topics I wouldn’t do an animated video for. Let me know what you think :)

@olivercaufield669

Best intro to Diffusion ever.
I bet this video is better than 80% of the introductory courses in college.

@amimaster

Wow, finally someone who makes EVERY math step evident, as simple as it might be. In my career I have rarely found someone who can make math so accessible; I often get lost after the first one or two "implied" steps.

@kirill2848

For people wondering why it's OK to use the first-order Taylor expansion at 32:26:

The full expansion is: 1 − (1/2)β_t − (1/8)β_t² − …
So if β_t is small, all the later terms are much smaller and become almost 0. The range in the original DDPM paper was 10^-4 to 0.02, so the first-order approximation dominates.
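A quick numerical check of this (my own illustration, not from the video): comparing sqrt(1 − β_t) against the first-order approximation 1 − β_t/2 at the two endpoints of the DDPM β range shows the error is negligible.

```python
import math

# sqrt(1 - beta) vs. its first-order Taylor approximation 1 - beta/2,
# evaluated at the endpoints of the DDPM beta schedule (1e-4 and 0.02).
for beta in (1e-4, 0.02):
    exact = math.sqrt(1.0 - beta)
    approx = 1.0 - 0.5 * beta
    print(f"beta={beta:g}: exact={exact:.10f}, approx={approx:.10f}, "
          f"error={abs(exact - approx):.2e}")
```

Even at the largest β_t = 0.02 the error is on the order of 5e-5, which is why dropping the higher-order terms is safe.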

@李佳礼

I can't believe anyone would dislike this video. This is an excellent one.

@huytruonguic

love your mathematics explanations and visualization, no fancy transitions were needed, just slow, simple, and clear English phrases

@outliier

32:38 To correct myself here: the paper does give an explanation of how to derive the sampler. I personally just find that approach much harder to understand, and generally the papers don't go into too much detail for their derivations.

@pavanpreetgandhi6763

This video was absolutely fantastic—I feel like I’ve finally learned about diffusion models the right way! I really appreciated how you started from the basics, gradually building up concepts and intuition, while clearly explaining the math at every step. It took me a few hours to get through the entire video, but the length and pace were perfect—there’s nothing I would change. Everything was covered so thoroughly. Thank you for the effort you put into this, and I’m excited to see more videos from you in the future!

@Cyan-g2g

Wow! I did not expect this video to go this deep. But this is awesome! Please make more in-depth explanations like this. It's clear a lot of hard work went into it, and the animation is sooo elegant

@angtrinh6495

Beautiful and clear explanation. Thank you, sir!

@UmbrabbitMagnolia

I have watched this video three times, and may watch it again. Thank you.

@boydkane5469

Had an epiphany watching you explain so many things that I never fully grasped, thank you so much

@venkatbalachandra5965

I absolutely love how you started from scratch, as in what the underlying PDF was. I'm working on a project on diffusion models and I don't know anything about it, and all the resources available are catered towards those with prerequisites I don't have yet, until this one. I haven't yet watched the whole thing, but I'm going to keep coming back to this till I understand everything in this video. Cheers mate!

@arpanpoudel

I used Score-SDE in my thesis and I have my defense next week :D what a timing

@jimlbeaver

It was like a lifetime of mathematics in 38 minutes! You clearly have an understanding of it and a talent for explaining it. I'd love to see you keep going on the diffusion stuff. It seems like a really important technology, and the more people that understand it and use it, the better.

Thanks, great job.

@novantha1

Your videos are somehow simultaneously timely and timeless. Your content is absolutely appreciated and I wish you the best in your endeavors.

@mrp8686

Hey, I really enjoyed the video and learned a lot! One small comment: at 8:10 you state that we aim to minimize both summands and, therefore, we want the gradient of s to be 0 at data points. From my understanding, we want to maximize p(x) at data points x. Therefore, the gradient of p(x) should be 0 at x (which minimizes the first summand), and the gradient of s (which corresponds to the Hessian of log(p)) should actually be negative.
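To illustrate this point with a toy example of my own (a 1-D standard Gaussian, not anything from the video): for p(x) = N(0, 1) we have log p(x) = −x²/2 + const, so the score is s(x) = −x. At the mode x = 0 the score itself vanishes, while its derivative (the Hessian of log p) is negative, not zero.

```python
# Toy check with a 1-D standard Gaussian: log p(x) = -x**2 / 2 + const,
# so the score is s(x) = d/dx log p(x) = -x.

def score(x):
    """Score of the standard Gaussian: gradient of log p."""
    return -x

def score_grad(x):
    """Derivative of the score, i.e. the Hessian of log p (constant -1 here)."""
    return -1.0

# At the data mode x = 0, where p(x) is maximized:
print(score(0.0))       # the score is 0 (gradient of log p vanishes)
print(score_grad(0.0))  # but its derivative is negative, not 0
```

So the quantity that is zero at data points is the score itself, while the gradient of the score stays negative there, matching the correction above.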

@tell2rain

7:35 I have a question: shouldn't the second line be −E_{p(x)}[∇_x s_θ(x)] = −∫ p(x) ∇_x s_θ(x) dx? But you wrote a positive sign?

@seismicoracle

Kudos for taking the mathematical route generally avoided by most online courses and proponents. If you could incorporate a few more simple graphs and figures to provide a complementary visual explanation of the math (a lot of work, I know) this will indeed be an outstanding resource!

@InturnetHaetMachine

Regarding your pinned comment: no offense to Yannic, but your explanations are 10x better. You actually understand the topics you cover, and you explain not only what is going on but also why. That, and your going into mathematical explanations, is really appreciated. Don't worry about the quantity; it's easy to read a paper and put out surface-level explanations for more views, but what you're doing is more valuable. Your videos are a treasure for amateur Deep Learning hobbyists like me who want to dig deeper into this field.