Reverse Time Stochastic Differential Equations for Generative Modelling
Created: March 23, 2023
Originally posted at Reverse Time Stochastic Differential Equations for generative modelling.
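What follows is a derivation of the main result of ‘Reverse-Time Diffusion Equation Models’ by Brian D.O. Anderson (1982). Earlier on this blog we learned that a stochastic differential equation of the form
$$ dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t $$
with drift $\mu$, diffusion coefficient $\sigma$ and the increment $dW_t$ of a Wiener process $W_t$ is accompanied by two partial differential equations: the Kolmogorov forward equation and the Kolmogorov backward equation.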
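A stochastic differential equation with a drift and a diffusion term can be simulated with the Euler-Maruyama scheme. The snippet below is a minimal sketch of that scheme, not code from the original post: the function name `euler_maruyama` and the Ornstein-Uhlenbeck example (drift $-x$, diffusion $\sqrt{2}$) are assumptions chosen for illustration.

```python
import numpy as np

def euler_maruyama(mu, sigma, x0, t0, t1, n_steps, rng):
    """Simulate dX_t = mu(X_t, t) dt + sigma(X_t, t) dW_t on [t0, t1]."""
    dt = (t1 - t0) / n_steps
    x = np.array(x0, dtype=float)
    t = t0
    for _ in range(n_steps):
        # Wiener increment over a step of length dt: dW ~ N(0, dt)
        dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)
        x = x + mu(x, t) * dt + sigma(x, t) * dw
        t += dt
    return x

# Hypothetical example: Ornstein-Uhlenbeck process dX = -X dt + sqrt(2) dW,
# whose stationary distribution is the standard Normal.
rng = np.random.default_rng(0)
x0 = np.full(20_000, 3.0)  # all particles start at x = 3
x1 = euler_maruyama(lambda x, t: -x, lambda x, t: np.sqrt(2.0),
                    x0, 0.0, 5.0, 500, rng)
print(x1.mean(), x1.std())
```

With these assumed coefficients the particles forget their starting point, and the empirical mean and standard deviation should end up close to those of a standard Normal.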
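The Kolmogorov forward equation is identical to the Fokker-Planck equation and states
$$ \partial_t p(x_t) = -\partial_{x_t}\big[\mu(x_t)\,p(x_t)\big] + \frac{1}{2}\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_t)\big]. $$
It describes the evolution of a probability distribution $p(x_t)$ forward in time under the drift $\mu$ and the diffusion coefficient $\sigma$ (their time arguments are suppressed for brevity).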
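The Fokker-Planck equation can be checked numerically on a case where the density is known in closed form. The following sketch assumes an Ornstein-Uhlenbeck process $dX_t = -X_t\,dt + \sqrt{2}\,dW_t$ started at the hypothetical point $x_0 = 3$, whose density at time $t$ is Normal with mean $3e^{-t}$ and variance $1 - e^{-2t}$. The helper `ou_density` and all grid parameters are illustrative choices, not part of the original post.

```python
import numpy as np

def ou_density(x, t, x0=3.0):
    """Density of X_t for dX = -X dt + sqrt(2) dW started at X_0 = x0."""
    m = x0 * np.exp(-t)            # mean decays towards 0
    v = 1.0 - np.exp(-2.0 * t)     # variance grows towards 1
    return np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

x = np.linspace(-6.0, 8.0, 2801)   # spatial grid with spacing 0.005
t, h = 1.0, 1e-4                   # evaluation time and time step for d/dt

p = ou_density(x, t)
# Left-hand side: central finite difference in time
dp_dt = (ou_density(x, t + h) - ou_density(x, t - h)) / (2.0 * h)

# Right-hand side: -d/dx[mu p] + 1/2 d^2/dx^2[sigma^2 p] with mu = -x, sigma^2 = 2
mu, sigma2 = -x, 2.0
drift_term = -np.gradient(mu * p, x)
diffusion_term = 0.5 * np.gradient(np.gradient(sigma2 * p, x), x)

residual = np.max(np.abs(dp_dt - (drift_term + diffusion_term)))
print(residual)
```

The residual is not exactly zero because of the finite-difference error, but it should be small compared with the size of the individual terms.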
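The Kolmogorov backward equation for $s \geq t$ reads
$$ -\partial_t p(x_s | x_t) = \mu(x_t)\,\partial_{x_t} p(x_s | x_t) + \frac{1}{2}\,\sigma^2(x_t)\,\partial_{x_t}^2 p(x_s | x_t) $$
and it basically answers the question how the probability of $x_s$ at the later time $s$ changes as we vary $x_t$ at the earlier time $t$.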
Taking inspiration from our crude example earlier, the backward equation offers a partial differential equation which we can solve backward in time, which would correspond to evolving the arbitrarily complex distribution backward to our original Normal distribution. Unfortunately, the backward equation by itself does not give us a corresponding stochastic differential equation, with a drift and a diffusion term, that describes the evolution of a random variable backward through time.
This is where the remarkable result from Anderson (1982) comes into play.
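The granddaddy of all probabilistic equations, Bayes' theorem, tells us that a joint distribution can be factorized by conditioning:
$$ p(x_s, x_t) = p(x_s | x_t)\,p(x_t), \qquad t \leq s. $$
Differentiating with respect to $t$ and applying the product rule gives
$$ -\partial_t p(x_s, x_t) = -\partial_t p(x_s | x_t)\,p(x_t) - p(x_s | x_t)\,\partial_t p(x_t), $$
into which we can plug in the Kolmogorov forward (KFE) and Kolmogorov backward (KBE) equations,
$$ -\partial_t p(x_s, x_t) = \Big( \mu(x_t)\,\partial_{x_t} p(x_s | x_t) + \frac{1}{2}\,\sigma^2(x_t)\,\partial_{x_t}^2 p(x_s | x_t) \Big)\,p(x_t) + p(x_s | x_t)\,\Big( \partial_{x_t}\big[\mu(x_t)\,p(x_t)\big] - \frac{1}{2}\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_t)\big] \Big). $$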
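The first derivative occurring in the backward Kolmogorov equation follows from $p(x_s | x_t) = p(x_s, x_t) / p(x_t)$ and the quotient rule,
$$ \partial_{x_t} p(x_s | x_t) = \frac{\partial_{x_t} p(x_s, x_t)}{p(x_t)} - \frac{p(x_s, x_t)\,\partial_{x_t} p(x_t)}{p^2(x_t)}. $$
The next step is to evaluate the derivative of the product in the drift term of the forward Kolmogorov equation,
$$ \partial_{x_t}\big[\mu(x_t)\,p(x_t)\big] = \partial_{x_t}\mu(x_t)\,p(x_t) + \mu(x_t)\,\partial_{x_t} p(x_t). $$
Substituting the derivatives of the probability distributions accordingly, the first-order terms combine into a single derivative of the joint distribution,
$$ \mu(x_t)\,\partial_{x_t} p(x_s | x_t)\,p(x_t) + p(x_s | x_t)\,\partial_{x_t}\big[\mu(x_t)\,p(x_t)\big] = \partial_{x_t}\big[\mu(x_t)\,p(x_s, x_t)\big], $$
and we obtain
$$ -\partial_t p(x_s, x_t) = \partial_{x_t}\big[\mu(x_t)\,p(x_s, x_t)\big] + \underbrace{\frac{1}{2}\,\sigma^2(x_t)\,p(x_t)\,\partial_{x_t}^2 p(x_s | x_t)}_{(1)} - \underbrace{\frac{1}{2}\,p(x_s | x_t)\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_t)\big]}_{(2)}. $$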
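In order to transform the partial differential equation above into a form from which we can deduce an equivalent stochastic differential equation, we match the terms of the second-order derivatives with the following identity,
$$ \frac{1}{2}\,\partial_{x_t}^2\big[p(x_s | x_t)\,\sigma^2(x_t)\,p(x_t)\big] = \underbrace{\frac{1}{2}\,\partial_{x_t}^2 p(x_s | x_t)\,\sigma^2(x_t)\,p(x_t)}_{(1)} + \partial_{x_t} p(x_s | x_t)\,\partial_{x_t}\big[\sigma^2(x_t)\,p(x_t)\big] + \underbrace{\frac{1}{2}\,p(x_s | x_t)\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_t)\big]}_{(2)}, $$
by observing that the terms (1) and (2) occur in both equations. We can see from the expansion of the derivative above that we can combine the second-order terms in our derivation, $(1) - (2)$, once we account for the “center term”. Solving the identity for (1) and using $p(x_s | x_t)\,p(x_t) = p(x_s, x_t)$ gives
$$ (1) - (2) = \frac{1}{2}\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_s, x_t)\big] - \partial_{x_t} p(x_s | x_t)\,\partial_{x_t}\big[\sigma^2(x_t)\,p(x_t)\big] - p(x_s | x_t)\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_t)\big]. $$
Furthermore we can employ the identity
$$ \partial_{x_t}\Big[p(x_s | x_t)\,\partial_{x_t}\big[\sigma^2(x_t)\,p(x_t)\big]\Big] = \partial_{x_t} p(x_s | x_t)\,\partial_{x_t}\big[\sigma^2(x_t)\,p(x_t)\big] + p(x_s | x_t)\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_t)\big] $$
to arrive at
$$ -\partial_t p(x_s, x_t) = -\partial_{x_t}\Big[p(x_s, x_t)\,\Big(-\mu(x_t) + \frac{1}{p(x_t)}\,\partial_{x_t}\big[\sigma^2(x_t)\,p(x_t)\big]\Big)\Big] + \frac{1}{2}\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_s, x_t)\big], $$
the result of which is in the form of a Kolmogorov forward equation, although using the joint probability distribution $p(x_s, x_t)$.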
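We then introduce the time reversal $\tau$, a time variable that runs opposite to $t$ so that $\partial_\tau = -\partial_t$,
$$ \partial_\tau p(x_s, x_t) = -\partial_{x_t}\Big[p(x_s, x_t)\,\Big(-\mu(x_t) + \frac{1}{p(x_t)}\,\partial_{x_t}\big[\sigma^2(x_t)\,p(x_t)\big]\Big)\Big] + \frac{1}{2}\,\partial_{x_t}^2\big[\sigma^2(x_t)\,p(x_s, x_t)\big], $$
which finally gives us a stochastic differential equation analogous to the Fokker-Planck/forward Kolmogorov equation that we can solve backward in time:
$$ dX_t = \Big[\mu(X_t) - \frac{1}{p(X_t)}\,\partial_{X_t}\big[\sigma^2(X_t)\,p(X_t)\big]\Big]\,dt + \sigma(X_t)\,d\tilde{W}_t, $$
where $\tilde{W}_t$ is a Wiener process that runs backward in time and $dt$ is a negative time increment.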
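To see a reverse-time stochastic differential equation of Anderson's form in action, the following sketch integrates one backward in time for a case where the marginal density is known in closed form. Everything specific here is an assumption made for illustration: the forward process is an Ornstein-Uhlenbeck process $dX_t = -X_t\,dt + \sqrt{2}\,dW_t$, the initial distribution is a hypothetical $\mathcal{N}(3, 0.5^2)$, and the names `marginal` and `reverse_drift` are invented. For a constant diffusion coefficient the correction term simplifies to $\frac{1}{p}\,\partial_x[\sigma^2 p] = \sigma^2\,\partial_x \log p$, which is what the code uses.

```python
import numpy as np

M0, S0, T = 3.0, 0.5, 3.0  # hypothetical initial N(M0, S0^2) and time horizon

def marginal(t):
    """Mean and variance of X_t for dX = -X dt + sqrt(2) dW, X_0 ~ N(M0, S0^2)."""
    decay = np.exp(-t)
    return M0 * decay, S0**2 * decay**2 + 1.0 - decay**2

def reverse_drift(x, t):
    """mu - (1/p) d/dx[sigma^2 p] = -x - 2 * d/dx log p(x, t) for sigma^2 = 2."""
    m, v = marginal(t)
    score = -(x - m) / v  # d/dx log p for a Normal density
    return -x - 2.0 * score

rng = np.random.default_rng(0)
n_steps, n_samples = 1000, 20_000
dt = T / n_steps

# Start from the (nearly standard Normal) marginal at the final time T
m_T, v_T = marginal(T)
x = rng.normal(m_T, np.sqrt(v_T), size=n_samples)

for k in range(n_steps):
    t = T - k * dt
    noise = rng.normal(0.0, np.sqrt(dt), size=n_samples)
    # One Euler-Maruyama step from t to t - dt (negative time increment)
    x = x - reverse_drift(x, t) * dt + np.sqrt(2.0) * noise

print(x.mean(), x.std())
```

Under these assumptions the samples should end up distributed approximately like the assumed initial distribution, that is with mean close to `M0` and standard deviation close to `S0`. In a generative model the closed-form score would be replaced by a learned approximation.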
By keeping the