Flow matching is a generative modeling framework that transforms a simple probability distribution into a complex target distribution. Rather than modeling the data density directly, flow matching learns a continuous-time vector field that guides samples from a source distribution, typically a standard Gaussian, toward the target data distribution. This approach enables efficient sampling through ordinary differential equations and offers advantages in training stability. Flow matching is closely related to diffusion models, but it directly regresses onto a target vector field rather than learning score functions or denoising processes.
The core of flow matching is the vector field, which defines how particles move through space over time. This vector field, denoted v(t, x), specifies the direction and speed of movement at each position x and time t. Time typically ranges from 0 to 1, where t = 0 corresponds to the source distribution and t = 1 to the target distribution. A neural network is trained to approximate this vector field, and as particles follow it, they trace paths from the source to the target, effectively realizing the transformation between the two distributions.
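To make this concrete, below is a minimal sketch of how such a vector field could be parameterized in PyTorch. The VectorField class, its layer widths, and the choice to condition on time by concatenating t to x are illustrative assumptions, not details from the text above.

```python
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """Illustrative network mapping (t, x) to a velocity with the same shape as x."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        # Input is the position x concatenated with the scalar time t.
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, hidden),
            nn.SiLU(),
            nn.Linear(hidden, dim),  # output: v(t, x)
        )

    def forward(self, t: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # t has shape (batch,); x has shape (batch, dim).
        return self.net(torch.cat([x, t[:, None]], dim=-1))
```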
The learning objective in flow matching is elegantly simple. Instead of directly modeling complex probability distributions, we train a neural network to regress onto a target vector field, the velocity of a prescribed conditional probability path; this objective is known as conditional flow matching. The loss function minimizes the squared difference between the predicted vector field v_theta and this target. A common choice of conditional path is the straight line between a paired source point x₀ and data point x₁: the point at time t is x_t = (1 - t) x₀ + t x₁, and the corresponding target velocity is simply x₁ - x₀. This approach is computationally efficient because it never requires evaluating probability densities or their gradients. As training progresses, the learned vector field aligns with the target, enabling accurate transformation between distributions.
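Under those straight-line paths, the loss takes only a few lines. This sketch assumes the (t, x) calling convention of the VectorField sketch above; the function name flow_matching_loss is hypothetical.

```python
import torch

def flow_matching_loss(model, x1):
    # x1: a batch of data samples, shape (batch, dim).
    x0 = torch.randn_like(x1)                      # source samples (standard Gaussian)
    t = torch.rand(x1.shape[0], device=x1.device)  # times drawn uniformly from [0, 1]
    # Point on the straight-line path between x0 and x1 at time t.
    xt = (1 - t[:, None]) * x0 + t[:, None] * x1
    target = x1 - x0                               # velocity of the straight-line path
    pred = model(t, xt)
    # Squared difference between predicted and target vector fields.
    return ((pred - target) ** 2).mean()
```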
Once we've trained a flow matching model, sampling from it is straightforward. First, we sample a point x₀ from the source distribution, typically a standard Gaussian. Then, we solve the ordinary differential equation defined by our learned vector field: dx/dt = v_theta(t, x(t)), with initial condition x(0) = x₀. To solve this ODE, we can use numerical methods such as Euler integration or higher-order Runge-Kutta schemes. As we integrate from t = 0 to t = 1, the sample follows the vector field and transforms from the source distribution to the target distribution. This deterministic sampling process is a key advantage of flow matching models: it typically needs far fewer network evaluations than the stochastic samplers used by diffusion models.
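As a sketch of that procedure, here is a fixed-step Euler integrator under the same assumed model interface; the step count of 100 is an arbitrary illustrative choice, and a higher-order or adaptive solver could be substituted.

```python
import torch

@torch.no_grad()
def sample(model, num_samples, dim, steps=100):
    x = torch.randn(num_samples, dim)      # x(0) drawn from the standard Gaussian
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((num_samples,), i * dt)
        x = x + model(t, x) * dt           # Euler step: x += v_theta(t, x) * dt
    return x                               # approximate samples from the target
```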
Let's compare flow matching with other generative models. Flow matching offers several advantages: a simple training objective that directly regresses a vector field, efficient deterministic sampling through ODEs, no need for explicit density estimation, and stable training dynamics. Compared to diffusion models, flow matching samples by solving a deterministic ODE rather than a stochastic SDE, which often allows faster sampling. Unlike classical normalizing flows, flow matching does not require architecturally invertible layers with tractable Jacobian determinants, giving it more modeling flexibility. And compared to GANs, flow matching trains stably without adversarial dynamics. While each model type has its strengths, flow matching strikes an elegant balance of training simplicity, sampling efficiency, and stability, making it a promising approach for many generative modeling tasks.