# Complete guide to samplers in Stable Diffusion

Dive into the world of Stable Diffusion samplers and unlock the potential of image generation.

## Introduction

As we saw in the article How Stable Diffusion works, when we ask Stable Diffusion to generate an image, the first thing it does is create an image full of noise; the sampling process then removes that noise over the series of steps we have specified. It would be something like **starting with a block of white marble** and hammering away for several days **until you get Michelangelo’s David**.

Several algorithms come into play in this process. The one known as the **sampler** is in charge of obtaining, from the model we are using in Stable Diffusion, a sample with the noise estimated by the noise predictor removed. It subtracts this estimated noise from the image it is cleaning, **polishing the marble in each step**.

This algorithm handles the **how**, while the algorithm known as the **noise scheduler** handles the **how much**.

If the noise reduction were linear, our image would change by the same amount in each step, producing abrupt changes. A noise scheduler with a decreasing slope can instead remove large amounts of noise at first for faster progress, and then remove less and less noise to fine-tune small details in the image.

Following the marble analogy, at the beginning it will probably be more useful to strike hard and **remove large chunks to advance quickly**, while towards the end we will have to **go very slowly to fine-tune the details** and not knock an arm off.
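
The difference between a linear schedule and a decreasing-slope one can be sketched with toy noise levels (the values and the exponential decay below are illustrative, not any specific sampler’s schedule):

```python
def linear_schedule(n_steps, start=10.0, end=0.1):
    """Linear: the same amount of noise is removed in every step."""
    step = (end - start) / (n_steps - 1)
    return [start + i * step for i in range(n_steps)]

def decaying_schedule(n_steps, start=10.0, end=0.1):
    """Decreasing slope: big chunks of noise go first, tiny touches last."""
    ratio = (end / start) ** (1 / (n_steps - 1))
    return [start * ratio ** i for i in range(n_steps)]

linear = linear_schedule(10)
decaying = decaying_schedule(10)
# In the decaying schedule, the first step removes far more noise than the last
drops = [decaying[i] - decaying[i + 1] for i in range(len(decaying) - 1)]
```

Both schedules start and end at the same noise levels; only the path between them differs, and that path is what the noise scheduler controls.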

A key aspect of the process is **convergence**. When a sampling algorithm reaches a point where more steps will not improve the result, the image is said to have converged.

Some algorithms converge very quickly and are **ideal for testing ideas**. Others take longer, or require more steps, but usually offer **more quality**. Others never converge, because they have no limit, and offer more **creativity**.

With this article you will understand the nomenclature as well as the uses of the different methods without going into too much technical detail.

The image used in the demonstrations has been generated with the following parameters:

- **Checkpoint**: `dreamshaper_631BakedVae.safetensors`
- **Positive prompt**: `ultra realistic 8k cg, picture-perfect black sports car, desert wasteland road, car drifting, tires churns up the dry earth beneath creating a magnificent sand dust cloud that billows outwards, nuclear mushroom cloud in the background far away, sunset, masterpiece, professional artwork, ultra high resolution, cinematic lighting, cinematic bloom, natural light`
- **Negative prompt**: `paintings, cartoon, anime, sketches, lowres, sun`
- **Width/Height**: `512`/`512`
- **CFG Scale**: `7`
- **Seed**: `1954306091`

## Samplers available

Depending on the software used you will find a varied list of possibilities. In this case we are going to analyze the samplers available in Automatic1111.

It is difficult to classify them into groups, although there are clearly two main approaches:

- **Probabilistic models**, such as `DDPM`, `DDIM`, `PLMS` and the `DPM` family of models. These generative models produce an output based on the probability distribution estimated by the model. It would be like using a **camera to photograph a landscape**.
- **Numerical approximation methods**, such as `Euler`, `Heun` and `LMS`. In each step, the solution to a particular mathematical problem is sought and estimated bit by bit. In this case it would be like using **paint and a canvas to create the landscape**, adding new details in each step.

### DDPM

`DDPM` (paper) (Denoising Diffusion Probabilistic Models) is one of the first samplers available in Stable Diffusion. It is based on explicit probabilistic models to remove noise from an image. It requires a large number of steps to achieve a decent result.

It is no longer available in Automatic1111.

### DDIM

`DDIM` (paper) (Denoising Diffusion Implicit Models) works similarly to `DDPM`, using in this case implicit probabilistic models. This difference produces better results in a much smaller number of steps, making it a faster sampler with little loss of quality.

As can be seen in the cloud, better results are obtained with a high number of steps (100+). There are better alternatives as we will see below.
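
The deterministic DDIM update is simple enough to sketch with toy scalars in place of image tensors (the alpha values are the cumulative signal rates of the forward diffusion; the concrete numbers below are illustrative):

```python
import math

def ddim_step(x_t, eps, alpha_t, alpha_prev):
    """Deterministic DDIM update (eta = 0), with toy scalars instead of
    image tensors. eps is the model's noise prediction."""
    # recover the model's current estimate of the clean image
    pred_x0 = (x_t - math.sqrt(1 - alpha_t) * eps) / math.sqrt(alpha_t)
    # jump directly to the noise level of the earlier timestep
    return math.sqrt(alpha_prev) * pred_x0 + math.sqrt(1 - alpha_prev) * eps

# Noise a toy "clean image" x0 with known noise, then step straight back
x0, eps, alpha_t = 2.0, 0.5, 0.5
x_t = math.sqrt(alpha_t) * x0 + math.sqrt(1 - alpha_t) * eps
recovered = ddim_step(x_t, eps, alpha_t, 1.0)  # alpha_prev = 1: fully clean
```

Because the update is deterministic, large jumps between timesteps are possible, which is where DDIM gets its speed over DDPM.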

### PLMS

`PLMS` (paper) (Pseudo Linear Multi-Step) is an improvement over `DDIM`. With a 50-step process it is possible to achieve higher quality than a 1000-step `DDIM` process. Fascinating, isn’t it? Well, read on, this is nothing.

With `PLMS` we cannot use very few steps, because it is not able to clean up the noise, but between 50 and 100 steps it already provides good results.

### Euler

`Euler` is possibly the simplest method. Based on ordinary differential equations (ODEs), this numerical method removes noise linearly in each step. Due to its simplicity it may not be as accurate as we would like, but it is one of the fastest.

`Euler` is so fast that it can deliver good results in as few as 10 steps. Its sweet spot is between 30 and 50 steps.
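
A single Euler step over the noise levels can be sketched as follows (the toy `denoise` function stands in for the full noise-predicting model; all values are illustrative):

```python
def euler_step(x, sigma, sigma_next, denoise):
    """One Euler step: estimate the slope at the current noise level
    and move linearly to the next one."""
    d = (x - denoise(x, sigma)) / sigma  # estimated slope dx/dsigma
    return x + d * (sigma_next - sigma)

# A toy "model" that always predicts a fully clean value of 0
denoise = lambda x, sigma: 0.0

x = 8.0
sigmas = [10.0, 5.0, 2.0, 1.0, 0.1]
for sigma, sigma_next in zip(sigmas, sigmas[1:]):
    x = euler_step(x, sigma, sigma_next, denoise)
```

With this toy model each step simply scales the sample by `sigma_next / sigma`, so the noise shrinks at every step; one model call per step is what makes Euler so fast.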

### Heun

`Heun` is the perfectionist brother of `Euler`. While `Euler` only performs a linear approximation, `Heun` performs two tasks in each step, making it a second-order sampler: it first uses a linear approximation as a prediction and then a nonlinear approximation as a correction. This improvement in accuracy offers higher quality in exchange for a small drawback: it takes twice as long.

Karl Heun developed this numerical method more than a century ago!

At 10 steps it still shows some noise, but that disappears in a few more. As you can see, it offers high quality at 30 steps, although at 50 it offers a little more detail. At 100 steps it hardly changes the image, and it is not worth growing old waiting for the result.
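
Heun’s predict-then-correct idea can be sketched with toy scalars (the `denoise` function stands in for the noise-predicting model; everything here is illustrative):

```python
def heun_step(x, sigma, sigma_next, denoise):
    """One Heun step: an Euler prediction followed by a correction that
    averages the slopes at both noise levels (second order)."""
    d = (x - denoise(x, sigma)) / sigma    # slope at the current level
    x_pred = x + d * (sigma_next - sigma)  # Euler predictor
    if sigma_next == 0:
        return x_pred                      # nothing left to correct against
    d_next = (x_pred - denoise(x_pred, sigma_next)) / sigma_next
    return x + 0.5 * (d + d_next) * (sigma_next - sigma)  # corrector

# Two model calls per step is why Heun takes roughly twice as long as Euler
denoise = lambda x, sigma: 0.0  # toy predictor of a fully clean value
x = heun_step(8.0, 10.0, 5.0, denoise)
```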

### LMS

`LMS`, or the Linear Multi-Step method, is the cousin of `PLMS` that uses a numerical rather than a probabilistic approach (`PLMS` - `P` = `LMS`).

Moreover, unlike `Euler` and `Heun`, it uses information from previous steps to reduce noise in each new step. It offers better accuracy in exchange for higher computational requirements (it is slower).

Using few steps, we have a sampler capable of generating psychedelic images imitating the effect of drugs. Jokes aside, it is a sampler that is not worth it, because despite being fast it needs around 100 steps to offer something decent.
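
The idea of reusing the previous step’s information can be sketched as a two-step update (an Adams-Bashforth-style sketch with illustrative coefficients, not the exact LMS formula used in practice):

```python
def lms2_step(x, sigma, sigma_next, denoise, prev_d=None):
    """Blend the current slope with the one saved from the previous step;
    the very first step falls back to plain Euler."""
    d = (x - denoise(x, sigma)) / sigma
    if prev_d is None:
        x_next = x + d * (sigma_next - sigma)  # plain Euler
    else:
        x_next = x + (1.5 * d - 0.5 * prev_d) * (sigma_next - sigma)
    return x_next, d  # hand the slope back so the next step can reuse it

denoise = lambda x, sigma: 0.5 * x  # toy predictor: half the signal is clean
x, prev_d = 8.0, None
for sigma, sigma_next in zip([10.0, 5.0, 2.0, 1.0], [5.0, 2.0, 1.0, 0.1]):
    x, prev_d = lms2_step(x, sigma, sigma_next, denoise, prev_d)
```

Storing the previous slope costs memory and bookkeeping rather than extra model calls, which is why multi-step methods gain accuracy without doubling the work per step the way Heun does.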

### Family of DPM models

`DPM` (Diffusion Probabilistic Models) are probabilistic models that offer improvements over `DDPM`, hence the similar name. There is no implementation available in Automatic1111 either, because it has improved versions, as we will see below.

`DPM2` is an improvement over `DPM`. You could say that it is version 2.

With 10 steps you already get impressive quality (don’t try 5 steps, you won’t like the result). Around 30 to 50 steps is the ideal range. More steps are usually not worth it.

On the other hand we have `DPM++`, which is also an improvement over `DPM`. It uses a hybrid approach, combining deterministic and probabilistic methods for sampling and subsequent noise reduction. There is no basic implementation of this sampler in Automatic1111, but it is combined with other methods. We will see this in the next section.

Thus, two improved versions were born from `DPM`: `DPM2` and `DPM++`.

#### Faster DPM models (`DPM-Solver` and `UniPC`)

Diffusion Probabilistic Models (`DPM`) are, as the name suggests, **probabilistic**. In each step, **equations are not solved by deterministic numerical methods** as in the case of `Euler`, `Heun` or `LMS`; instead, the problem is approached by approximation, trying to sample as accurately as possible.

Within these models there is a piece called the **solver**, an algorithm that plays an important role in calculating and approximating a probability distribution during sampling. This is where a new technique known as `DPM-Solver` comes in, shortening the duration of each step.

In other words, models like `DPM fast` (paper) or `DPM++ 2S`/`DPM++ 2M` (paper) implement a faster solver that saves time in the sampling process.

It is fast (though not that fast either), but with few steps it is unusable. Interestingly, it offers a different result from the rest of the samplers, and the cinematic effect seems more pronounced.

In the case of `DPM++ 2S`/`DPM++ 2M`, the number `2` means that they are second order. That is, they use both a **predictor** and a **corrector** to approximate the result accurately.

The `S` stands for `Single step`: a **single calculation** is performed in each step, so it is faster.

In contrast, the letter `M` stands for `Multi step`, an approach in which **multiple calculations** are performed in each step, taking into account information obtained in previous steps. This translates into more accurate, higher-quality convergence at the cost of taking longer.

In both modalities, this solver is faster than the default `DPM` model solver.

There is no Automatic1111 implementation of plain `DPM++ 2S`, only its `A`, `Karras` and `SDE` variants (more on that later). So let’s see some samples of `DPM++ 2M`.

Little to say about this all-rounder sampler. It offers impressive results in 30 steps, and if you give it some more time it can be squeezed even further.

As for `UniPC` (paper), it is a solver that consists of two parts: a unified predictor (`UniP`) and a unified corrector (`UniC`). This method can be applied to any `DPM` model and focuses on delivering the **maximum possible sampling quality in the fewest possible steps**. Remember how `PLMS` brought down to 50 steps what `DDIM` did in 1000? Well, in some cases `UniPC` is able to generate quality images in as few as 5 or 10 steps.

Thus, `UniPC` can be integrated into both `Single step` and `Multi step` `DPM` models, making it comparable to `DPM++ 2S` or `DPM++ 2M`, with the particularity of offering better results when the number of steps is very low.

The `UniC` corrector can even be integrated into these sampling algorithms on its own to achieve higher efficiency (e.g. `DPM++ 2S` + `UniC`).

In this example, 10 steps is not enough to generate an image without noise, but with 15 or 20 you will get there. At 30 steps it is magnificent and there is no need to go any further, although there is still some room for improvement.

#### More accurate DPM models (`Adaptive`)

The `DPM adaptive` model is an extension of the `DPM` model that **adapts the step size** according to the difficulty of the problem it is trying to solve.

In other words, it is as if the specified number of steps were ignored and the algorithm were free to sample as efficiently as possible until the best possible convergence is achieved. It generates higher-quality images at the expense of taking as long as it needs (it is the slowest sampler).
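
The adaptive idea can be sketched with a toy step-doubling controller (a generic adaptive ODE scheme for illustration, not the actual `DPM adaptive` algorithm): compare one full step against two half steps, shrink the step when they disagree, and grow it when they agree.

```python
def solve_adaptive(f, x, t, t_end, h=0.5, tol=1e-4):
    """Toy adaptive integration of dx/dt = f(t, x) via step doubling."""
    while t < t_end:
        h = min(h, t_end - t)
        full = x + h * f(t, x)                          # one Euler step of size h
        half = x + (h / 2) * f(t, x)
        two_half = half + (h / 2) * f(t + h / 2, half)  # two Euler steps of h/2
        err = abs(two_half - full)
        if err > tol and h > 1e-6:
            h /= 2                                      # too inaccurate: retry smaller
            continue
        x, t = two_half, t + h                          # accept the finer estimate
        if err < tol / 4:
            h *= 2                                      # very accurate: speed up
    return x

# Toy decay toward zero, loosely analogous to noise being removed
x_end = solve_adaptive(lambda t, x: -x, 1.0, 0.0, 1.0)
```

Note that the caller never specifies a number of steps; the controller takes as many as the tolerance demands, which mirrors why `DPM adaptive` effectively ignores the step count you set.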

In this case it took three or four times as long as the other samplers, but the result is amazing. The image composition is different from all the other samplers and is closest to `DPM fast`.

## Other features

Only one sampling algorithm can be chosen: either `Euler` or `DPM` can be used, but not both at the same time. Variants and extra features, on the other hand, can be combined.

For example, we can use the sampler named `DPM2 A Karras`. Let’s see what these new values mean.

### Ancestral variants

When a sampler contains the letter `A`, it usually means that it belongs to the category of **ancestral** variants. These variants add, in each new step, random variables derived from previous steps. It is as if, after cleaning up the noise in one step, some of the previous noise were added back.

Samplers with this feature **never converge because of the random noise added in each step**. If there is always noise to remove, you can always go one step further.

This makes them more creative samplers. An extra step does not necessarily increase the quality, but it gives another, similar result.

If you try to reproduce an image generated with Stable Diffusion and you don’t succeed even though you are using the same seed and the same parameters, it may be because you are using an ancestral sampler. This is normal! The noise that is re-added in each step is random and different implementations or versions of the sampler will almost certainly generate different results.

Some examples of ancestral samplers are `Euler A`, `DPM2 A` or `DPM++ 2S A`.

`Euler A` gives great results in 25-30 steps while also being very fast. At 50 steps the quality is worse, and then at 100 steps it is better again. It is a lottery. Moreover, you can see how the image composition keeps changing due to the random noise introduced in each step. Far from being a drawback, this is perhaps its greatest advantage.
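
The ancestral trick can be sketched on top of an Euler-style step (this follows the general shape of the k-diffusion ancestral update, with toy scalars and an illustrative `denoise` stand-in for the model):

```python
import math
import random

def euler_ancestral_step(x, sigma, sigma_next, denoise, rng):
    """Denoise down toward a lower level, then re-inject fresh random
    noise. Because new noise is added every step, the process never
    converges to a single result."""
    # split the target level: how much stays deterministic vs. re-injected
    sigma_up = min(sigma_next,
                   math.sqrt(sigma_next**2 * (sigma**2 - sigma_next**2) / sigma**2))
    sigma_down = math.sqrt(sigma_next**2 - sigma_up**2)
    d = (x - denoise(x, sigma)) / sigma        # Euler-style slope
    x = x + d * (sigma_down - sigma)           # deterministic part
    return x + rng.gauss(0.0, 1.0) * sigma_up  # the "ancestral" randomness

denoise = lambda x, sigma: 0.0  # toy clean-image predictor
a = euler_ancestral_step(8.0, 10.0, 5.0, denoise, random.Random(0))
b = euler_ancestral_step(8.0, 10.0, 5.0, denoise, random.Random(1))
```

Two different random streams produce two different results from identical inputs, which is exactly why reproducing an image made with an ancestral sampler is so hard across implementations.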

### Karras Variants

Variants with the word `Karras` (or `K`) (paper) refer to work led by Nvidia engineer Tero Karras. This work introduces a series of improvements in some samplers, achieving **improved efficiency** in both the quality of the output and the computation required for sampling.

Some samplers using these changes are `LMS Karras`, `DPM2 Karras`, `DPM2 A Karras`, `DPM++ 2S A Karras`, `DPM++ 2M Karras` and `DPM++ SDE Karras`.

Like `DPM++ 2M`, this sampler offers very good results between 30 and 50 steps, but the `Karras` version has the advantage of producing better results in a reduced number of steps, as can be seen in the following example:

If you use a high number of steps you will have a hard time seeing the difference.
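
The Karras noise schedule itself is simple enough to sketch: noise levels are interpolated in `sigma**(1/rho)` space with `rho = 7`, concentrating steps at low noise where fine details are refined (the `sigma_min`/`sigma_max` values below are illustrative):

```python
def karras_sigmas(n_steps, sigma_min=0.1, sigma_max=10.0, rho=7.0):
    """Karras et al. schedule: interpolate linearly in sigma**(1/rho)
    space, then raise back to the rho-th power."""
    min_inv = sigma_min ** (1 / rho)
    max_inv = sigma_max ** (1 / rho)
    sigmas = []
    for i in range(n_steps):
        t = i / (n_steps - 1)
        sigmas.append((max_inv + t * (min_inv - max_inv)) ** rho)
    return sigmas

sigmas = karras_sigmas(10)
```

The resulting schedule is strongly front-loaded: early steps drop big chunks of noise, later steps barely move, which is why the `Karras` variants hold up well with few steps.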

### Stochastic variants

The `SDE` (paper) variants use stochastic differential equations. Without going into further detail, this type of differential equation makes it possible to model the noise in a more sophisticated and accurate way, using information from previous steps, which in principle generates **higher-quality images in exchange for being slower**. Being stochastic, these samplers never converge, so a higher number of steps does not bring higher quality but rather more variations, just like ancestral samplers.

At the date of publication of this article we have `DPM++ SDE`, `DPM++ 2M SDE`, `DPM++ SDE Karras` and `DPM++ 2M SDE Karras`.

Stochastic samplers are slow but offer incredible results even with 10 steps. Their results are also more varied and creative. As they never converge they are an alternative to ancestral samplers.
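
The stochastic ingredient can be sketched with a generic Euler-Maruyama step (a textbook SDE discretization for illustration, not the exact `DPM++ SDE` update):

```python
import math
import random

def euler_maruyama_step(x, t, dt, drift, diffusion, rng):
    """One Euler-Maruyama step for dx = drift*dt + diffusion*dW: the drift
    is the deterministic part an ODE sampler would use, while the Brownian
    increment dW injects fresh randomness at every step."""
    dw = rng.gauss(0.0, math.sqrt(abs(dt)))
    return x + drift(x, t) * dt + diffusion(t) * dw

drift = lambda x, t: -x    # toy decay toward the clean image
diffusion = lambda t: 0.5  # toy constant noise strength
rng = random.Random(0)
x = 8.0
for step in range(10):
    x = euler_maruyama_step(x, step * 0.1, 0.1, drift, diffusion, rng)
```

As with ancestral samplers, the random term means two runs only match if they share the exact same random stream, so more steps yield variations rather than convergence.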

## What is the best sampler in Stable Diffusion?

Is a Ferrari or a Jeep better? Well it depends on whether you’re going off-road, doesn’t it?

Depending on what you need it’s better to use one type of sampler or another. With the above information I hope it will be easy to choose, but here are some hints.

### Image quality

If you are looking for quality, it is a good idea to pursue convergence; that is the point at which you get the highest quality. If you don’t want to sacrifice too much generation speed, forget about samplers like `DDIM` that need hundreds of steps to converge. `Heun` and `LMS Karras` offer good results, but it is better to use `DPM++ 2M` or its `Karras` version.

You can also try `DPM adaptive` if you are not in a hurry, or `UniPC` if you are.

With these samplers mentioned above you will get good results in 20-30 steps although it doesn’t hurt to try a few extra steps.

### Generation speed

If you are testing prompts, you don’t want to spend much time waiting for results. In this case, where you are not looking for maximum quality but to test changes quickly, I recommend using `DPM++ 2M` or `UniPC` with a small number of steps.

With just 10-15 steps you will get a very decent image.

If you don’t care about reproducibility you also have `Euler A`, a fast, good-quality ancestral sampler. My favorite sampler!

### Creativity and flexibility

This section is reserved exclusively for ancestral and stochastic samplers. They don’t offer bad quality nor are they slow, they are just different.

The problem or advantage (depending on how you look at it) of these samplers is that if you have an image generated in 40 steps, generating it in 50 steps can make the image better or worse. You will have to keep testing. And this lottery makes these samplers more creative, since you can always change the number of steps to obtain small variations.

Of course, `Euler A` and `DPM++ SDE Karras` stand out here. Try generating images in 15 steps, 20 steps, 25 steps… and see how the result changes.