Step-by-Step Guide to Building a Stable Diffusion Upscaler Using ComfyUI

November 9, 2024

As AI tools continue to improve, image upscalers have become an essential aid for anyone working with images. This tutorial walks you through building your own state-of-the-art image upscaler using Stable Diffusion inside ComfyUI.

In addition to being a lot cheaper to run, this approach offers some key advantages over other online options. Mainly, it gives you complete control over the workflow, so you can tailor the upscaler to your preferred styles and settings. For example, you can use your own checkpoints and LoRAs to steer the upscaler towards specific aesthetics, or add ControlNets to the process to preserve more detail from the starting image.

Before we begin, make sure you have the Ultimate SD Upscale node pack installed. You can install it directly via the ComfyUI manager.

You will also need to download the necessary models to run the workflow:

Add the upscaler model to the “ComfyUI/models/upscale_models” folder.
Add any SDXL checkpoint to the “ComfyUI/models/checkpoints” folder. In this case, we use Juggernaut XL.
Finally, add the add detail LoRA for SDXL to the “ComfyUI/models/loras” folder.
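If you want to double-check that the files landed in the right places, a small script can do it for you. This is only a minimal sketch: the folder names come from the list above, while the function name, the install path and the list of file extensions are our own assumptions and may need adjusting for your setup.

```python
from pathlib import Path

# Sub-folders of ComfyUI/models used by this workflow (from the list above).
MODEL_FOLDERS = ("upscale_models", "checkpoints", "loras")

# Assumption: model files use one of these common extensions.
MODEL_EXTENSIONS = {".safetensors", ".ckpt", ".pt", ".pth"}


def empty_model_folders(comfyui_root):
    """Return the model folders that are missing or contain no model file."""
    models_dir = Path(comfyui_root) / "models"
    empty = []
    for name in MODEL_FOLDERS:
        folder = models_dir / name
        has_model = folder.is_dir() and any(
            p.is_file() and p.suffix.lower() in MODEL_EXTENSIONS
            for p in folder.iterdir()
        )
        if not has_model:
            empty.append(name)
    return empty


if __name__ == "__main__":
    # "ComfyUI" is a placeholder path; point it at your own install.
    for name in empty_model_folders("ComfyUI"):
        print(f"No model file found in ComfyUI/models/{name}")
```

If the script prints nothing, each of the three folders contains at least one model file.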

You can then load the workflow by dropping this image directly inside your ComfyUI and start using it right away.

For those of you who are not familiar with ComfyUI, the number of parameters available might seem scary at first. Thankfully, you only need to know how to use a few of them to get the most out of this workflow. I will go through the key ones here:

Load Checkpoint

This is the Stable Diffusion/SDXL model that will perform the upscaling process. We use Juggernaut XL, but you can use any SDXL model.

Load LoRA

We use the add detail LoRA to create new details during the generation process. Although we suggest keeping this one to get the best results, you can use any SDXL LoRA. You can also add multiple LoRAs by adding another Load LoRA node if you are aiming for something specific.

The strength_model parameter dictates how impactful the LoRA will be. If you set it to 0, the LoRA is ignored completely; the higher the value, the stronger its effect, and at 2 it is already very strong.
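To build some intuition for strength_model, it helps to remember that a LoRA stores a small change (a "delta") to the model's weights, and the strength scales that change before it is added. The snippet below is a simplified illustration of that idea with toy numbers, not ComfyUI's actual code; apply_lora and the values are made up for the example.

```python
def apply_lora(base_weights, lora_delta, strength_model):
    """Return base_weights + strength_model * lora_delta, element by element."""
    return [
        [w + strength_model * d for w, d in zip(weight_row, delta_row)]
        for weight_row, delta_row in zip(base_weights, lora_delta)
    ]


# Toy 2x2 "weights" and LoRA delta, chosen only for illustration.
base = [[1.0, 2.0], [3.0, 4.0]]
delta = [[0.5, -0.5], [0.25, 0.0]]

print(apply_lora(base, delta, 0))  # strength 0: identical to the base weights
print(apply_lora(base, delta, 1))  # strength 1: the delta is added once
print(apply_lora(base, delta, 2))  # strength 2: the delta is added twice over
```

This is why a strength of 0 leaves the checkpoint untouched, while larger values push the output further towards the LoRA's style.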

CLIP Text Encode (Prompt)

These two nodes are the positive and the negative prompts. Although they are not as important for upscaling as for generating images from scratch, a positive prompt that describes the image can sometimes help get better results.

Load Image

As the name suggests, this is where you upload the image you want to upscale.

Ultimate SD Upscale

This node is where the magic happens and where your image will be upscaled. The only parameters we suggest modifying are upscale_by and denoise.

upscale_by is the factor by which the width and height of your image are multiplied. It works well with values between 2 and 4; anything higher and the results can start to get a little weird.
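As a quick sanity check before running the workflow, you can work out the output size for a given factor. The helper names below are ours, and rounding to whole pixels is an assumption; the node may round slightly differently.

```python
def upscaled_size(width, height, upscale_by):
    """Return the (width, height) after scaling both sides by upscale_by."""
    # Assumption: the result is rounded to whole pixels.
    return round(width * upscale_by), round(height * upscale_by)


def pixel_growth(upscale_by):
    """Return how many times more pixels the upscaled image contains."""
    return upscale_by ** 2


for factor in (2, 4):
    new_width, new_height = upscaled_size(1024, 1024, factor)
    print(f"x{factor}: {new_width}x{new_height}, {pixel_growth(factor)}x the pixels")
```

Note that the pixel count grows with the square of the factor, so going from 2 to 4 means four times as many pixels to generate, not twice as many.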

The denoise parameter dictates how much creative freedom you want to give the SDXL model and can take any value between 0 and 1. We usually suggest staying between 0.3 and 0.6 for better results.
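If you prefer to change these two values from a script rather than in the interface, one option is to edit the workflow after exporting it in ComfyUI's API format, a JSON object that maps node ids to a class_type and its inputs. The sketch below assumes the upscale node is exported with the class_type "UltimateSDUpscale" (check your own export to confirm); the node ids, the file name and the function name are made up for the example.

```python
import copy


def set_upscale_params(workflow, upscale_by, denoise,
                       class_type="UltimateSDUpscale"):
    """Return a copy of the workflow with upscale_by and denoise updated."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    nodes = [n for n in patched.values() if n.get("class_type") == class_type]
    if not nodes:
        raise KeyError(f"no node with class_type {class_type!r} found")
    for node in nodes:
        node["inputs"]["upscale_by"] = upscale_by
        node["inputs"]["denoise"] = denoise
    return patched


# A tiny stand-in for an exported workflow; ids and values are hypothetical.
example_workflow = {
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "example.safetensors"}},
    "10": {"class_type": "UltimateSDUpscale",
           "inputs": {"upscale_by": 2.0, "denoise": 0.2}},
}

patched = set_upscale_params(example_workflow, upscale_by=3.0, denoise=0.4)
print(patched["10"]["inputs"])
```

With a real export, you would load the file with json.load, patch it the same way, and save it back before running it.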

This workflow works on GPUs with 8GB of VRAM or higher. If you don’t have the right hardware or you want to skip the installation process, you can upload your own upscaler workflow to ViewComfy cloud and access it via a ViewComfy app, an API or the ComfyUI interface.