AI tools that generate detailed images from plain text have been gaining popularity recently. One of the most powerful and efficient of these is Stable Diffusion. However, Stable Diffusion has demanding and specific system requirements that your device must meet before it can run.
Stable Diffusion Web UI provides an easy alternative for people with outdated or incompatible devices.
Stable Diffusion web UI is a free and open-source browser interface based on the Gradio library for generating images using diffusion models.
This powerful text-to-image and image-to-image generator helps you generate high-quality images in a single click. It is also known as AUTOMATIC1111 web UI or A1111. The interface is intuitive and easy to use, with numerous features like outpainting, inpainting, color sketch, prompt matrix, and upscaling.
This article covers the requirements, dependencies, and features available, how to use them, and step-by-step instructions on how to download and install Stable Diffusion Web UI on Windows, Linux, and Apple Silicon. Let’s start with understanding the dependencies for Stable Diffusion Web UI:
Table Of Contents 👉
- Some Required Dependencies For Stable Diffusion Web UI
- How To Download And Install Stable Diffusion Web UI?
- Features Of Stable Diffusion WebUI
- 1. Text2Img
- 2. Img2Img
- 3. Extras
- 4. PNG Info
- 5. Checkpoint Merger
- 6. Train
- 7. Settings
- 8. Extensions
- 9. Some More Features Of Stable Diffusion Web UI
Some Required Dependencies For Stable Diffusion Web UI
Dependencies for Python 3.10.6 and Git
- On Windows: download and run the installers for Python 3.10.6 and Git from their official websites
- On Linux (Debian-based): sudo apt install wget git python3 python3-venv
- On Linux (Red Hat-based): sudo dnf install wget git python3
- On Linux (Arch-based): sudo pacman -S wget git python3
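Before moving on, it can help to confirm that the tools from the Linux commands above are actually on your PATH. The following is a minimal sketch, not part of the web UI itself; the function name `missing_tools` and the tool list are assumptions made for illustration (on Windows the interpreter is usually called `python` rather than `python3`).

```python
import shutil

# Tools named in the Linux install commands above (assumed list for this sketch).
REQUIRED_TOOLS = ("git", "python3", "wget")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

If the script reports a missing tool, install it with the command for your distribution and run the check again.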
There are two ways to obtain code from this repository:
1. Using git: git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
The method using git is preferred because you can update by simply running git pull. Commands can be used from the command line window that opens after right-clicking in Explorer and selecting “Git Bash here”.
2. Using the “Code” (green button) -> “Download ZIP” option on the main page of the repo.
You still need to install git even if you choose this method. To update, you will have to download the ZIP again and replace the files.
How To Download And Install Stable Diffusion Web UI?
Stable Diffusion Web UI runs on Windows, Linux, and Mac (Apple Silicon), and can also be used through Google Colab.
On Windows 10/11 with NVidia-GPUs using the release package
- First, download sd.webui.zip from v1.0.0-pre and extract its contents.
- Then run update.bat.
- Now run run.bat.
Automatic1111 Stable Diffusion Web UI Installation On Windows
System requirements on Windows:
- Windows 10 or higher
- A discrete Nvidia video card (GPU) with 4 GB VRAM or more
- Integrated GPU is not going to work on Windows.
If your system doesn’t meet the requirements, you can use a cloud service such as Google Colab, or a Mac with Apple Silicon (M1/M2).
How To Install Stable Diffusion WebUI On Windows?
1. Install Python version 3.10.6
Using Python 3.11 or newer is not recommended.
You can install Python either from the Microsoft Store or using the 64-bit Windows Installer available on the Python website.
It is easier to install from the Microsoft Store. Follow the steps below to install Python:
- Go to “Control Panel” and then click on “Add or remove programs.” Remove all previously installed Python versions from your computer, if any.
- Open the Microsoft Store, search for Python 3.10, and install it.
- Now press the Windows key on your keyboard and type “cmd.”
- Open the “Command Prompt” app to open a terminal.
- Type “python” and press enter.
- It will respond with Python 3.10, indicating that Python has been installed successfully.
You cannot go to the next step until Python is successfully installed. If Python 3.10 is not running, then try restarting the PC, reinstalling Python, or installing from the Python website instead of the Microsoft Store.
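If you prefer a clearer answer than reading the interpreter banner, the version check can be sketched in a few lines of Python. This is only an illustration; the helper name `is_supported_python` is an assumption, and the check simply encodes this guide's recommendation of the 3.10 series.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True when the interpreter belongs to the Python 3.10 series."""
    return (version_info[0], version_info[1]) == (3, 10)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {found} matches the 3.10 series used in this guide.")
    else:
        print(f"Python {found} found; this guide expects Python 3.10.x.")
```

Save it as a file such as check_python.py (a hypothetical name) and run it with “python check_python.py” from the Command Prompt.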
2. Install Git
Git is a code repository management system. It is required to install and update AUTOMATIC1111.
- Go to the Git website and download the Windows installer.
- Now open the installer and click on “Install.”
3. Clone Web UI
- After pressing the Windows key on your keyboard, type “cmd” in the search box.
- Now click on the “Command Prompt” app.
- Type “cd %userprofile%” and press enter.
- The prompt will show “C:\Users\YOUR_USER_NAME>”
- Now enter the following command to clone the AUTOMATIC1111 repository: git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
- A folder named “stable-diffusion-webui” will appear in your home directory.
4. Download the model file
- Open File Explorer.
- Type “%userprofile%\stable-diffusion-webui” into the address bar and press enter to open the recently created folder.
- Open the “models” folder.
- Then open the “Stable-diffusion” folder.
- You will see a file named “Put Stable Diffusion checkpoints here.txt”.
- Now download the Stable Diffusion v1.5 model checkpoint file.
- Put the downloaded file in this folder, next to “Put Stable Diffusion checkpoints here.txt”.
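To double-check that the checkpoint landed in the right place, you could list the model files in that folder. The sketch below assumes the default folder layout described above; the function name `find_checkpoints` is hypothetical, and the `.safetensors` suffix is included as an assumption because many models are distributed in that format.

```python
from pathlib import Path

# File suffixes treated as model checkpoints in this sketch (an assumption).
CHECKPOINT_SUFFIXES = {".ckpt", ".safetensors"}


def find_checkpoints(webui_dir):
    """Return the sorted names of model files in models/Stable-diffusion."""
    model_dir = Path(webui_dir) / "models" / "Stable-diffusion"
    if not model_dir.is_dir():
        return []
    return sorted(
        path.name
        for path in model_dir.iterdir()
        if path.is_file() and path.suffix.lower() in CHECKPOINT_SUFFIXES
    )


if __name__ == "__main__":
    found = find_checkpoints(Path.home() / "stable-diffusion-webui")
    print(found or "No checkpoint files found yet.")
```

The placeholder text file is ignored because it does not end in one of the checkpoint suffixes.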
5. Run the Web UI
- Now again, navigate to the “stable-diffusion-webui” folder.
- Double-click on the file named “webui-user.bat” to run and complete the installation.
- After the installation, the message “Running on local URL: http://127.0.0.1:7860” will appear.
- Copy the URL and paste it into your web browser.
- The Stable Diffusion AUTOMATIC1111 webui will open up.
- Now you can use this window to generate images.
To close Stable Diffusion Web UI, simply close the black Command Prompt window.
To reopen it, double-click on the file named “webui-user.bat”.
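If the browser page does not load, one quick thing to check is whether anything is listening on the local address from the startup message. The following sketch uses only the Python standard library; the function name `is_port_open` is an assumption, and the defaults mirror the 127.0.0.1:7860 address shown above.

```python
import socket


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("The web UI appears to be running on http://127.0.0.1:7860")
    else:
        print("Nothing is listening on port 7860; is webui-user.bat running?")
```

This only confirms that a program is listening on the port, not that it is the web UI specifically.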
How To Install Stable Diffusion WebUI On Linux?
GPU Server Environment
- CPU: Dual 12-Core E5-2697v2
- GPU: Nvidia RTX 3060 Ti 8GB
- RAM: 128GB RAM
- 240GB SSD + 2TB SSD
- OS: Ubuntu 20.04 LTS, Kernel 5.4.0
The first step is to install the dependencies:
- Debian-based:
sudo apt install wget git python3 python3-venv
- Red Hat-based:
sudo dnf install wget git python3
- Arch-based:
sudo pacman -S wget git python3
Type this command in the directory where you want the WebUI to be installed:
bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)
To add a model to the AUTOMATIC1111 GUI, first obtain a checkpoint (.ckpt) file and download it into the models/Stable-diffusion directory. You can find various models on platforms like Hugging Face or Civitai. You can use the ‘wget’ command to download the desired model.
Now run webui.sh.
administrator@ubuntu:~/stable-diffusion/stable-diffusion-webui$ ./webui.sh --xformers --share
After running webui.sh, it will install all the dependencies. Then a link should pop up, for example, http://127.0.0.1:7860
How To Install Stable Diffusion WebUI On Mac?
- First, install homebrew from this website
- Now add Homebrew to your PATH by following the instructions given.
- Then open the terminal window and run: “brew install cmake protobuf rust python@3.10 git wget”
- Now run git clone to clone the web UI repository: “git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui”
- Then add the models or checkpoints you want to use into “stable-diffusion-webui/models/Stable-diffusion.”
- Enter: “cd stable-diffusion-webui”
- Enter: “./webui.sh” to run the web UI.
- Now a Python virtual environment will be created and activated using venv.
- Missing dependencies will automatically download and install.
To reopen the web UI, run “./webui.sh” again.
To update, run git pull before running ./webui.sh.
Features Of Stable Diffusion WebUI
Stable Diffusion WebUI generates high-quality images quickly because it runs on the GPU. It also protects your privacy.
Its developers state that they do not collect or use any personal information and do not store your text or images. The generated images come under the CC0 1.0 Universal Public Domain Dedication, and the project is fully open source.
There are 8 tabs: txt2img, img2img, Extras, PNG Info, Checkpoint Merger, Train, Settings, and Extensions.
Features available in each tab and their sub-features have been covered below.
1. Text2Img
As the name suggests, this tab contains all the options for converting the input text into an image. Here you can enter the prompt that you want along with a negative prompt.
Negative prompts allow you to specifically add the details or features of the object that you want to be excluded from the output.
In simple terms, if a “prompt” tells your AI what to include in your output, a “negative prompt” will indicate what NOT to include in the output.
For example, if a negative prompt of ‘blurry’ is given, the AI is steered away from blurry-looking results during generation, so a sharper and more detailed image is produced.
Various features available under the text2img tab are:
A. Sampling Steps
Stable Diffusion works by first generating an image full of noise. This image is then gradually denoised to get the final output. Steps refers to the number of denoising steps taken to reach the final output.
The quality of the output image generally improves as the number of steps increases, but only up to a point, after which extra steps bring little visible benefit. Around 20 to 25 steps is usually sufficient to generate a good image.
Earlier, people used 100-150 steps with samplers like LMS. With faster samplers like DDIM and DPM-Solver++, such large step counts are no longer required and only waste time and GPU power.
- For faster results while testing a prompt: 10-15 steps
- After you find the right prompt: 25 steps
- When creating an image with detailed texture such as a face, fur, or an animal: 40 steps
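The step recommendations above can be captured as a small lookup, which is handy if you script your experiments. This is only a sketch of this article's rules of thumb; the preset names and the `recommended_steps` function are hypothetical and do not exist in the web UI.

```python
# Step counts taken from the recommendations above; the preset names are made up.
STEP_PRESETS = {
    "test": 15,      # quick prompt experiments (upper end of 10-15 steps)
    "final": 25,     # once the prompt is settled
    "detailed": 40,  # faces, fur, and other fine textures
}


def recommended_steps(purpose):
    """Return the suggested number of sampling steps for a purpose."""
    try:
        return STEP_PRESETS[purpose]
    except KeyError:
        valid = ", ".join(sorted(STEP_PRESETS))
        raise ValueError(f"unknown purpose {purpose!r}; expected one of: {valid}") from None


if __name__ == "__main__":
    for name in STEP_PRESETS:
        print(f"{name}: {recommended_steps(name)} steps")
```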
B. Sampling Method
A sampler is the algorithm that carries out the denoising process.
At every step, the model predicts the noise present in the current image, guided by the text prompt.
The sampler then decides how much of that noise to remove and how to update the image, and repeats this until the final output image is produced.
The different sampling methods available are Euler A, Euler, LMS, Heun, DPM2, DPM2 A, DPM Fast, DPM Adaptive, LMS Karras, DPM2 Karras, DPM2 A Karras, DDIM, and PLMS.
Euler A, DDIM, and DPM Solver++ are the most popularly used samplers which are super fast and generate high-quality results in 15-25 steps only.
Different samplers produce visibly different images from the same prompt. Euler A, for example, produces a dreamy look with smoother colors and less defined edges.
C. CFG Guidance Scale
CFG stands for classifier-free guidance. This parameter tells the AI how strictly it should follow the prompt versus how much freedom it has to be creative.
In the web UI, the CFG scale slider ranges from 1 to 30; values below 1, such as -1, are not available on the slider.
A low CFG gives the AI more freedom to be creative, and a high CFG makes it follow the prompt more strictly.
The recommended CFG range is 5-16. Start from 7 and increase the number gradually until you get a satisfactory result.
The default CFG is 7.
- 1 CFG: The prompt has very little influence on the result, so use this only when you want your prompt to be largely ignored.
- CFG lower than 5 is usually not recommended as the results look like AI hallucinations.
- CFG higher than 16 is also not recommended, as it gives results with ugly artifacts.
- 2-6: Results might be creative but distorted, useful for short prompts only. The AI might not follow the prompt.
- 7-10: This is the recommended range as it balances creativity and sticking to the prompt.
- 10-15: You should use this range only when you have a detailed and clear prompt about your generated output.
- 16-20: This should be used when the prompt has very good details.
- >20: CFG higher than 20 is not at all usable.
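To see why higher values push the result toward the prompt, it helps to look at the usual classifier-free guidance formula: the model makes one noise prediction without the prompt and one with it, and the scale stretches the difference between the two. The sketch below applies that formula to plain lists of numbers; the function name `apply_cfg` and the sample values are assumptions for illustration, not the web UI's actual code.

```python
def apply_cfg(uncond, cond, scale):
    """Blend unconditional and prompt-conditioned predictions.

    Implements: guided = uncond + scale * (cond - uncond), element by element.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    uncond = [0.0, 1.0]  # prediction without the prompt (made-up numbers)
    cond = [1.0, 3.0]    # prediction with the prompt (made-up numbers)
    for scale in (1.0, 7.0, 15.0):
        print(scale, apply_cfg(uncond, cond, scale))
```

At a scale of 1 the result is simply the prompt-conditioned prediction, while larger values exaggerate the prompt's effect, which fits the observation that very high values produce artifacts.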
D. Seed
As mentioned above, Stable Diffusion starts from an image full of noise. The seed is the number used to generate this random noise, and the noise pattern in turn determines the final result.
It is recommended to use seed -1 to explore, since -1 picks a new random seed for every generation.
Using the same seed, prompt, and settings will generate the same result even if you regenerate the image several times.
Using a different seed will generate a different image even if the same prompt is used.
Using a different seed might look more productive on the surface, but keeping the same seed while changing the prompt lets you control specific features of the image.
Seeds are useful for changing the style of the image and for understanding which words affect the results more.
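The link between a seed and repeatable results is easy to demonstrate with any seeded random number generator. The sketch below uses Python's standard `random` module rather than the web UI's own noise generator, so treat it purely as an illustration; the function name `make_noise` is hypothetical.

```python
import random


def make_noise(seed, size=4):
    """Return `size` Gaussian noise values generated from `seed`."""
    rng = random.Random(seed)  # independent generator, fully determined by the seed
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


if __name__ == "__main__":
    print(make_noise(42) == make_noise(42))  # same seed, same noise
    print(make_noise(42) == make_noise(43))  # different seed, different noise
```

Because the starting noise is identical for a given seed, the same seed, prompt, and settings lead to the same image.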
E. Restore Faces
Stable Diffusion usually generates images that have somewhat weird faces. This feature helps you to correct the faces (if any) in the image. ‘Restore faces’ helps to correct the minute details in the face to make it look better.
F. Hires Fix
This feature lets you create larger images without distorting them. By default, it is used at a 2X scale. It contains other options like Denoising strength and Upscaling.
Denoising strength decides how much the image will be modified during the high-resolution pass. If set to 0, the image will not change. The upscaler setting selects how the image is enlarged. Hires steps should be at least half the number of your sampling steps.
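The two numbers mentioned for Hires Fix, the 2X default scale and the "at least half the sampling steps" rule of thumb, can be worked out with a short helper. This is a sketch of the arithmetic only; the function name `hires_plan` and its return keys are assumptions for illustration.

```python
import math


def hires_plan(width, height, sampling_steps, scale=2.0):
    """Return the upscaled size and the minimum suggested hires steps."""
    return {
        "width": int(width * scale),
        "height": int(height * scale),
        # "At least half" of the sampling steps, rounded up.
        "min_hires_steps": math.ceil(sampling_steps / 2),
    }


if __name__ == "__main__":
    print(hires_plan(512, 512, sampling_steps=25))
```

For a 512 x 512 image generated with 25 steps, this gives a 1024 x 1024 target and at least 13 hires steps.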
G. Tiling
You can generate repeating patterns on a grid using tiling.
2. Img2Img
Img2img is used when you want to modify an existing image. This tab gives multiple options for converting images into more detailed and modified images.
As we know, the first step in diffusion models is to start with an image full of noise. This noise is usually determined by the seed number.
With the Img2img feature, however, you can give the AI a starting image as the first step instead of using the noise generated from the seed number. Otherwise, it works similarly to txt2img.
A. Image Size
The recommended image size is 512 x 512, as the model was trained on images with these dimensions.
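If you choose a size other than 512 x 512, keep in mind the note in the feature list at the end of this article that the resolution must be a multiple of 8. A small helper can round a requested dimension down to a valid value; the function name `snap_to_multiple` is hypothetical and this is only a sketch.

```python
def snap_to_multiple(value, multiple=8):
    """Round `value` down to a multiple of `multiple`, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


if __name__ == "__main__":
    for requested in (512, 500, 770):
        print(requested, "->", snap_to_multiple(requested))
```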
B. Batch Size
This number determines the number of output images generated every time. The recommended batch size is 4 or 8.
C. Aspect Ratio
Using the width and height settings, you can control the aspect ratio of the output image and change it to whatever you want.
D. Denoising Strength
This number decides how much the starting image will be modified. If set to 0, the image will not change; higher values move the result further away from the original.
E. Sketch
This feature allows you to draw a rough base image instead of uploading one. This rough drawing is then used as the image to modify.
F. Inpaint
Inpainting is the most used feature, as it allows you to modify only the portions that you want. If you have already generated an image and only want a few small modifications, click on the “Send to inpaint” option. Then select the area that you want to modify using the paintbrush tool. Now you can adjust the other parameters and hit the generate button to regenerate only the selected part.
G. Interrogate Clip
You upload an image, and this feature tells you the prompt that would have generated the image.
3. Extras
Upscaling: You can upscale your image here by entering a “Scale by” factor or a “Scale to” size.
The types of upscaler available are:
- RealESRGAN (neural network upscaler)
- ESRGAN (neural network upscaler with a lot of third-party models)
- LDSR (Latent diffusion super-resolution upscaling)
- SwinIR and Swin2SR (neural network upscalers)
An upscaler 2 option is also available so that you can combine the effects of two different types of upscalers.
You can restore the faces using:
- GFPGAN (fixes faces)
- CodeFormer (face restoration)
4. PNG Info
The generation parameters that you use to generate an image are saved to the image png file so that these can quickly be recovered whenever required.
To restore these, you simply need to drag the image into the PNG Info tab. You can automatically copy them into the UI. This feature can be disabled.
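Because the parameters live inside the PNG file itself, they can also be read outside the web UI. The sketch below walks the chunks of a PNG byte stream with the standard library and collects plain `tEXt` entries. It assumes the settings are stored under a text key named "parameters" and does not handle other text chunk types; `make_chunk` and `read_text_chunks` are hypothetical helper names.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, body):
    """Build one PNG chunk: 4-byte length, type, data, CRC."""
    crc = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def read_text_chunks(data):
    """Return the key/value pairs stored in tEXt chunks of raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length  # length field + type + data + CRC
    return texts


# Demo input: a stripped-down chunk stream, not a viewable image.
demo = (
    PNG_SIGNATURE
    + make_chunk(b"tEXt", b"parameters\x00a bluebird on a branch, Steps: 25")
    + make_chunk(b"IEND", b"")
)

if __name__ == "__main__":
    print(read_text_chunks(demo).get("parameters"))
```

To try it on a real image, read the file with `open(path, "rb").read()` and pass the bytes to `read_text_chunks`.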
5. Checkpoint Merger
As the name suggests, checkpoint merger allows you to combine 2 or 3 models. This is generally used if you want to merge the styles of two models. Although it is a very useful feature, sometimes the results generated might be unsatisfactory.
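One common way to merge two models is a weighted sum, where every weight in the result is an interpolation between the matching weights of model A and model B. The sketch below shows that idea on tiny dictionaries of numbers; real checkpoints hold large tensors, and the function name `weighted_sum` and the layer names are assumptions for illustration rather than the web UI's actual code.

```python
def weighted_sum(weights_a, weights_b, multiplier):
    """Interpolate matching weights: A * (1 - multiplier) + B * multiplier."""
    merged = {}
    for name, a in weights_a.items():
        if name in weights_b:
            merged[name] = a * (1 - multiplier) + weights_b[name] * multiplier
        else:
            merged[name] = a  # keep A's value when B has no matching weight
    return merged


if __name__ == "__main__":
    model_a = {"layer1": 1.0, "layer2": -2.0}  # made-up weights
    model_b = {"layer1": 3.0, "layer2": 2.0}   # made-up weights
    print(weighted_sum(model_a, model_b, 0.5))
```

A multiplier of 0 returns model A unchanged, 1 returns model B, and values in between blend the two, which is why merged styles can sometimes look unsatisfactory when the models differ a lot.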
6. Train
This tab is used for training. It supports training hypernetworks and textual inversion embeddings.
In textual inversion, you can have any number of embeddings with any names of your choice. You can also use multiple embeddings with different numbers of vectors per token. They work with half-precision floating point numbers. Embeddings can be trained on a GPU with 8 GB of VRAM.
7. Settings
There are a lot of settings available that you can customize according to your requirements.
8. Extensions
The Extensions tab contains Installed, Available, and Install from URL options. Extensions provide additional functionality to the platform.
You can simply paste a link in the “Install from URL” section and click on Install. After the installation is complete, go to the “Installed” section and then click on “Apply and restart UI”.
Some of the most useful extensions have been listed below:
A. Image Browser (updated)
This extension allows you to browse through previously generated images easily. It also sends their prompts and parameters back to txt2img and img2img.
B. TagComplete
The TagComplete extension is a must-have for generating anime-related art. It shows matching booru tags as you type. This is essential, as anime models do not work well without these tags.
C. LoCon
This extension allows you to use LoCon and LoHa models.
D. ControlNet
Using ControlNet, you can analyze any image and also use it as a reference for generating an image.
E. Ultimate Upscale
As its name suggests, this extension lets you upscale an image as much as you want. It is used from the img2img tab.
F. Two Shot
You can divide your image into different regions using the Two Shot extension. It is useful when creating an image with two or more characters, as it keeps them from blending into each other.
9. Some More Features Of Stable Diffusion Web UI
- A one-click install and run script is available (you still need to install Python and Git yourself).
- A prompt matrix is available, which lets you generate a grid of images from different combinations of prompt parts.
- You can specify parts of the prompt that you want the model to focus more on. For example: A ((bluebird)) sitting on a branch. The model will focus more on “bluebird.”
- Using loopback, you can run img2img multiple times.
- Using the X/Y/Z plot, you can draw a grid of images that varies up to three parameters at once.
- The processing can be interrupted at any time.
- There is support for a 4GB video card.
- Live prompt token length validation is available.
- You can run arbitrary Python code from the User Interface (must run with --allow-code to enable)
- Most UI elements come with mouseover hints.
- Using a text config file, you can change the default/min/max/step values of UI elements.
- Live image generation preview along with a progress bar.
- Using styles, you can skip writing long prompts and simply add them from the drop-down menu.
- Prompt editing lets you change the prompt partway through generation.
- Seed resizing lets you generate the same image at different resolutions.
- Custom scripts are supported, with many extensions available from the community.
- Composable diffusion lets you use multiple prompts at the same time using “AND.”
- There is no token limit on prompts.
- Elements on the UI can be rearranged.
- Width and height only need to be multiples of 8 instead of 64.
- It supports Alt-Diffusion.
- It supports Stable Diffusion 2.0
- It supports inpainting models by RunwayML.
- The estimated time for generation is shown on the progress bar.
- You can add a different VAE from settings.
- A generate forever option is also available.
- The history tab extension allows you to view, direct, and delete images easily.
- Aesthetic Gradients, available through an extension, lets you generate images with a particular aesthetic using CLIP image embeddings.
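The ((bluebird)) example in the list above uses the web UI's emphasis syntax, where each pair of parentheses around a word is documented as multiplying the attention on it by 1.1. The sketch below computes the resulting weight for a single token; it is a simplified illustration, not the web UI's prompt parser, and the function name `emphasis_weight` is hypothetical.

```python
def emphasis_weight(token, factor=1.1):
    """Return (word, weight) for a token such as '((bluebird))'."""
    depth = 0
    # Peel off one matching pair of parentheses at a time.
    while len(token) >= 2 and token.startswith("(") and token.endswith(")"):
        token = token[1:-1]
        depth += 1
    return token, round(factor ** depth, 4)


if __name__ == "__main__":
    for token in ("bluebird", "(bluebird)", "((bluebird))"):
        print(token, "->", emphasis_weight(token))
```

Two pairs of parentheses therefore give the word a weight of about 1.21, which is why the model focuses more on “bluebird” in the example.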
That’s it guys. Hope you find this detailed Stable Diffusion Web UI guide helpful.