Deploy Stable Diffusion WebUI on a VPS — Generate AI Images on Your Own Cloud Server

Stable Diffusion is one of the most capable open-source image generation models available. Most people run it on a local machine with a GPU, but there's a good case for running it on a cloud server: your desktop doesn't need to be on, you can access it from any device, and cloud servers with GPUs are available at reasonable hourly rates.

This guide covers deploying AUTOMATIC1111's Stable Diffusion WebUI on a cloud server, making it accessible via a browser interface, and keeping it running as a persistent service.

Important hardware note: Stable Diffusion can run on CPU-only servers, but it's very slow — 5–15 minutes per image on a typical VPS. For practical use, a GPU-equipped instance is strongly recommended. Tencent Cloud Lighthouse has GPU instance options in select regions. Even better: Lighthouse offers a TencentOS AI application image that comes pre-installed with Python 3, Docker, Git, PyTorch, TensorFlow, PaddlePaddle, and GPU drivers — you skip the hours-long CUDA + driver setup and go straight to installing Stable Diffusion WebUI. Select the TencentOS AI image when creating a GPU instance, and the environment is ready to use immediately.


Table of Contents

  1. CPU vs GPU — What to Expect
  2. What You Need
  3. Part 1: Install Dependencies
  4. Part 2: Install Stable Diffusion WebUI
  5. Part 3: Run WebUI and Access Remotely
  6. Part 4: Set Up as a Persistent Service
  7. Part 5: Nginx with HTTPS and Authentication
  8. Part 6: Download Additional Models
  9. The Thing That Tripped Me Up
  10. Troubleshooting
  11. Summary

Key Takeaways
  • Use the appropriate Lighthouse application image to skip manual installation steps where available
  • Lighthouse snapshots provide one-click full-server backup before major changes
  • OrcaTerm browser terminal lets you manage the server from any device
  • CBS cloud disk expansion handles growing storage needs without server migration
  • Console-level firewall + UFW = two independent protection layers

CPU vs GPU — What to Expect {#hardware}

| Server Type | Generation Speed | Practical Use |
|---|---|---|
| CPU-only (4 vCPU) | 5–15 minutes/image | Testing, setup verification |
| GPU, 4–8 GB VRAM | 15–45 seconds/image | Good for regular use |
| GPU, 16+ GB VRAM | 3–10 seconds/image | High-throughput image generation |

For the best experience, use a server with at least 4 GB VRAM. If you're just testing the setup, CPU mode works fine.


What You Need {#prerequisites}

| Requirement | CPU Mode | GPU Mode |
|---|---|---|
| RAM | 8 GB minimum | 16 GB recommended |
| Storage | 20 GB+ (models are large) | 30 GB+ |
| OS | Ubuntu 22.04 | Ubuntu 22.04 |
| Python | 3.10+ | 3.10+ |
| CUDA | Not needed | CUDA 11.8 or 12.x |
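
Before starting, a quick sanity check of the server against these requirements:

free -h            # total RAM
df -h /            # free disk space on the root volume
lsb_release -a     # OS version
python3 --version  # Python version, if already installed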

Part 1: Install Dependencies {#part-1}

1.1 — Update System

sudo apt update && sudo apt upgrade -y

1.2 — Install Python 3.10+

sudo apt install -y python3 python3-pip python3-venv python3-dev
python3 --version
# Should be 3.10 or higher

1.3 — Install Build Tools

sudo apt install -y build-essential git wget curl libgl1 libglib2.0-0 libsm6 libxrender1 libxext6

1.4 — (GPU Only) Install NVIDIA Drivers and CUDA

⚡ Major shortcut: If you select the TencentOS AI application image when creating your Lighthouse GPU instance, NVIDIA drivers, CUDA, Python 3, PyTorch, and TensorFlow are already installed and configured. You can skip this entire section (1.4) and jump directly to Part 2. This is strongly recommended for GPU setups — CUDA driver installation is the most error-prone part of the process.

If you chose a plain Ubuntu system image, install drivers manually:

Check if your server has an NVIDIA GPU:

lspci | grep -i nvidia

If yes, install drivers:

# Check available NVIDIA driver versions
sudo apt search nvidia-driver | grep "nvidia-driver-[0-9]"

# Install recommended driver (adjust version number)
sudo apt install -y nvidia-driver-535

# Reboot to load driver
sudo reboot

After reboot, verify GPU is detected:

nvidia-smi

Install CUDA toolkit (for torch GPU support):

# Install CUDA 11.8
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt update
sudo apt install -y cuda-11-8
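
The toolkit installs under /usr/local/cuda-11.8 by default. Optionally add it to your PATH so tools like nvcc resolve (a sketch assuming the default install location):

echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
nvcc --version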

Part 2: Install Stable Diffusion WebUI {#part-2}

2.1 — Clone the Repository

cd /opt
sudo git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
# Give your login user ownership (replace ubuntu if your username differs)
sudo chown -R ubuntu:ubuntu stable-diffusion-webui
cd stable-diffusion-webui

2.2 — Download a Base Model

The WebUI needs at least one Stable Diffusion model checkpoint to generate images. Download from Hugging Face or CivitAI.

Create the models directory:

mkdir -p /opt/stable-diffusion-webui/models/Stable-diffusion
cd /opt/stable-diffusion-webui/models/Stable-diffusion

Download a model (example — v1-5 base model, ~4 GB):

wget -c "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt" \
  -O v1-5-pruned-emaonly.ckpt

Or for SDXL (better quality, requires more VRAM/RAM):

wget -c "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors" \
  -O sd_xl_base_1.0.safetensors
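
Verify the downloads completed: the v1-5 checkpoint should be roughly 4 GB and the SDXL base roughly 7 GB.

ls -lh /opt/stable-diffusion-webui/models/Stable-diffusion/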

2.3 — Run the WebUI for the First Time

The first run installs Python dependencies automatically (takes 10–15 minutes):

For CPU-only:

cd /opt/stable-diffusion-webui
./webui.sh --no-half --skip-torch-cuda-test --listen --port 7860

For GPU:

cd /opt/stable-diffusion-webui
./webui.sh --listen --port 7860

--listen makes the WebUI accessible on all network interfaces (needed for remote access). --no-half disables half-precision math, which is required for CPU-only mode, and --skip-torch-cuda-test lets the launcher start without a CUDA device.

Wait for: Running on local URL: http://0.0.0.0:7860
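
On a GPU instance, you can confirm that the PyTorch build inside the WebUI's virtual environment actually sees the GPU (webui.sh creates the venv inside the repository directory by default):

/opt/stable-diffusion-webui/venv/bin/python -c "import torch; print(torch.cuda.is_available())"
# Should print: True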


Part 3: Run WebUI and Access Remotely {#part-3}

3.1 — Access via SSH Tunnel (Secure)

On your local machine:

ssh -L 7860:localhost:7860 ubuntu@YOUR_SERVER_IP

Open http://localhost:7860 in your browser.

You'll see the Stable Diffusion WebUI interface. Type a prompt in the text box and click Generate.

3.2 — Test Your First Generation

Enter a prompt like:

A serene mountain lake at sunset, photorealistic, highly detailed, 8k

Negative prompt:

blurry, low quality, distorted, ugly

Click Generate. On CPU mode, this takes several minutes. On GPU, 15–45 seconds.


Part 4: Set Up as a Persistent Service {#part-4}

Keep WebUI running after you close the terminal.

Create a startup script:

nano /opt/stable-diffusion-webui/start.sh

CPU version:

#!/bin/bash
cd /opt/stable-diffusion-webui
./webui.sh \
  --no-half \
  --skip-torch-cuda-test \
  --listen \
  --port 7860 \
  --api \
  --enable-insecure-extension-access

GPU version:

#!/bin/bash
cd /opt/stable-diffusion-webui
./webui.sh \
  --listen \
  --port 7860 \
  --api \
  --xformers

Make the script executable:

chmod +x /opt/stable-diffusion-webui/start.sh

Create a systemd service:

sudo nano /etc/systemd/system/sdwebui.service

Paste the following:

[Unit]
Description=Stable Diffusion WebUI
After=network.target

[Service]
Type=simple
User=ubuntu
WorkingDirectory=/opt/stable-diffusion-webui
ExecStart=/opt/stable-diffusion-webui/start.sh
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target

Reload systemd, then enable and start the service:

sudo systemctl daemon-reload
sudo systemctl enable sdwebui
sudo systemctl start sdwebui
sudo systemctl status sdwebui

Follow logs:

sudo journalctl -u sdwebui -f

Part 5: Nginx with HTTPS and Authentication {#part-5}

Add HTTP Basic Auth to prevent unauthorized access.

5.1 — Install Nginx and Create Password File

sudo apt install -y nginx certbot python3-certbot-nginx apache2-utils

sudo htpasswd -c /etc/nginx/.htpasswd yourusername
# Enter a password when prompted
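
To add more users later, run htpasswd without -c (repeating -c would overwrite the existing file):

sudo htpasswd /etc/nginx/.htpasswd anotheruser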

5.2 — Create Nginx Config

sudo nano /etc/nginx/sites-available/sdwebui

Paste the following:

server {
    listen 80;
    server_name sd.yourdomain.com;

    auth_basic "Stable Diffusion";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass http://localhost:7860;
        proxy_http_version 1.1;   # required for the WebSocket upgrade below
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        
        # Long timeout for generation requests
        proxy_read_timeout 600s;
        proxy_send_timeout 600s;
        
        # Large body for image uploads
        client_max_body_size 100m;
    }
}

Enable the site, test the config, and obtain a TLS certificate:

sudo ln -s /etc/nginx/sites-available/sdwebui /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo certbot --nginx -d sd.yourdomain.com
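
With Nginx in front, you can also block direct access to port 7860 with UFW, giving you the two protection layers mentioned in the key takeaways. A sketch (allow SSH first or you will lock yourself out; Nginx still reaches the WebUI over loopback, which UFW does not filter):

sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw deny 7860/tcp
sudo ufw enable
sudo ufw status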

Part 6: Download Additional Models {#part-6}

SD 1.5 Variants

Good community fine-tuned models for specific styles:

cd /opt/stable-diffusion-webui/models/Stable-diffusion

# Realistic Vision — photorealistic portraits
wget -c "https://huggingface.co/SG161222/Realistic_Vision_V6.0_B1_noVAE/resolve/main/Realistic_Vision_V6.0_NV_B1.safetensors"

# DreamShaper — versatile artistic model
wget -c "https://civitai.com/api/download/models/128713" -O dreamshaper_8.safetensors

VAE Models (Improve Color/Sharpness)

mkdir -p /opt/stable-diffusion-webui/models/VAE
cd /opt/stable-diffusion-webui/models/VAE

wget -c "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors"

Activate the VAE in WebUI: Settings → Stable Diffusion → SD VAE.

LoRA Models (Style Modifiers)

mkdir -p /opt/stable-diffusion-webui/models/Lora

Download LoRA files (.safetensors) from CivitAI and place them here. Reference in prompts with <lora:modelname:weight>.
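
For example, with a hypothetical LoRA saved as detail_tweaker.safetensors in models/Lora, a prompt could look like:

A portrait of an astronaut on a mountain ridge, cinematic lighting, highly detailed <lora:detail_tweaker:0.7>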


The Thing That Tripped Me Up {#gotcha}

On my first CPU-only run, the WebUI loaded but every generation failed with a CUDA out-of-memory error — strange, because I was running on CPU with no GPU.

The issue: PyTorch was installed with CUDA support (default), and even though no CUDA device was available, it was still trying to use CUDA ops. The --no-half flag wasn't enough.

The fix: Add --precision full and --no-half together:

./webui.sh --no-half --precision full --skip-torch-cuda-test --listen --port 7860

Also, if you see memory errors on CPU generation, lower the image resolution:

  • Start with 512×512, SD 1.5's native resolution
  • Higher resolutions require significantly more RAM

For CPU-only servers, keep images at 512×512 and use SD 1.5 models (not SDXL) — SDXL's base resolution is 1024×1024 and uses too much RAM without GPU acceleration.


Troubleshooting {#troubleshooting}

| Issue | Likely Cause | Fix |
|---|---|---|
| Very slow generation | CPU mode | Expected; GPU greatly improves speed |
| CUDA errors on CPU server | Wrong PyTorch build | Add --skip-torch-cuda-test --no-half --precision full |
| Out of memory | Model too large | Use 512×512 resolution; use pruned model files |
| WebUI not loading | Service not started | sudo systemctl start sdwebui and check logs |
| Model not appearing in dropdown | Wrong directory | Check the model is in models/Stable-diffusion/ |
| 502 from Nginx during generation | Proxy timeout | Increase proxy_read_timeout to 600s or more |
| Black images generated | VAE issue | Download and activate a compatible VAE |
| Extensions not installing | No internet or wrong path | Check the server has internet access |

Summary {#verdict}

What you built:

  • Stable Diffusion WebUI running on a cloud server
  • Systemd service for automatic start/restart
  • HTTPS access via Nginx with HTTP Basic Auth
  • Base model downloaded and working
  • Optional: VAE and LoRA support configured

The main trade-off: CPU inference is slow. If image generation speed matters, budget for a GPU instance. For occasional personal use or experimentation, CPU mode works and keeps costs minimal.

Frequently Asked Questions {#faq}

How much RAM do I need to run Stable Diffusion on a VPS?
For SD 1.5 in CPU mode, 8 GB is a practical minimum and 16 GB is more comfortable. On GPU instances, VRAM matters most: 4–8 GB handles SD 1.5, while SDXL works best with 12 GB or more. See the requirements table above.

Can Stable Diffusion run on a CPU-only server without a GPU?
Yes, but expect 5–15 minutes per image on a typical VPS: fine for testing and occasional use, too slow for regular work. For practical image generation, use a GPU instance.

Is my data private when using self-hosted AI models?
Yes. Prompts and generated images are processed entirely on your server with no external API calls (aside from the one-time model downloads). Nothing leaves your infrastructure, which is a key advantage of self-hosting.

What is the TencentOS AI image and should I use it?
The TencentOS AI application image comes pre-installed with Python 3, Docker, PyTorch, TensorFlow, PaddlePaddle, and GPU drivers. It eliminates hours of manual CUDA and AI framework setup. Strongly recommended for GPU-accelerated AI workloads.

Does the WebUI expose an API for programmatic use?
Yes. Launch with the --api flag (already included in the start.sh scripts above) and the WebUI serves a REST API under /sdapi/v1/, with interactive documentation at /docs.
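
A minimal sketch of a txt2img request with curl, assuming the service is running locally with --api enabled (the response returns images as base64 strings, decoded here with a Python one-liner):

curl -s -X POST http://localhost:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a serene mountain lake at sunset", "steps": 20, "width": 512, "height": 512}' \
  | python3 -c 'import sys, json, base64; open("out.png", "wb").write(base64.b64decode(json.load(sys.stdin)["images"][0]))'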

👉 Get started with Tencent Cloud Lighthouse
👉 View current pricing and launch promotions
👉 Explore all active deals and offers