Stable Diffusion is one of the most capable open-source image generation models available. Most people run it on a local machine with a GPU, but there's a good case for running it on a cloud server: your desktop doesn't need to be on, you can access it from any device, and cloud servers with GPUs are available at reasonable hourly rates.
This guide covers deploying AUTOMATIC1111's Stable Diffusion WebUI on a cloud server, making it accessible via a browser interface, and keeping it running as a persistent service.
Important hardware note: Stable Diffusion can run on CPU-only servers, but it's very slow — 5–15 minutes per image on a typical VPS. For practical use, a GPU-equipped instance is strongly recommended. Tencent Cloud Lighthouse has GPU instance options in select regions. Even better: Lighthouse offers a TencentOS AI application image that comes pre-installed with Python 3, Docker, Git, PyTorch, TensorFlow, PaddlePaddle, and GPU drivers — you skip the hours-long CUDA + driver setup and go straight to installing Stable Diffusion WebUI. Select the TencentOS AI image when creating a GPU instance, and the environment is ready to use immediately.
Key Takeaways
| Server Type | Generation Speed | Practical Use |
|---|---|---|
| CPU-only (4 vCPU) | 5–15 minutes/image | Testing, setup verification |
| GPU 4–8 GB VRAM | 15–45 seconds/image | Good for regular use |
| GPU 16+ GB VRAM | 3–10 seconds/image | High-throughput image generation |
For the best experience, use a server with at least 4 GB VRAM. If you're just testing the setup, CPU mode works fine.
| Requirement | CPU Mode | GPU Mode |
|---|---|---|
| RAM | 8 GB minimum | 16 GB recommended |
| Storage | 20 GB+ (models are large) | 30 GB+ |
| OS | Ubuntu 22.04 | Ubuntu 22.04 |
| Python | 3.10+ | 3.10+ |
| CUDA (GPU only) | Not needed | CUDA 11.8 or 12.x |
sudo apt update && sudo apt upgrade -y
sudo apt install -y python3 python3-pip python3-venv python3-dev
python3 --version
# Should be 3.10 or higher
sudo apt install -y build-essential git wget curl libgl1 libglib2.0-0 libsm6 libxrender1 libxext6
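A quick sanity check that Python and the venv module are working — the WebUI's launcher creates its own virtual environment on first run, so this catches a broken python3-venv install early. A minimal sketch using a throwaway directory:

```shell
# Create a throwaway virtual environment to confirm python3-venv works, then remove it.
# (webui.sh performs the same step in its own venv/ directory on first launch.)
if python3 -m venv /tmp/sd-venv-check; then
  /tmp/sd-venv-check/bin/python --version
  rm -rf /tmp/sd-venv-check
else
  echo "python3-venv is not working — reinstall it with apt"
fi
```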
⚡ Major shortcut: If you select the TencentOS AI application image when creating your Lighthouse GPU instance, NVIDIA drivers, CUDA, Python 3, PyTorch, and TensorFlow are already installed and configured. You can skip this entire section (1.4) and jump directly to Part 2. This is strongly recommended for GPU setups — CUDA driver installation is the most error-prone part of the process.
If you chose a plain Ubuntu system image, install drivers manually:
Check if your server has an NVIDIA GPU:
lspci | grep -i nvidia
If yes, install drivers:
# Check available NVIDIA driver versions
sudo apt search nvidia-driver | grep "nvidia-driver-[0-9]"
# Install recommended driver (adjust version number)
sudo apt install -y nvidia-driver-535
# Reboot to load driver
sudo reboot
After reboot, verify GPU is detected:
nvidia-smi
Install CUDA toolkit (for torch GPU support):
# Install CUDA 11.8
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt update
sudo apt install -y cuda-11-8
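The toolkit installs under /usr/local/cuda-11.8 but is not added to PATH automatically. A sketch of the environment lines to set (the path is the packaged default — adjust if your install differs), plus a verification step:

```shell
# Assumed default install location for the cuda-11-8 package
CUDA_HOME=/usr/local/cuda-11.8
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:${LD_LIBRARY_PATH:-}"

# Verify the compiler is visible (reports the CUDA version after a successful install)
command -v nvcc >/dev/null && nvcc --version || echo "nvcc not on PATH yet"
```

Append the export lines to ~/.bashrc so they persist across SSH sessions.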
cd /opt
sudo git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
sudo chown -R ubuntu:ubuntu stable-diffusion-webui
cd stable-diffusion-webui
The WebUI needs at least one Stable Diffusion model checkpoint to generate images. Download from Hugging Face or CivitAI.
Create the models directory:
mkdir -p /opt/stable-diffusion-webui/models/Stable-diffusion
cd /opt/stable-diffusion-webui/models/Stable-diffusion
Download a model (example — v1-5 base model, ~4 GB):
wget -c "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt" \
-O v1-5-pruned-emaonly.ckpt
Or for SDXL (better quality, requires more VRAM/RAM):
wget -c "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors" \
-O sd_xl_base_1.0.safetensors
The first run installs Python dependencies automatically (takes 10–15 minutes):
For CPU-only:
cd /opt/stable-diffusion-webui
./webui.sh --no-half --skip-torch-cuda-test --listen --port 7860
For GPU:
cd /opt/stable-diffusion-webui
./webui.sh --listen --port 7860
--listen makes the WebUI accessible on all network interfaces (needed for remote access), and --no-half disables half-precision math, which is required in CPU-only mode.
Wait for: Running on local URL: http://0.0.0.0:7860
On your local machine:
ssh -L 7860:localhost:7860 ubuntu@YOUR_SERVER_IP
Open http://localhost:7860 in your browser.
You'll see the Stable Diffusion WebUI interface. Type a prompt in the text box and click Generate.
Enter a prompt like:
A serene mountain lake at sunset, photorealistic, highly detailed, 8k
Negative prompt:
blurry, low quality, distorted, ugly
Click Generate. On CPU mode, this takes several minutes. On GPU, 15–45 seconds.
To keep the WebUI running after you close the terminal, set it up as a persistent service.
Create a startup script:
nano /opt/stable-diffusion-webui/start.sh
CPU version:
#!/bin/bash
cd /opt/stable-diffusion-webui
./webui.sh \
--no-half \
--precision full \
--skip-torch-cuda-test \
--listen \
--port 7860 \
--api \
--enable-insecure-extension-access
GPU version:
#!/bin/bash
cd /opt/stable-diffusion-webui
./webui.sh \
--listen \
--port 7860 \
--api \
--xformers
chmod +x /opt/stable-diffusion-webui/start.sh
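If you move the same setup between CPU and GPU instances, the two script variants can be folded into one that detects a GPU at launch. A sketch, using nvidia-smi as the detection heuristic (the flag sets mirror the CPU and GPU scripts above):

```shell
#!/bin/bash
# Choose WebUI flags based on whether a working NVIDIA GPU is visible.
pick_flags() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo "--listen --port 7860 --api --xformers"
  else
    echo "--no-half --precision full --skip-torch-cuda-test --listen --port 7860 --api"
  fi
}

FLAGS="$(pick_flags)"
echo "Launching WebUI with: $FLAGS"
# Only launch if the install directory exists (keeps the script harmless elsewhere)
if [ -d /opt/stable-diffusion-webui ]; then
  cd /opt/stable-diffusion-webui && exec ./webui.sh $FLAGS
fi
```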
Create a systemd service:
sudo nano /etc/systemd/system/sdwebui.service
[Unit]
Description=Stable Diffusion WebUI
After=network.target
[Service]
Type=simple
User=ubuntu
WorkingDirectory=/opt/stable-diffusion-webui
ExecStart=/opt/stable-diffusion-webui/start.sh
Restart=on-failure
RestartSec=10
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable sdwebui
sudo systemctl start sdwebui
sudo systemctl status sdwebui
Follow logs:
sudo journalctl -u sdwebui -f
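Because the service starts with --api, the WebUI also exposes a JSON API. A sketch of a txt2img call against it (the endpoint and payload follow the AUTOMATIC1111 API; parameter values are illustrative, and the reachability check keeps the script harmless if the service is down):

```shell
#!/bin/bash
HOST="${1:-http://localhost:7860}"

# Minimal txt2img payload; parameters are illustrative defaults
PAYLOAD='{"prompt": "a serene mountain lake at sunset", "negative_prompt": "blurry, low quality", "steps": 20, "width": 512, "height": 512}'

# Only fire the request if the WebUI answers; the response embeds base64-encoded images
if curl -s --max-time 3 "$HOST/sdapi/v1/sd-models" >/dev/null; then
  curl -s -X POST "$HOST/sdapi/v1/txt2img" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD" |
    python3 -c 'import sys, json, base64; r = json.load(sys.stdin); open("output.png", "wb").write(base64.b64decode(r["images"][0]))'
  echo "Saved output.png"
else
  echo "WebUI not reachable at $HOST"
fi
```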
Add HTTP Basic Auth to prevent unauthorized access.
sudo apt install -y nginx certbot python3-certbot-nginx apache2-utils
sudo htpasswd -c /etc/nginx/.htpasswd yourusername
# Enter a password when prompted
sudo nano /etc/nginx/sites-available/sdwebui
server {
listen 80;
server_name sd.yourdomain.com;
auth_basic "Stable Diffusion";
auth_basic_user_file /etc/nginx/.htpasswd;
location / {
proxy_pass http://localhost:7860;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
# Long timeout for generation requests
proxy_read_timeout 600s;
proxy_send_timeout 600s;
# Large body for image uploads
client_max_body_size 100m;
}
}
sudo ln -s /etc/nginx/sites-available/sdwebui /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo certbot --nginx -d sd.yourdomain.com
Good community fine-tuned models for specific styles:
cd /opt/stable-diffusion-webui/models/Stable-diffusion
# Realistic Vision — photorealistic portraits
wget -c "https://huggingface.co/SG161222/Realistic_Vision_V6.0_B1_noVAE/resolve/main/Realistic_Vision_V6.0_NV_B1.safetensors"
# DreamShaper — versatile artistic model
wget -c "https://civitai.com/api/download/models/128713" -O dreamshaper_8.safetensors
mkdir -p /opt/stable-diffusion-webui/models/VAE
cd /opt/stable-diffusion-webui/models/VAE
wget -c "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors"
Activate the VAE in WebUI: Settings → Stable Diffusion → SD VAE.
mkdir -p /opt/stable-diffusion-webui/models/Lora
Download LoRA files (.safetensors) from CivitAI and place them here. Reference them in prompts with `<lora:modelname:weight>`.
On my first CPU-only run, the WebUI loaded but every generation failed with a CUDA out-of-memory error — strange, because I was running on CPU with no GPU.
The issue: PyTorch was installed with CUDA support (default), and even though no CUDA device was available, it was still trying to use CUDA ops. The --no-half flag wasn't enough.
The fix: Add --precision full and --no-half together:
./webui.sh --no-half --precision full --skip-torch-cuda-test --listen --port 7860
Also, if you see memory errors on CPU generation, lower the image resolution:
For CPU-only servers, keep images at 512×512 and use SD 1.5 models (not SDXL) — SDXL's base resolution is 1024×1024 and uses too much RAM without GPU acceleration.
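On GPU instances with limited VRAM (rather than CPU servers), the WebUI also ships its own memory-saving launch flags. A sketch of the relevant invocations — combine them with your usual flags:

```shell
# --medvram: moderate VRAM savings with a modest speed cost (good for 4–6 GB cards)
./webui.sh --listen --port 7860 --medvram

# --lowvram: aggressive savings for very small cards; noticeably slower
# ./webui.sh --listen --port 7860 --lowvram
```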
| Issue | Likely Cause | Fix |
|---|---|---|
| Very slow generation | CPU mode | Expected; GPU greatly improves speed |
| CUDA errors on CPU server | Wrong PyTorch build | Add --skip-torch-cuda-test --no-half --precision full |
| Out of memory | Model too large | Use 512×512 resolution; use pruned model files |
| WebUI not loading | Service not started | sudo systemctl start sdwebui and check logs |
| Model not appearing in dropdown | Wrong directory | Check model is in models/Stable-diffusion/ |
| 502 from Nginx during generation | Proxy timeout | Increase proxy_read_timeout to 600s or more |
| Black images generated | VAE issue | Download and activate a compatible VAE |
| Extensions not installing | No internet or wrong path | Check server has internet access |
✅ What you built:
- Stable Diffusion WebUI running on a cloud server, accessible from any browser
- A systemd service that starts the WebUI on boot and restarts it on failure
- An Nginx reverse proxy with HTTP Basic Auth and HTTPS via Let's Encrypt
- Additional checkpoints, a VAE, and a LoRA folder for customized generation
The main trade-off: CPU inference is slow. If image generation speed matters, budget for a GPU instance. For occasional personal use or experimentation, CPU mode works and keeps costs minimal.
How much RAM do I need to run Stable Diffusion on a VPS?
For Stable Diffusion, 8 GB is the practical minimum for CPU mode with SD 1.5 models (a checkpoint alone is ~4 GB), and 16 GB is recommended for GPU mode or SDXL. See the requirements table above for details.
Can Stable Diffusion run on a CPU-only server without a GPU?
Yes, but it's slow — expect 5–15 minutes per image on a typical VPS, versus seconds to under a minute on a GPU. CPU mode is fine for testing the setup; for regular use, a GPU instance is strongly recommended.
Is my data private when using self-hosted AI models?
Yes — images are generated entirely on your server with no external API calls. Your prompts and generated images never leave your infrastructure. This is a key advantage of self-hosting AI.
What is the TencentOS AI image and should I use it?
The TencentOS AI application image comes pre-installed with Python 3, Docker, PyTorch, TensorFlow, PaddlePaddle, and GPU drivers. It eliminates hours of manual CUDA and AI framework setup. Strongly recommended for GPU-accelerated AI workloads.
👉 Get started with Tencent Cloud Lighthouse
👉 View current pricing and launch promotions
👉 Explore all active deals and offers