I built a RAG pipeline from scratch in Python once. It took three days: loading documents, chunking, embeddings, vector store, retrieval chain, prompt engineering. It worked, but I learned that most of that code is boilerplate.
Flowise lets you build the same pipeline in an afternoon using a visual drag-and-drop editor. Each component — document loader, embeddings model, vector store, LLM — is a node you connect visually. Change the model? Swap one node. Add a memory layer? Drag in a memory node and connect it.
Connecting it to a local Ollama instance means no API costs for prototyping. When you're happy with the workflow, export it as an API endpoint and call it from any application.
It's particularly useful for prototyping AI features quickly, building chatbots, or creating automated pipelines that non-developers can configure and modify.
I run Flowise on Tencent Cloud Lighthouse. The 2 GB RAM plan handles Flowise itself; if you're running Ollama on the same server for local models, use 4 GB RAM. For AI workflow setups that include local models, Lighthouse's TencentOS AI application image is a strong starting point — it comes pre-installed with Python 3, Node.js, Docker, Git, and AI frameworks (PyTorch, TensorFlow), so the environment for running Flowise alongside Ollama is ready without dependency setup. Running Flowise on a server means it's available 24/7 — webhook-triggered flows execute on schedule even when your laptop is closed.
Flowise has pre-built nodes for:
| Category | Examples |
|---|---|
| LLMs | OpenAI, Anthropic, Ollama, Azure OpenAI, Hugging Face |
| Document Loaders | PDF, web pages, Notion, GitHub, S3, CSV |
| Vector Stores | Chroma, Pinecone, Weaviate, Qdrant, FAISS |
| Memory | Buffer memory, Redis-backed, conversation summary |
| Tools | Web search, calculator, code execution, API calls |
| Chains | RAG chains, conversation chains, agent chains |
| Agents | ReAct agents, OpenAI function agents |
To follow this guide, you'll need:
| Requirement | Details |
|---|---|
| Server | Ubuntu 22.04, 2 GB+ RAM |
| Node.js | v18+ |
| Domain | For HTTPS setup |
| API keys | For cloud LLMs (or Ollama for local) |
Install Node.js 20 from NodeSource, then install and start Flowise:
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo bash -
sudo apt install -y nodejs
node --version # Should be v20.x
sudo npm install -g flowise
npx flowise start
On first run, Flowise downloads dependencies (takes 1–2 minutes). You should see:
Starting Flowise...
[Server]: Flowise Server is listening at http://localhost:3000
To reach the UI before HTTPS is set up, open an SSH tunnel from your local machine:
ssh -L 3000:localhost:3000 ubuntu@YOUR_SERVER_IP
Open http://localhost:3000 in your browser. You'll see the Flowise canvas interface.
By default, Flowise has no authentication. Set credentials via environment variables:
export FLOWISE_USERNAME=admin
export FLOWISE_PASSWORD=your-strong-password
npx flowise start
Or create a .env file:
nano /opt/flowise/.env
FLOWISE_USERNAME=admin
FLOWISE_PASSWORD=your-strong-password
PORT=3000
DATABASE_PATH=/opt/flowise/database
SECRETKEY_PATH=/opt/flowise/secretkey
LOG_PATH=/opt/flowise/logs
Create the directories referenced above:
mkdir -p /opt/flowise/{database,secretkey,logs}
Next, put Flowise behind Nginx with HTTPS. Install Nginx and Certbot:
sudo apt install -y nginx certbot python3-certbot-nginx
sudo nano /etc/nginx/sites-available/flowise
server {
listen 80;
server_name flowise.yourdomain.com;
location / {
proxy_pass http://localhost:3000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
# Long timeout for AI operations
proxy_read_timeout 300s;
proxy_send_timeout 300s;
# For file uploads
client_max_body_size 100m;
}
}
sudo ln -s /etc/nginx/sites-available/flowise /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo certbot --nginx -d flowise.yourdomain.com
To keep Flowise running across reboots and crashes, create a systemd service:
sudo nano /etc/systemd/system/flowise.service
[Unit]
Description=Flowise AI Workflow Builder
After=network.target
[Service]
Type=simple
User=ubuntu
WorkingDirectory=/opt/flowise
EnvironmentFile=/opt/flowise/.env
ExecStart=/usr/bin/npx flowise start
Restart=on-failure
RestartSec=10
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable flowise
sudo systemctl start flowise
sudo systemctl status flowise
Access Flowise at https://flowise.yourdomain.com.
Let's build a simple document Q&A flow.
Log in to Flowise and click + New to create a new chatflow.
Step 1 — Add a PDF Loader: drag in the PDF File node and upload your document.
Step 2 — Add a Text Splitter: use the Recursive Character Text Splitter with Chunk Size 1000 and Chunk Overlap 200.
Step 3 — Add an Embedding Model: OpenAI Embeddings (or Ollama Embeddings for local).
Step 4 — Add a Vector Store: In-Memory Vector Store (simplest, no setup).
Step 5 — Add a Retrieval Chain: Conversational Retrieval QA Chain.
Step 6 — Save and Test: save the chatflow and ask a question in the built-in chat.
Flowise includes templates for common use cases. Click Marketplaces in the left sidebar to browse and import complete flows — RAG chatbots, web scrapers, AI agents — and modify them for your needs.
If you have Ollama running on the same server:
In the flow canvas:
- Add the Ollama node from the node panel; set the Base URL to http://localhost:11434 and the Model to llama3.2:3b (or any model you have pulled).
- Add the Ollama Embeddings node; set the Base URL to http://localhost:11434 and the Model to nomic-embed-text.

Connect these nodes in your RAG flow instead of the OpenAI nodes. No API key required.
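Before wiring the nodes, it's worth confirming that Ollama is reachable from the Flowise host and that the models are pulled. A minimal check using Ollama's /api/tags endpoint, which lists locally available models:

```python
import requests

# List the models pulled into the local Ollama instance (default port 11434).
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Pulled models:", models)  # expect e.g. llama3.2:3b and nomic-embed-text
```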
A practical setup: use Ollama for embeddings (free, local) and OpenAI for final answer generation (higher quality, paid). Or use Ollama for most queries and only call OpenAI for complex ones.
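One way to implement that split is client-side routing between two saved flows. A minimal sketch, assuming two hypothetical flow IDs (one chatflow wired to Ollama, one to OpenAI) and a deliberately naive complexity heuristic:

```python
import requests

# Hypothetical flow IDs: one chatflow wired to Ollama, one to OpenAI.
OLLAMA_FLOW = "https://flowise.yourdomain.com/api/v1/prediction/OLLAMA-FLOW-ID"
OPENAI_FLOW = "https://flowise.yourdomain.com/api/v1/prediction/OPENAI-FLOW-ID"

def ask(question: str) -> str:
    # Naive heuristic: long or comparative questions go to the paid flow;
    # everything else stays on the free local model.
    is_complex = len(question.split()) > 40 or "compare" in question.lower()
    url = OPENAI_FLOW if is_complex else OLLAMA_FLOW
    response = requests.post(url, json={"question": question}, timeout=300)
    response.raise_for_status()
    return response.json()["text"]

print(ask("What is the return policy?"))
```

Anything smarter, such as a token count or a small classifier, slots into the same place as the heuristic.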
Every saved flow gets a REST API endpoint automatically.
In Flowise, open your saved flow and click the API button (code icon, top right).
You'll see:
POST https://flowise.yourdomain.com/api/v1/prediction/FLOW-ID
curl -X POST https://flowise.yourdomain.com/api/v1/prediction/YOUR-FLOW-ID \
-H "Content-Type: application/json" \
-d '{"question": "What are the main topics in this document?"}'
import requests
FLOWISE_URL = "https://flowise.yourdomain.com/api/v1/prediction/YOUR-FLOW-ID"
def ask_document_bot(question: str) -> str:
    # POST the question to the flow's prediction endpoint;
    # the answer comes back in the "text" field.
    response = requests.post(
        FLOWISE_URL,
        json={"question": question},
        timeout=300,  # long timeout to match the Nginx proxy settings
    )
    response.raise_for_status()  # surface HTTP errors instead of a KeyError
    return response.json()["text"]
answer = ask_document_bot("What is the return policy?")
print(answer)
In Flowise settings, go to API Keys → Create API Key. Use it in requests:
curl -X POST https://flowise.yourdomain.com/api/v1/prediction/FLOW-ID \
-H "Authorization: Bearer YOUR-FLOWISE-API-KEY" \
-H "Content-Type: application/json" \
-d '{"question": "Your question here"}'
I built a RAG flow with PDF documents and it worked perfectly in Flowise's built-in chat. But when I called it via the API, I kept getting empty responses — the API returned {"text": ""}.
The issue: the PDF loader was uploading documents through the Flowise UI, but those uploaded files weren't persistent between server restarts. After the server restarted, Flowise had the flow structure but no actual documents loaded in memory. The API call had nothing to retrieve from.
The fix: for production use, don't rely on the in-memory vector store or on file uploads through the UI. Switch to a persistent, file-backed vector store.
Alternative: Use the Upsert API to programmatically add documents to the vector store before querying.
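A sketch of that approach, assuming the default /api/v1/vector/upsert/&lt;flow-id&gt; route and a flow whose document loader accepts file uploads; the flow ID, file name, and key are placeholders:

```python
import requests

# Hypothetical flow ID; adjust to your setup.
UPSERT_URL = "https://flowise.yourdomain.com/api/v1/vector/upsert/YOUR-FLOW-ID"

# Re-populate the vector store (e.g. after a restart) before querying.
with open("handbook.pdf", "rb") as f:
    response = requests.post(
        UPSERT_URL,
        files={"files": ("handbook.pdf", f, "application/pdf")},
        headers={"Authorization": "Bearer YOUR-FLOWISE-API-KEY"},
    )
response.raise_for_status()
print(response.json())
```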
For my use case, I switched to a ChromaDB node with a persistent directory path:
Persist Directory: /opt/flowise/chroma-data
Collection Name: my-docs
After this change, the vector store persists across restarts and API calls work reliably.
| Issue | Likely Cause | Fix |
|---|---|---|
| Empty API responses | Documents not persisted | Use persistent vector store (Chroma with file path) |
| Flowise not starting | Port 3000 in use | Change PORT= in .env |
| Can't connect to Ollama | URL mismatch | Verify http://localhost:11434 is reachable from the Flowise process |
| File upload fails | Size limit | Increase client_max_body_size in Nginx |
| Flow not saving | Storage path permission | Check DATABASE_PATH directory is writable by ubuntu user |
| Chat UI not loading | WebSocket issue | Check Nginx includes Upgrade and Connection headers |
| API returns 401 | API key not set | Add key in Flowise settings, include in Authorization header |
✅ What you built: a self-hosted Flowise instance running as a systemd service behind Nginx with HTTPS, a document Q&A flow wired to local (Ollama) or cloud models, and a REST API endpoint callable from any application.
Flowise dramatically speeds up AI feature development. What would take hours of Python coding can be assembled in minutes with nodes and tested immediately in the built-in chat.
How much RAM do I need to run Flowise on a VPS?
Flowise itself runs comfortably in 2 GB of RAM. If you also run local models through Ollama on the same server, it depends on model size: 3B parameter models need ~3–4 GB, 7B models ~5–6 GB, and 13B+ models 12 GB or more. See the requirements above for specific recommendations.
Can Flowise run on a CPU-only server without a GPU?
Yes, but inference speed varies significantly. 3B models are responsive on CPU. 7B+ models are noticeably slower without GPU acceleration. For production AI workloads, consider a GPU instance.
Is my data private when using self-hosted AI models?
Yes — data is processed entirely on your server with no external API calls. Conversations, documents, and prompts never leave your infrastructure. This is a key advantage of self-hosting AI.
What is the TencentOS AI image and should I use it?
The TencentOS AI application image comes pre-installed with Python 3, Docker, PyTorch, TensorFlow, PaddlePaddle, and GPU drivers. It eliminates hours of manual CUDA and AI framework setup. Strongly recommended for GPU-accelerated AI workloads.
👉 Get started with Tencent Cloud Lighthouse
👉 View current pricing and launch promotions
👉 Explore all active deals and offers