tencent cloud

TencentDB for PostgreSQL

DocumentationTencentDB for PostgreSQLAI CapabilitiesPractical TutorialBuilding Agentic RAG Applications Based on PostgreSQL

Building Agentic RAG Applications Based on PostgreSQL

Download
Focus Mode
Font Size
Last updated: 2026-06-08 10:32:14

Overview

Agentic RAG is a new paradigm that combines AI Agents with RAG (Retrieval-Augmented Generation). An Agent not only passively answers questions but also actively plans search policies, adaptively optimizes queries, performs multi-step reasoning, and invokes tools. This document describes how to build a complete Agentic RAG application based on Tencent Cloud PostgreSQL.

Architecture Design

┌──────────────────────────────────────────────────────┐
│ Agentic RAG System │
├──────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌───────────┐ ┌──────────────┐ │
│ │ Planner │ │ Retriever │ │ Generator │ │
│ │ Agent │ → │ Agent │ → │ Agent │ │
│ │ (Planning) │ │ (Search) │ │ (Generation) │ │
│ └──────────┘ └───────────┘ └──────────────┘ │
│ ↓ ↓ ↓ │
│ ┌─────────────────────────────────────────────┐ │
│ │ PostgreSQL (Unified Storage Layer) │ │
│ ├─────────────────────────────────────────────┤ │
│ │ pgvector │ Apache AGE │ Memory Table │ │
│ │ (Vector Search)(Knowledge Graph)(Agent Memory) │ │
│ │ │ │ │ │
│ │ tencentdb_ai (Large Model Invocation) │ │
│ └─────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘

Differences from Traditional RAG

Level
Traditional RAG
Agentic RAG
Retrieval Policy
Fixed (single vector search)
Adaptive (multiple times / multiple policies)
Query Rewrite
None or simple rewriting
Agent-driven dynamic rewriting and decomposition
Result Evaluation
Direct use
The Agent evaluates and then decides whether more searches are needed.
Tool Call
None
Can invoke SQL APIs and external services
Multi-hop Reasoning
Not supported.
Supports multi-step reasoning chains
Memory
Stateless
Has long-term memory

Database Preparation

-- Knowledge Base Table
CREATE TABLE rag.documents (
id BIGSERIAL PRIMARY KEY,
title TEXT,
content TEXT NOT NULL,
chunk_index INT DEFAULT 0,
embedding vector(1024),
source TEXT,
metadata JSONB DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_docs_embedding ON rag.documents
USING hnsw (embedding vector_cosine_ops);

-- Agent Search Memory
CREATE TABLE rag.search_memory (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID NOT NULL,
query TEXT NOT NULL,
strategy TEXT,
results_summary TEXT,
satisfaction FLOAT, -- Agent's satisfaction with the result
created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Query Rewriting Log
CREATE TABLE rag.query_rewrites (
id BIGSERIAL PRIMARY KEY,
original_query TEXT,
rewritten_query TEXT,
rewrite_reason TEXT,
session_id UUID,
created_at TIMESTAMPTZ DEFAULT NOW()
);

Core Implementation

Note:
The vector dimension must match the output dimension of the large model you are using. Common model dimensions: OpenAI text-embedding-3-small (1536 dimensions), BGE-M3 (1024 dimensions), ChatGLM Embedding (1024 dimensions).

Intelligent Query Rewriting

-- Agent Rewrites Queries Using a Large Model
CREATE OR REPLACE FUNCTION rag.rewrite_query(original TEXT)
RETURNS TEXT AS $$
DECLARE
rewritten TEXT;
BEGIN
SELECT tencentdb_ai.chat_completions(
'<your-llm-model>',
'You are a query optimization expert. Please rewrite the following user question into a query that is more suitable for vector search. Only output the rewritten query, without any explanation.' ||
E'\\n\\nOriginal Question: ' || original
) INTO rewritten;

-- Record Rewriting
INSERT INTO rag.query_rewrites (original_query, rewritten_query)
VALUES (original, rewritten);

RETURN rewritten;
END;
$$ LANGUAGE plpgsql;

Multi-Policy Search

-- Vector Search
CREATE OR REPLACE FUNCTION rag.vector_search(query TEXT, top_k INT DEFAULT 5)
RETURNS TABLE(id BIGINT, content TEXT, score FLOAT) AS $$
BEGIN
RETURN QUERY
SELECT d.id, d.content,
1 - (d.embedding <=> tencentdb_ai.get_embedding('bge-m3', query)) AS score
FROM rag.documents d
ORDER BY d.embedding <=> tencentdb_ai.get_embedding('bge-m3', query)
LIMIT top_k;
END;
$$ LANGUAGE plpgsql;

-- Keyword Search (BM25 Style)
CREATE OR REPLACE FUNCTION rag.keyword_search(keywords TEXT, top_k INT DEFAULT 5)
RETURNS TABLE(id BIGINT, content TEXT, score FLOAT) AS $$
BEGIN
RETURN QUERY
SELECT d.id, d.content,
ts_rank(to_tsvector('chinese', d.content), plainto_tsquery('chinese', keywords))::FLOAT AS score
FROM rag.documents d
WHERE to_tsvector('chinese', d.content) @@ plainto_tsquery('chinese', keywords)
ORDER BY score DESC
LIMIT top_k;
END;
$$ LANGUAGE plpgsql;

-- Hybrid Search (RRF Fusion)
CREATE OR REPLACE FUNCTION rag.hybrid_search(query TEXT, top_k INT DEFAULT 5)
RETURNS TABLE(id BIGINT, content TEXT, rrf_score FLOAT) AS $$
BEGIN
RETURN QUERY
WITH vec AS (
SELECT id, content, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
FROM rag.vector_search(query, 20)
),
kw AS (
SELECT id, content, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
FROM rag.keyword_search(query, 20)
)
SELECT COALESCE(v.id, k.id),
COALESCE(v.content, k.content),
(COALESCE(1.0/(60+v.rank), 0) + COALESCE(1.0/(60+k.rank), 0))::FLOAT AS rrf_score
FROM vec v FULL OUTER JOIN kw k ON v.id = k.id
ORDER BY rrf_score DESC
LIMIT top_k;
END;
$$ LANGUAGE plpgsql;

Agent-Adaptive RAG Workflow

-- Complete Agentic RAG Function
CREATE OR REPLACE FUNCTION rag.agentic_answer(question TEXT)
RETURNS TEXT AS $$
DECLARE
rewritten_query TEXT;
context TEXT;
answer TEXT;
BEGIN
-- Step 1: Query Rewriting
rewritten_query := rag.rewrite_query(question);

-- Step 2: Hybrid Search
SELECT string_agg(content, E'\\n---\\n') INTO context
FROM rag.hybrid_search(rewritten_query, 5);

-- Step 3: Generate the Response
SELECT tencentdb_ai.chat_completions(
'<your-llm-model>',
Answer the user's question based on the following reference materials. If the materials are insufficient, explicitly state this. ||
E'\\n\\nReference Materials:\\n' || COALESCE(context, 'No relevant materials') ||
E'\\n\\nUser Question: ' || question
) INTO answer;

RETURN answer;
END;
$$ LANGUAGE plpgsql;

Best Practices

1. Document Chunking: It is recommended to set chunk_size = 500–1000 characters and overlap = 100.
2. Hybrid Search: The RRF fusion of vector + keyword search yields better results than using either method alone.
3. Query Rewriting: It decomposes and rewrites complex queries.
4. Result Reranking: The Rerank model is used to perform secondary sorting on the search results.
5. Memory Reuse: It reuses historical search results for similar queries.

References



Help and Support

Was this page helpful?

Help us improve! Rate your documentation experience in 5 mins.

Feedback