Building Agentic RAG Applications Based on PostgreSQL

Download

Focus Mode

Font Size

Last updated: 2026-06-08 10:32:14

Overview
Agentic RAG is a new paradigm that combines AI Agents with RAG (Retrieval-Augmented Generation). An Agent not only passively answers questions but also actively plans search policies, adaptively optimizes queries, performs multi-step reasoning, and invokes tools. This document describes how to build a complete Agentic RAG application based on Tencent Cloud PostgreSQL.
Architecture Design
┌──────────────────────────────────────────────────────┐
│                 Agentic RAG System                      │
├──────────────────────────────────────────────────────┤
│                                                      │
│  ┌──────────┐    ┌───────────┐    ┌──────────────┐   │
│  │ Planner  │    │ Retriever │    │ Generator    │   │
│  │ Agent    │ →  │ Agent     │ →  │ Agent        │   │
│  │ (Planning)   │    │ (Search)    │    │ (Generation)       │   │
│  └──────────┘    └───────────┘    └──────────────┘   │
│       ↓               ↓                ↓             │
│  ┌─────────────────────────────────────────────┐     │
│  │        PostgreSQL (Unified Storage Layer)              │     │
│  ├─────────────────────────────────────────────┤     │
│  │ pgvector     │ Apache AGE │ Memory Table          │     │
│  │ (Vector Search)   │ (Knowledge Graph) │ (Agent Memory)    │     │
│  │              │            │                 │     │
│  │ tencentdb_ai (Large Model Invocation)                  │     │
│  └─────────────────────────────────────────────┘     │
└──────────────────────────────────────────────────────┘
Differences from Traditional RAG
Level
Traditional RAG
Agentic RAG
Retrieval Policy
Fixed (single vector search)
Adaptive (multiple times / multiple policies)
Query Rewrite
None or simple rewriting
Agent-driven dynamic rewriting and decomposition
Result Evaluation
Direct use
The Agent evaluates and then decides whether more searches are needed.
Tool Call
None
Can invoke SQL APIs and external services
Multi-hop Reasoning
Not supported.
Supports multi-step reasoning chains
Memory
Stateless
Has long-term memory
Database Preparation
-- Knowledge Base Table
CREATE TABLE rag.documents (
    id BIGSERIAL PRIMARY KEY,
    title TEXT,
    content TEXT NOT NULL,
    chunk_index INT DEFAULT 0,
    embedding vector(1024),
    source TEXT,
    metadata JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ DEFAULT NOW()
);
﻿
CREATE INDEX idx_docs_embedding ON rag.documents
USING hnsw (embedding vector_cosine_ops);
﻿
-- Agent Search Memory
CREATE TABLE rag.search_memory (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID NOT NULL,
    query TEXT NOT NULL,
    strategy TEXT,
    results_summary TEXT,
    satisfaction FLOAT,  -- Agent's satisfaction with the result
    created_at TIMESTAMPTZ DEFAULT NOW()
);
﻿
-- Query Rewriting Log
CREATE TABLE rag.query_rewrites (
    id BIGSERIAL PRIMARY KEY,
    original_query TEXT,
    rewritten_query TEXT,
    rewrite_reason TEXT,
    session_id UUID,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
Core Implementation
Note:
The vector dimension must match the output dimension of the large model you are using. Common model dimensions: OpenAI text-embedding-3-small (1536 dimensions), BGE-M3 (1024 dimensions), ChatGLM Embedding (1024 dimensions).
Intelligent Query Rewriting
-- Agent Rewrites Queries Using a Large Model
CREATE OR REPLACE FUNCTION rag.rewrite_query(original TEXT)
RETURNS TEXT AS $$
DECLARE
    rewritten TEXT;
BEGIN
    SELECT tencentdb_ai.chat_completions(
        '<your-llm-model>',
        'You are a query optimization expert. Please rewrite the following user question into a query that is more suitable for vector search. Only output the rewritten query, without any explanation.' ||
        E'\\n\\nOriginal Question: ' || original
    ) INTO rewritten;
﻿
    -- Record Rewriting
    INSERT INTO rag.query_rewrites (original_query, rewritten_query)
    VALUES (original, rewritten);
﻿
    RETURN rewritten;
END;
$$ LANGUAGE plpgsql;
Multi-Policy Search
-- Vector Search
CREATE OR REPLACE FUNCTION rag.vector_search(query TEXT, top_k INT DEFAULT 5)
RETURNS TABLE(id BIGINT, content TEXT, score FLOAT) AS $$
BEGIN
    RETURN QUERY
    SELECT d.id, d.content,
           1 - (d.embedding <=> tencentdb_ai.get_embedding('bge-m3', query)) AS score
    FROM rag.documents d
    ORDER BY d.embedding <=> tencentdb_ai.get_embedding('bge-m3', query)
    LIMIT top_k;
END;
$$ LANGUAGE plpgsql;
﻿
-- Keyword Search (BM25 Style)
CREATE OR REPLACE FUNCTION rag.keyword_search(keywords TEXT, top_k INT DEFAULT 5)
RETURNS TABLE(id BIGINT, content TEXT, score FLOAT) AS $$
BEGIN
    RETURN QUERY
    SELECT d.id, d.content,
           ts_rank(to_tsvector('chinese', d.content), plainto_tsquery('chinese', keywords))::FLOAT AS score
    FROM rag.documents d
    WHERE to_tsvector('chinese', d.content) @@ plainto_tsquery('chinese', keywords)
    ORDER BY score DESC
    LIMIT top_k;
END;
$$ LANGUAGE plpgsql;
﻿
-- Hybrid Search (RRF Fusion)
CREATE OR REPLACE FUNCTION rag.hybrid_search(query TEXT, top_k INT DEFAULT 5)
RETURNS TABLE(id BIGINT, content TEXT, rrf_score FLOAT) AS $$
BEGIN
    RETURN QUERY
    WITH vec AS (
        SELECT id, content, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
        FROM rag.vector_search(query, 20)
    ),
    kw AS (
        SELECT id, content, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
        FROM rag.keyword_search(query, 20)
    )
    SELECT COALESCE(v.id, k.id),
           COALESCE(v.content, k.content),
           (COALESCE(1.0/(60+v.rank), 0) + COALESCE(1.0/(60+k.rank), 0))::FLOAT AS rrf_score
    FROM vec v FULL OUTER JOIN kw k ON v.id = k.id
    ORDER BY rrf_score DESC
    LIMIT top_k;
END;
$$ LANGUAGE plpgsql;
Agent-Adaptive RAG Workflow
-- Complete Agentic RAG Function
CREATE OR REPLACE FUNCTION rag.agentic_answer(question TEXT)
RETURNS TEXT AS $$
DECLARE
    rewritten_query TEXT;
    context TEXT;
    answer TEXT;
BEGIN
    -- Step 1: Query Rewriting
    rewritten_query := rag.rewrite_query(question);
﻿
    -- Step 2: Hybrid Search
    SELECT string_agg(content, E'\\n---\\n') INTO context
    FROM rag.hybrid_search(rewritten_query, 5);
﻿
    -- Step 3: Generate the Response
    SELECT tencentdb_ai.chat_completions(
        '<your-llm-model>',
        Answer the user's question based on the following reference materials. If the materials are insufficient, explicitly state this. ||
        E'\\n\\nReference Materials:\\n' || COALESCE(context, 'No relevant materials') ||
        E'\\n\\nUser Question: ' || question
    ) INTO answer;
﻿
    RETURN answer;
END;
$$ LANGUAGE plpgsql;
Best Practices
1. Document Chunking: It is recommended to set chunk_size = 500–1000 characters and overlap = 100.
2. Hybrid Search: The RRF fusion of vector + keyword search yields better results than using either method alone.
3. Query Rewriting: It decomposes and rewrites complex queries.
4. Result Reranking: The Rerank model is used to perform secondary sorting on the search results.
5. Memory Reuse: It reuses historical search results for similar queries.
References
pgvectorscale (DiskANN) high-performance vector search.
pgvector Vector Extension Usage Guide.
Overview of Agent Long-term Memory Feature.
Building GraphRAG Knowledge Graph Applications Based on Apache AGE.
﻿

Help and Support

Was this page helpful?

You can also Contact sales or Submit a Ticket for help.

Help us improve! Rate your documentation experience in 5 mins.

Feedback

tencent cloud

TencentDB for PostgreSQL

Building Agentic RAG Applications Based on PostgreSQL

Overview

Architecture Design

Differences from Traditional RAG

Database Preparation

Core Implementation

Intelligent Query Rewriting

Multi-Policy Search

Agent-Adaptive RAG Workflow

Best Practices

References

Help and Support

Level	Traditional RAG	Agentic RAG
Retrieval Policy	Fixed (single vector search)	Adaptive (multiple times / multiple policies)
Query Rewrite	None or simple rewriting	Agent-driven dynamic rewriting and decomposition
Result Evaluation	Direct use	The Agent evaluates and then decides whether more searches are needed.
Tool Call	None	Can invoke SQL APIs and external services
Multi-hop Reasoning	Not supported.	Supports multi-step reasoning chains
Memory	Stateless	Has long-term memory