AI RAG

Retrieval-Augmented Generation (RAG) lets agents and the chat interface answer questions grounded in your application's actual data — table records and uploaded documents — instead of relying solely on the model's training data. Before generating a response, the runtime retrieves the most relevant content from a vector-indexed knowledge base and injects it into context.

RAG is built in: once AI_PROVIDER is configured, the embedding infrastructure is provisioned automatically, and no YAML is required to wire up storage. Schema authors declare what to embed (in an agent's knowledge block); operators tune how it is embedded and retrieved via environment variables.

Architecture

Knowledge Sources                         Vector Store
    │                                          │
    ├── Table Records ───▶ Chunk + Embed ───▶  │
    │   (specified fields)                     │
    │                                          │
    └── Documents ───────▶ Chunk + Embed ───▶  │
        (PDF, MD, TXT)                         ▼
                                       Similarity Search
                                              │
                                              ▼
                       AI Provider (generates a grounded response)

Dialect-Aware Storage

RAG works on both database dialects through the AiEmbeddingRepository port. Neither dialect requires an external vector database.

| Dialect | Storage | Similarity |
| --- | --- | --- |
| PostgreSQL | pgvector extension; vector column in the ai schema | Cosine distance computed in SQL (HNSW-indexed). |
| SQLite | Float32 BLOB column (packed bytes) | Cosine similarity computed in application code. |
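
To make the SQLite row of the table concrete, here is a minimal Python sketch of packing a vector into float32 bytes and computing cosine similarity in application code. It illustrates the technique only and is not the runtime's implementation: the function names are hypothetical, and the little-endian byte order is an assumption, since the page does not specify the BLOB layout.

```python
import math
import struct


def pack_vector(values):
    """Pack floats into float32 bytes (little-endian is an assumption)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    """Decode float32 bytes back into a list of Python floats."""
    count = len(blob) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", blob))


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A stored vector would be unpacked with `unpack_vector` and compared against the query embedding with `cosine_similarity`. On PostgreSQL the equivalent comparison happens inside the database instead.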

Knowledge Sources

An agent's knowledge block defines the input data sources embedded into its knowledge base. Both source types are optional and can be combined.

Table Knowledge

Embed specified fields from a table. Only text-like fields (single-line-text, long-text, rich-text, markdown) should be embedded. An optional filter limits which rows are included.

agents:
  - name: support-agent
    role: support
    systemPrompt: Answer using the FAQ and published docs.
    knowledge:
      tables:
        - { table: faq, fields: [question, answer] }
        - { table: docs, fields: [content], filter: { status: published } }

| Property | Description |
| --- | --- |
| table | Table name to embed (must reference a table in app.tables). |
| fields | Field names to include in embeddings (at least one). |
| filter | Optional key-value equality filter selecting which rows are embedded. |

When source records change, embeddings are updated automatically (auto-sync).
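
The effect of `fields` and `filter` can be sketched in Python as follows. This is an illustration under stated assumptions, not the runtime's code: the helper names are hypothetical, and the `name: value` text layout is an assumption, since the page does not say how field values are combined before embedding.

```python
def matches_filter(row, row_filter):
    """True when every filter key equals the row's value (key-value equality)."""
    return all(row.get(key) == value for key, value in row_filter.items())


def rows_to_texts(rows, fields, row_filter=None):
    """Build one text per matching row from the configured fields."""
    texts = []
    for row in rows:
        if row_filter and not matches_filter(row, row_filter):
            continue
        # Skip empty or missing fields so blank values add no noise.
        # The "name: value" layout is an assumption, not a documented format.
        parts = [f"{name}: {row[name]}" for name in fields if row.get(name)]
        if parts:
            texts.append("\n".join(parts))
    return texts
```

With the `docs` entry from the example above, only rows whose `status` equals `published` would contribute their `content` field.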

Document Knowledge

Embed document files (PDF, Markdown, plain text) discovered in the knowledge directory.

knowledge:
  documents:
    - { path: /knowledge/product-manual.pdf, label: Product Manual }

| Property | Description |
| --- | --- |
| path | File path to the document. |
| label | Optional human-readable label for the source. |

Documents placed in AI_KNOWLEDGE_DIR are automatically discovered, parsed, chunked, embedded, and stored.

| Format | Extension | Notes |
| --- | --- | --- |
| PDF | .pdf | Text-only; scanned PDFs not supported. |
| Markdown | .md | Formatting stripped, structure kept. |
| Plain text | .txt | Ingested directly. |
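
Discovery by extension could look roughly like the Python sketch below. The function name is hypothetical, and two behaviours are assumptions rather than documented facts: the search is recursive, and extensions are matched case-insensitively.

```python
from pathlib import Path

# Extensions taken from the supported-formats table.
SUPPORTED_EXTENSIONS = {".pdf", ".md", ".txt"}


def discover_documents(knowledge_dir):
    """Return supported files under knowledge_dir, sorted for stable ordering."""
    root = Path(knowledge_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")  # recursive search is an assumption
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `discover_documents("./knowledge")` would list candidate files in the default AI_KNOWLEDGE_DIR; parsing, chunking, and embedding happen in later stages.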

Per-Agent Knowledge Scoping

Each agent has its own isolated knowledge base, keyed by agent name. One agent never retrieves another agent's embeddings. Chat can additionally access knowledge scoped to the requesting user's permissions. This isolation lets a support agent and a sales agent embed entirely different corpora without cross-contamination.
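
The isolation guarantee can be pictured as a store partitioned by agent name. The Python sketch below is a toy in-memory model with hypothetical names; the real storage lives in PostgreSQL or SQLite, and its schema is not described on this page.

```python
from collections import defaultdict


class ScopedKnowledgeStore:
    """Toy in-memory store that partitions chunks by agent name."""

    def __init__(self):
        self._chunks = defaultdict(list)

    def add(self, agent, text):
        """Store a chunk in the named agent's partition."""
        self._chunks[agent].append(text)

    def chunks_for(self, agent):
        """Return a copy of one agent's chunks; other partitions are never read."""
        return list(self._chunks.get(agent, []))
```

Because every read names an agent, a query issued on behalf of `support-agent` has no path to chunks stored for a different agent.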

Embedding & Retrieval Configuration

Chunking, embedding, and retrieval are tuned via environment variables.

| Variable | Description | Default |
| --- | --- | --- |
| AI_EMBEDDING_MODEL | Embedding model to use. | Provider's default embedding model |
| AI_EMBEDDING_DIMENSIONS | Vector dimensions (must match the model output). | Auto-detected from the model |
| AI_KNOWLEDGE_DIR | Path to the knowledge documents folder. | ./knowledge |
| AI_RAG_CHUNK_SIZE | Characters per chunk. | 512 |
| AI_RAG_CHUNK_OVERLAP | Overlap between adjacent chunks. | 50 |
| AI_RAG_SIMILARITY | Minimum cosine similarity to retain a result (0–1). | 0.7 |
| AI_RAG_MAX_RESULTS | Maximum chunks returned per query. | 5 |
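
To show how AI_RAG_CHUNK_SIZE and AI_RAG_CHUNK_OVERLAP interact, here is a minimal fixed-size chunker in Python. It is a sketch, not the runtime's chunker: the function name is hypothetical, the overlap is assumed to be measured in characters like the chunk size, and the real implementation may respect sentence or paragraph boundaries.

```python
def chunk_text(text, chunk_size=512, overlap=50):
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # each chunk starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

With the defaults, each chunk starts 462 characters after the previous one, so a 1,000-character document yields three chunks of 512, 512, and 76 characters.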

Rebuild & Search APIs

| Endpoint | Purpose |
| --- | --- |
| POST /api/ai/rag/rebuild | Re-embeds the configured agent/document knowledge and persists vectors. Admin-only when app.auth is configured. |
| POST /api/ai/rag/search | Embeds a query, runs similarity search (optionally scoped by agent), filters by the similarity threshold, and returns ranked results. |

Both endpoints behave identically across PostgreSQL and SQLite — same authorization, same response shape.
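
The filter-and-rank step that the search endpoint performs can be sketched in Python as below. The function name and the `(chunk, score)` tuple shape are hypothetical, and treating the threshold as inclusive is an assumption. The defaults mirror AI_RAG_SIMILARITY and AI_RAG_MAX_RESULTS.

```python
def select_results(scored_chunks, min_similarity=0.7, max_results=5):
    """Keep chunks at or above the threshold, best first, capped at max_results."""
    # Each item is a (chunk, similarity) pair; inclusive threshold is an assumption.
    kept = [item for item in scored_chunks if item[1] >= min_similarity]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_results]
```

Raising the threshold trades recall for precision, while the result cap bounds how much retrieved text is injected into the model's context.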