AI RAG

Retrieval-Augmented Generation (RAG) lets agents and the chat interface answer questions grounded in your application's actual data (table records and uploaded documents) instead of relying solely on the model's training data. Before generating a response, the runtime retrieves the most relevant content from a vector-indexed knowledge base and injects it into the model's context.

RAG is available as soon as AI_PROVIDER is configured: the embedding infrastructure is provisioned automatically, and no YAML is required to wire up storage. Schema authors declare what to embed in an agent's knowledge block; operators tune how it is chunked, embedded, and retrieved through environment variables.

Architecture

Knowledge Sources                         Vector Store
    │                                          │
    ├── Table Records ───▶ Chunk + Embed ───▶  │
    │   (specified fields)                     │
    │                                          │
    └── Documents ───────▶ Chunk + Embed ───▶  │
        (PDF, MD, TXT)                         ▼
                                       Similarity Search
                                              │
                                              ▼
                       AI Provider (generates a grounded response)

Dialect-Aware Storage

RAG works on both database dialects via the AiEmbeddingRepository port — there is no external vector database on either.

Dialect	Storage	Similarity
PostgreSQL	pgvector extension; `vector` column in the `ai` schema	Cosine distance computed in SQL (HNSW-indexed).
SQLite	`Float32` BLOB column (packed bytes)	Cosine similarity computed in application code.

The frugal default (SQLite + Ollama) supports RAG. On SQLite, vectors are stored as packed Float32 BLOBs and cosine similarity is computed in application code, with scores normalized to match the Postgres contract. The response envelope and each result's shape (agentName, sourceRef, content, similarity) are identical across dialects, so callers never branch on the storage engine.
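To make the SQLite path concrete, here is a minimal sketch of packing a vector into a Float32 BLOB and scoring it with cosine similarity in application code. The function names, the little-endian byte order, and the zero-vector handling are assumptions for illustration, not the runtime's actual implementation.

```python
import math
import struct


def pack_vector(values: list[float]) -> bytes:
    """Pack floats into a Float32 BLOB, 4 bytes per dimension.

    Little-endian byte order is an assumption of this sketch.
    """
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob: bytes) -> list[float]:
    """Decode a Float32 BLOB back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors.

    Returns 0.0 when either vector has zero magnitude (an assumed convention).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For comparison, pgvector's cosine distance operator returns 1 minus the cosine similarity, so a conversion of that kind is one plausible way for the two dialects to report the same `similarity` value.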

Knowledge Sources

An agent's knowledge block defines the input data sources embedded into its knowledge base. Both source types are optional and can be combined.

Table Knowledge

Embed specified fields from a table. Only text-like fields (single-line-text, long-text, rich-text, markdown) should be embedded. An optional filter limits which rows are included.

agents:
  - name: support-agent
    role: support
    systemPrompt: Answer using the FAQ and published docs.
    knowledge:
      tables:
        - { table: faq, fields: [question, answer] }
        - { table: docs, fields: [content], filter: { status: published } }

Property	Description
`table`	Table name to embed (must reference a table in `app.tables`).
`fields`	Field names to include in embeddings (at least one).
`filter`	Optional key-value equality filter selecting which rows are embedded.

When source records change, embeddings are updated automatically (auto-sync).
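The row selection described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, and joining field values with newlines is an assumption about how the text to embed is assembled.

```python
from typing import Optional


def row_matches(row: dict, row_filter: Optional[dict]) -> bool:
    """True when every filter key equals the row's value; no filter matches all rows."""
    if not row_filter:
        return True
    return all(row.get(key) == value for key, value in row_filter.items())


def build_embedding_text(row: dict, fields: list[str]) -> str:
    """Join the non-empty values of the selected fields into one text to embed."""
    return "\n".join(str(row[name]) for name in fields if row.get(name))


def select_texts(rows: list[dict], fields: list[str],
                 row_filter: Optional[dict] = None) -> list[str]:
    """Apply the equality filter, then build one text per remaining row."""
    return [build_embedding_text(row, fields)
            for row in rows if row_matches(row, row_filter)]
```

With the `docs` entry from the example, `select_texts(rows, ["content"], {"status": "published"})` would keep only rows whose `status` equals `published`.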

Document Knowledge

Embed document files (PDF, Markdown, plain text) discovered in the knowledge directory.

knowledge:
  documents:
    - { path: /knowledge/product-manual.pdf, label: Product Manual }

Property	Description
`path`	File path to the document.
`label`	Optional human-readable label for the source.

Documents placed in AI_KNOWLEDGE_DIR are discovered automatically, then parsed, chunked, embedded, and stored.

Format	Extension	Notes
PDF	`.pdf`	Text-only; scanned PDFs not supported.
Markdown	`.md`	Formatting stripped, structure kept.
Plain text	`.txt`	Ingested directly.
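Discovery by supported extension might look like the sketch below. The function name is hypothetical, and both the recursive scan and the case-insensitive extension match are assumptions; the runtime may only look at the top level of the directory.

```python
from pathlib import Path

# Extensions taken from the format table above.
SUPPORTED_EXTENSIONS = {".pdf", ".md", ".txt"}


def discover_documents(knowledge_dir: str) -> list[Path]:
    """Return supported document files under knowledge_dir, sorted by path.

    A missing directory yields an empty list instead of raising.
    """
    root = Path(knowledge_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```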

Per-Agent Knowledge Scoping

Each agent has its own isolated knowledge base, keyed by agent name, so one agent never retrieves another agent's embeddings. This isolation lets a support agent and a sales agent embed entirely different corpora without cross-contamination. The chat interface can additionally access knowledge scoped to the requesting user's permissions.

Embedding & Retrieval Configuration

Chunking, embedding, and retrieval are tuned via environment variables.

Variable	Description	Default
`AI_EMBEDDING_MODEL`	Embedding model to use.	Provider's default embedding model
`AI_EMBEDDING_DIMENSIONS`	Vector dimensions (must match the model output).	Auto-detected from the model
`AI_KNOWLEDGE_DIR`	Path to the knowledge documents folder.	`./knowledge`
`AI_RAG_CHUNK_SIZE`	Characters per chunk.	`512`
`AI_RAG_CHUNK_OVERLAP`	Overlap between adjacent chunks.	`50`
`AI_RAG_SIMILARITY`	Minimum cosine similarity to retain a result (0–1).	`0.7`
`AI_RAG_MAX_RESULTS`	Maximum chunks returned per query.	`5`
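To show how chunk size and overlap interact, here is a minimal character-based chunker using the defaults from the table. It is a sketch under assumptions: the overlap is taken to be measured in characters like the chunk size, and the real chunker may also respect sentence or paragraph boundaries.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to chunk_size characters.

    Each chunk starts (chunk_size - overlap) characters after the previous one,
    so adjacent chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the defaults, a 1,000-character document yields three chunks starting at offsets 0, 462, and 924. A larger overlap preserves more context across chunk boundaries at the cost of storing and embedding more text.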

Rebuild & Search APIs

Endpoint	Purpose
`POST /api/ai/rag/rebuild`	Re-embeds the configured agent/document knowledge and persists vectors. Admin-only when `app.auth` is configured.
`POST /api/ai/rag/search`	Embeds a query, runs similarity search (optionally scoped by `agent`), filters by the similarity threshold, and returns ranked results.

Both endpoints behave identically across PostgreSQL and SQLite — same authorization, same response shape.
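The post-retrieval steps of the search endpoint (agent scoping, threshold filtering, ranking, and capping) can be sketched as below. The function name is hypothetical, the result keys follow the result shape described earlier, and treating the threshold as inclusive is an assumption.

```python
from typing import Optional


def rank_results(candidates: list[dict], agent: Optional[str] = None,
                 min_similarity: float = 0.7, max_results: int = 5) -> list[dict]:
    """Scope by agent, drop low-similarity hits, and return the best matches first."""
    # Optional scoping: keep only chunks from the named agent's knowledge base.
    scoped = [c for c in candidates if agent is None or c["agentName"] == agent]
    # Threshold filter (inclusive in this sketch), mirroring AI_RAG_SIMILARITY.
    kept = [c for c in scoped if c["similarity"] >= min_similarity]
    # Highest similarity first, capped like AI_RAG_MAX_RESULTS.
    kept.sort(key=lambda c: c["similarity"], reverse=True)
    return kept[:max_results]
```

Because the threshold is applied before the cap, a query can return fewer than the maximum number of results, or none at all, when nothing in the knowledge base is similar enough.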

AI Overview — the full AI ecosystem.
AI Agents — agents that own knowledge bases.
AI Memory — runtime knowledge retrieval vs. embedded sources.
AI Providers — embedding model configuration.
Environment Variables — full AI_RAG_* reference.
