AI RAG
Retrieval-Augmented Generation (RAG) lets agents and the chat interface answer questions grounded in your application's actual data — table records and uploaded documents — instead of relying solely on the model's training data. Before generating a response, the runtime retrieves the most relevant content from a vector-indexed knowledge base and injects it into context.
RAG works out of the box once AI_PROVIDER is configured: the embedding infrastructure is provisioned automatically, with no YAML required to wire up storage. Schema authors declare what to embed (in an agent's knowledge block); operators tune how it is chunked, embedded, and retrieved via environment variables.
Architecture
    Knowledge Sources                           Vector Store
    │                                                 │
    ├── Table Records ───▶ Chunk + Embed ───▶         │
    │   (specified fields)                            │
    │                                                 │
    └── Documents ───────▶ Chunk + Embed ───▶         │
        (PDF, MD, TXT)                                ▼
                                              Similarity Search
                                                      │
                                                      ▼
                                 AI Provider (generates a grounded response)
Dialect-Aware Storage
RAG works on both database dialects via the AiEmbeddingRepository port — there is no external vector database on either.
| Dialect | Storage | Similarity |
|---|---|---|
| PostgreSQL | pgvector extension; vector column in the ai schema | Cosine distance computed in SQL (HNSW-indexed). |
| SQLite | Float32 BLOB column (packed bytes) | Cosine similarity computed in application code. |
The frugal default — SQLite + Ollama — supports RAG. SQLite stores vectors as packed Float32 BLOBs and computes cosine similarity in application code, normalized to match the Postgres contract. The response envelope and each result's shape (agentName, sourceRef, content, similarity) are identical across dialects, so callers never branch on the storage engine.
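The SQLite path described above can be sketched in a few lines. This is an illustration only, not the runtime's actual code: the function names are hypothetical, the little-endian byte order is an assumption, and the `1 - distance` conversion assumes the Postgres side reports cosine distance the way pgvector's `<=>` operator does.

```python
import math
import struct


def pack_vector(values):
    """Pack floats into a Float32 BLOB (4 bytes per value, little-endian assumed)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    """Inverse of pack_vector: decode a Float32 BLOB back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def distance_to_similarity(distance):
    """Convert a cosine distance (as computed in SQL) to a similarity score."""
    return 1.0 - distance
```

Converting distance to similarity on the Postgres side and computing similarity directly on the SQLite side is one way both dialects could report the same `similarity` value to callers.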
Knowledge Sources
An agent's knowledge block defines the input data sources embedded into its knowledge base. Both source types are optional and can be combined.
Table Knowledge
Embed specified fields from a table. Only text-like fields (single-line-text, long-text, rich-text, markdown) should be embedded. An optional filter limits which rows are included.
    agents:
      - name: support-agent
        role: support
        systemPrompt: Answer using the FAQ and published docs.
        knowledge:
          tables:
            - { table: faq, fields: [question, answer] }
            - { table: docs, fields: [content], filter: { status: published } }
| Property | Description |
|---|---|
| table | Table name to embed (must reference a table in app.tables). |
| fields | Field names to include in embeddings (at least one). |
| filter | Optional key-value equality filter selecting which rows are embedded. |
When source records change, embeddings are updated automatically (auto-sync).
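To make the `fields` and `filter` semantics concrete, here is a minimal sketch of how rows might be selected and turned into text before embedding. The helper names are hypothetical, and joining field values with newlines is an assumption; the source only states that the filter is a key-value equality match and that the listed fields are included.

```python
def matches_filter(row, row_filter):
    """True when every filter key equals the row's value (an empty filter matches all)."""
    return all(row.get(key) == expected for key, expected in (row_filter or {}).items())


def build_embedding_text(row, fields):
    """Join the configured fields into one text blob, skipping empty values."""
    parts = [str(row[f]).strip() for f in fields if row.get(f)]
    return "\n".join(parts)


def select_table_knowledge(rows, fields, row_filter=None):
    """Return the text to embed for each row that passes the filter."""
    texts = []
    for row in rows:
        if not matches_filter(row, row_filter):
            continue
        text = build_embedding_text(row, fields)
        if text:
            texts.append(text)
    return texts


docs = [
    {"id": 1, "status": "published", "content": "How to reset a password."},
    {"id": 2, "status": "draft", "content": "Unreleased feature notes."},
]
print(select_table_knowledge(docs, ["content"], {"status": "published"}))
# prints ['How to reset a password.']
```

With the `docs` entry from the example configuration, only rows whose `status` equals `published` would contribute text, so draft content stays out of the knowledge base.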
Document Knowledge
Embed document files (PDF, Markdown, plain text) discovered in the knowledge directory.
    knowledge:
      documents:
        - { path: /knowledge/product-manual.pdf, label: Product Manual }
| Property | Description |
|---|---|
| path | File path to the document. |
| label | Optional human-readable label for the source. |
Documents placed in AI_KNOWLEDGE_DIR are natively discovered, parsed, chunked, embedded, and stored.
| Format | Extension | Notes |
|---|---|---|
| PDF | .pdf | Text-only; scanned PDFs not supported. |
| Markdown | .md | Formatting stripped, structure kept. |
| Plain text | .txt | Ingested directly. |
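As a rough illustration of the discovery step, the sketch below lists the files in a knowledge directory that match the three supported extensions. The function name is hypothetical, and whether the runtime walks subdirectories or matches extensions case-insensitively is not stated in this page; both are assumptions here.

```python
from pathlib import Path

# Extensions taken from the supported-formats table above.
SUPPORTED_EXTENSIONS = {".pdf", ".md", ".txt"}


def discover_documents(knowledge_dir):
    """Return supported files under knowledge_dir, sorted for a stable order."""
    root = Path(knowledge_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")  # recursive walk is an assumption
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `discover_documents("./knowledge")` would return the candidate files for parsing; anything with another extension is ignored.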
Per-Agent Knowledge Scoping
Each agent has its own isolated knowledge base, keyed by agent name. One agent never retrieves another agent's embeddings. Chat can additionally access knowledge scoped to the requesting user's permissions. This isolation lets a support agent and a sales agent embed entirely different corpora without cross-contamination.
Embedding & Retrieval Configuration
Chunking, embedding, and retrieval are tuned via environment variables.
| Variable | Description | Default |
|---|---|---|
| AI_EMBEDDING_MODEL | Embedding model to use. | Provider's default embedding model |
| AI_EMBEDDING_DIMENSIONS | Vector dimensions (must match the model output). | Auto-detected from the model |
| AI_KNOWLEDGE_DIR | Path to the knowledge documents folder. | ./knowledge |
| AI_RAG_CHUNK_SIZE | Characters per chunk. | 512 |
| AI_RAG_CHUNK_OVERLAP | Overlap between adjacent chunks. | 50 |
| AI_RAG_SIMILARITY | Minimum cosine similarity to retain a result (0–1). | 0.7 |
| AI_RAG_MAX_RESULTS | Maximum chunks returned per query. | 5 |
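The interaction between chunk size and overlap is easiest to see in code. The following is a minimal sketch of a sliding-window chunker driven by AI_RAG_CHUNK_SIZE and AI_RAG_CHUNK_OVERLAP. It assumes the overlap is measured in characters like the chunk size, and the function names and validation rule are illustrative rather than the runtime's own.

```python
import os


def load_chunk_settings(env=None):
    """Read chunk size and overlap, falling back to the documented defaults."""
    env = os.environ if env is None else env
    size = int(env.get("AI_RAG_CHUNK_SIZE", "512"))
    overlap = int(env.get("AI_RAG_CHUNK_OVERLAP", "50"))
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("chunk size must be positive and overlap smaller than size")
    return size, overlap


def chunk_text(text, size, overlap):
    """Split text into windows of `size` characters that share `overlap` characters."""
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks
```

Under these assumptions and the default settings, a 1,200-character document yields three chunks starting at offsets 0, 462, and 924, each sharing 50 characters with its neighbour.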
Rebuild & Search APIs
| Endpoint | Purpose |
|---|---|
| POST /api/ai/rag/rebuild | Re-embeds the configured agent/document knowledge and persists vectors. Admin-only when app.auth is configured. |
| POST /api/ai/rag/search | Embeds a query, runs similarity search (optionally scoped by agent), filters by the similarity threshold, and returns ranked results. |
Both endpoints behave identically across PostgreSQL and SQLite — same authorization, same response shape.
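The post-retrieval steps of the search endpoint (agent scoping, threshold filtering, ranking, truncation) can be sketched as follows. This is not the endpoint's implementation: `rank_results` is a hypothetical helper, the candidates are assumed to be already scored, and treating the threshold as inclusive is an assumption. The result keys match the shape described earlier (agentName, sourceRef, content, similarity), and the `sourceRef` values are made up for the example.

```python
def rank_results(candidates, min_similarity=0.7, max_results=5, agent_name=None):
    """Filter scored chunks by agent and threshold, then return the top matches."""
    kept = [
        c for c in candidates
        if c["similarity"] >= min_similarity
        and (agent_name is None or c["agentName"] == agent_name)
    ]
    kept.sort(key=lambda c: c["similarity"], reverse=True)
    return kept[:max_results]


candidates = [
    {"agentName": "support-agent", "sourceRef": "faq:1", "content": "Reset steps", "similarity": 0.91},
    {"agentName": "support-agent", "sourceRef": "faq:2", "content": "Billing FAQ", "similarity": 0.55},
    {"agentName": "sales-agent", "sourceRef": "docs:7", "content": "Pricing tiers", "similarity": 0.83},
]
top = rank_results(candidates, agent_name="support-agent")
print([r["sourceRef"] for r in top])
# prints ['faq:1']
```

Here `faq:2` is dropped for falling below the default 0.7 threshold and `docs:7` is excluded because it belongs to a different agent, which mirrors the per-agent isolation described above.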
Related Pages
- AI Overview — the full AI ecosystem.
- AI Agents — agents that own knowledge bases.
- AI Memory — runtime knowledge retrieval vs. embedded sources.
- AI Providers — embedding model configuration.
- Environment Variables — full AI_RAG_* reference.