Vector Databases

Vector databases enable semantic search: instead of matching exact words, the system compares texts by meaning and returns content that is similar to the query.

How Does It Work?

Embedding: each text is converted to a numerical vector by an embedding model (e.g., 1536 dimensions for OpenAI text-embedding-3-small)
Storage: vectors are stored in the database along with the original text and metadata
Search: the user’s question is converted to a vector and compared with the stored vectors by cosine distance
Results: the system returns the best-matching entries
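
The search and results steps can be illustrated with a minimal sketch. It assumes toy 3-dimensional vectors in place of real embeddings; the `search` helper and the entry layout are hypothetical names used only for illustration, not the product's API.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def search(query_vector, entries, top_k=2):
    """Hypothetical helper: return the top_k entries, closest first."""
    ranked = sorted(
        entries,
        key=lambda entry: cosine_distance(query_vector, entry["vector"]),
    )
    return ranked[:top_k]

# Toy vectors standing in for real embeddings (assumption for illustration)
entries = [
    {"text": "How do I reset my password?", "vector": [0.9, 0.1, 0.0]},
    {"text": "Invoice and billing periods", "vector": [0.0, 0.2, 0.9]},
    {"text": "Changing your login credentials", "vector": [0.8, 0.3, 0.1]},
]

query_vector = [1.0, 0.2, 0.0]  # pretend embedding of "forgot my password"
results = search(query_vector, entries)
```

A smaller cosine distance means the two vectors point in a more similar direction, so the two account-related entries rank ahead of the billing entry.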

Use Cases

RAG (Retrieval-Augmented Generation) — enriching AI responses with context from the knowledge base
Semantic search — finding similar documents, articles, FAQ
Chat with knowledge base — ask a question, AI answers based on your documents

Requirements

A vector database requires an AI connector (OpenAI, Gemini, or Claude) with embedding support for generating vectors.

Entries

Each entry in the vector database contains:

Text content
Embedding vector
Metadata (e.g., source URL)
Source association (type + ID, e.g., Kb::Entry #35)
Chunk number (when text was split)
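
As a rough sketch, an entry with the fields listed above could be modeled as follows. The class name, field names, and the `chunks_for_source` helper are assumptions for illustration; the actual storage schema is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class VectorEntry:
    """Hypothetical model of one vector database entry."""
    text: str
    embedding: List[float]
    metadata: dict = field(default_factory=dict)  # e.g., source URL
    source_type: Optional[str] = None             # e.g., "Kb::Entry"
    source_id: Optional[int] = None               # e.g., 35
    chunk_number: Optional[int] = None            # None when the text was not split

def chunks_for_source(entries, source_type, source_id):
    """Collect all entries that share a source association, in chunk order."""
    matching = [
        e for e in entries
        if e.source_type == source_type and e.source_id == source_id
    ]
    return sorted(matching, key=lambda e: e.chunk_number or 0)

entries = [
    VectorEntry("second part", [0.2, 0.1], {"source_url": "https://example.com/kb/35"},
                "Kb::Entry", 35, 2),
    VectorEntry("unrelated", [0.9, 0.4], {}, "Kb::Entry", 36, None),
    VectorEntry("first part", [0.1, 0.3], {"source_url": "https://example.com/kb/35"},
                "Kb::Entry", 35, 1),
]

ordered = chunks_for_source(entries, "Kb::Entry", 35)
```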

Text Chunking

Long texts are automatically split into smaller fragments (chunks) before generating embeddings. Each chunk is a separate entry in the vector database — but all chunks from one source (e.g., a KB entry) are linked together.

“Chunking enabled” Option

In the vector database settings, you can enable the chunking option. This changes behavior:

Setting	Chunk size	Effect
Chunking off	just under the model’s max tokens (e.g., ~7,800 of 8,191 for OpenAI text-embedding-3-small)	Text is split only when it exceeds the model limit. Larger fragments, fewer entries
Chunking on	~500 tokens (~1-2 paragraphs)	Text always split into small fragments. More precise search

When to Enable Chunking

Enable when the source has long documents (articles, regulations, documentation) and you need search precision — a small chunk matches a specific question better
Leave off when entries are short (FAQ, single questions/answers) — splitting short texts doesn’t make sense

How Splitting Works

The system recognizes text structure: Markdown headings (## Section), paragraphs, and HTML lists
A new section (heading) is a natural chunk boundary
Each chunk is prefixed with the heading of the section it belongs to, so it keeps its context
If the text is HTML, the system first converts it to structured text, preserving headings and paragraphs
Tokens are counted exactly with tiktoken (the OpenAI tokenizer), not estimated from character counts
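
The splitting rules above can be sketched roughly as follows. This is a simplified illustration, not the system's implementation: `split_markdown` is a hypothetical name, and `count_tokens` uses a whitespace word count as a stand-in for tiktoken so that the example runs without extra dependencies.

```python
import re

def count_tokens(text):
    # Stand-in for illustration only: the documented system counts tokens
    # with tiktoken, which gives different (exact) numbers.
    return len(text.split())

def split_markdown(text, max_tokens=500):
    """Split at Markdown headings, then pack paragraphs up to max_tokens.

    Each chunk is prefixed with the heading of its section. Simplifications:
    the heading prefix is not counted against max_tokens, and a single
    paragraph longer than max_tokens is kept whole.
    """
    chunks, heading, paragraphs = [], "", []

    def flush():
        # Emit the paragraphs collected so far as one chunk.
        if paragraphs:
            body = "\n\n".join(paragraphs)
            chunks.append(f"{heading}\n\n{body}" if heading else body)
            paragraphs.clear()

    for block in re.split(r"\n\s*\n", text.strip()):
        block = block.strip()
        first_line, _, rest = block.partition("\n")
        if re.match(r"#{1,6}\s", first_line):
            flush()                      # a heading closes the current chunk
            heading = first_line.strip()
            block = rest.strip()
        if not block:
            continue
        if paragraphs and count_tokens("\n\n".join(paragraphs + [block])) > max_tokens:
            flush()                      # the next paragraph would not fit
        paragraphs.append(block)
    flush()
    return chunks

document = """## Returns

Items can be returned within 30 days.

Refunds are issued to the original payment method.

## Shipping

Orders ship within two business days."""

chunks = split_markdown(document, max_tokens=8)
```

With the deliberately tiny limit of 8 "tokens", the Returns section is split into two chunks, and both carry the "## Returns" prefix; the Shipping section becomes a third chunk.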

Per-model Limits

Each embedding model has a different token limit per call. The system automatically retrieves the limit from the connector:

Model	Max tokens	Effect with chunking OFF	Effect with chunking ON
OpenAI text-embedding-3-small	8,191	chunks up to ~7,800 tokens	chunks up to 500 tokens
Cohere embed-v4 (Bedrock)	128,000	practically no splitting	chunks up to 500 tokens
Gemini embedding	2,048	chunks up to ~1,900 tokens	chunks up to 500 tokens

When switching connectors (e.g., from OpenAI to Cohere), limits adjust automatically — no need to change anything in the database settings.
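
The relationship between the model limit and the chunking setting can be sketched as below. The 500-token target and the model limits come from the tables above; the 95% safety margin is an assumption chosen only because it roughly reproduces the "~7,800" and "~1,900" figures, and the function and constant names are hypothetical.

```python
CHUNKING_ON_TOKENS = 500      # target chunk size when chunking is enabled
SAFETY_MARGIN_PERCENT = 95    # assumption: the real margin is not documented

# Limits as listed in the table above; the keys are illustrative labels
MODEL_LIMITS = {
    "OpenAI text-embedding-3-small": 8191,
    "Cohere embed-v4 (Bedrock)": 128000,
    "Gemini embedding": 2048,
}

def effective_chunk_size(model_max_tokens, chunking_enabled):
    """Return the largest chunk size (in tokens) for a model and setting."""
    if chunking_enabled:
        return min(CHUNKING_ON_TOKENS, model_max_tokens)
    # Chunking off: stay slightly below the model's hard limit
    return model_max_tokens * SAFETY_MARGIN_PERCENT // 100

sizes = {
    name: (effective_chunk_size(limit, False), effective_chunk_size(limit, True))
    for name, limit in MODEL_LIMITS.items()
}
```

Because the size is derived from the connector's limit at call time, switching models changes the result without any change to the chunking setting itself.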
