Embeddings & Vector Search¶

Embeddings power two features in Lumiverse: semantic world book activation (finding lorebook entries by meaning, not just keywords) and long-term chat memory (recalling relevant past moments). Both require an embedding provider to be configured.

What Are Embeddings?¶

An embedding is a numerical representation of text — a list of numbers that captures the meaning of a passage. Similar texts produce similar embeddings. This lets Lumiverse find relevant content based on what it means, not just whether exact keywords match.
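To make "similar texts produce similar embeddings" concrete, here is a minimal sketch that compares toy vectors with cosine similarity. The three-dimensional vectors and their labels are made up for illustration; real embedding models return hundreds or thousands of dimensions, and this is not Lumiverse's own code.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example only.
dragon = [0.9, 0.1, 0.2]
wyvern = [0.8, 0.2, 0.3]     # close in meaning to "dragon"
tax_form = [0.1, 0.9, 0.1]   # unrelated topic

sim_related = cosine_similarity(dragon, wyvern)
sim_unrelated = cosine_similarity(dragon, tax_form)

# Related texts score higher than unrelated ones.
print(sim_related > sim_unrelated)
```

Cosine distance, which the retrieval settings refer to, is simply `1 - cosine_similarity`, so closer meanings give smaller distances.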

Without embeddings: World book entries activate only on keyword matches. Chat history outside the context window is lost.

With embeddings: World book entries can activate on semantically similar concepts. Past conversation moments can be recalled based on relevance.

Setting Up¶

Open Settings > Embeddings and follow the setup checklist:

1. Enable Embeddings¶

Toggle the master switch on.

2. Select a Provider¶

Provider	Notes
OpenAI	Official OpenAI API (`text-embedding-3-small` recommended)
OpenAI Compatible	Any service implementing the OpenAI embeddings API (local models, self-hosted)
OpenRouter	Aggregation service
ElectronHub	Model aggregator
Nano-GPT	Pay-per-token aggregator

3. Configure the Connection¶

Field	Description
API URL	Base URL for the provider. Auto-appends `/v1/embeddings` if no path is specified.
Embedding Model	Model name (e.g., `text-embedding-3-small`)
API Key	Your provider's authentication key
Dimensions	Vector size — auto-detected when you run a test
Send Dimensions	Whether to include the dimension value in API requests (some providers require it, others reject it)
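As a rough illustration of how these fields could combine into a request, the sketch below resolves the API URL and builds an OpenAI-style embeddings request body. The helper names `resolve_embeddings_url` and `build_request_body` are hypothetical, and treating a bare `/` as "no path" is an assumption; Lumiverse's actual handling may differ.

```python
from urllib.parse import urlsplit, urlunsplit

def resolve_embeddings_url(api_url):
    """Append /v1/embeddings when the base URL has no path (assumed rule)."""
    parts = urlsplit(api_url.strip())
    if parts.path in ("", "/"):
        parts = parts._replace(path="/v1/embeddings")
    return urlunsplit(parts)

def build_request_body(model, texts, dimensions=None, send_dimensions=False):
    """Build the JSON body for an OpenAI-style embeddings request."""
    body = {"model": model, "input": list(texts)}
    # Only include "dimensions" when the provider is known to accept it.
    if send_dimensions and dimensions:
        body["dimensions"] = dimensions
    return body

url = resolve_embeddings_url("https://api.openai.com")
body = build_request_body(
    "text-embedding-3-small", ["The old lighthouse keeper"], 1536, send_dimensions=True
)
print(url)
print(body)
```

Leaving Send Dimensions off simply omits the `dimensions` key, which is the safer choice for providers that reject unknown fields.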

4. Test the API¶

Click Test API to verify your setup. A successful test auto-detects the model's native dimensions and applies them.
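Dimension auto-detection can be pictured as measuring the length of the vector a test request returns. The sketch below assumes an OpenAI-style response shape (`data[i].embedding`); the function name `detect_dimensions` is hypothetical, and the sample vector is shortened to four values for readability.

```python
def detect_dimensions(response):
    """Return the vector length from an OpenAI-style embeddings response."""
    vectors = [item["embedding"] for item in response.get("data", [])]
    if not vectors:
        raise ValueError("response contained no embeddings")
    sizes = {len(vector) for vector in vectors}
    if len(sizes) != 1:
        raise ValueError(f"inconsistent vector lengths: {sorted(sizes)}")
    return sizes.pop()

# Shortened sample; text-embedding-3-small returns 1536 values by default.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.01, -0.02, 0.03, 0.04]}
    ],
    "model": "text-embedding-3-small",
}

detected = detect_dimensions(sample_response)
print(detected)
```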

What Gets Vectorized¶

Enable vectorization for the content types you want:

Content	Setting	What It Does
World Book Entries	`vectorize_world_books`	Enables semantic search for lorebook entries — activates entries by meaning, not just keywords
Chat Messages	`vectorize_chat_messages`	Enables long-term memory — recalls relevant past messages during generation
Chat Documents	`vectorize_chat_documents`	Indexes documents attached to chats

Retrieval Settings¶

Vector Recall Size (Top-K)¶

How many vector matches to retrieve per query. Higher values cast a wider net but use more tokens.

4 — Focused retrieval (default)
8-12 — Broad retrieval for complex stories

Similarity Threshold¶

Maximum cosine distance allowed for a match. Lower values mean stricter matching, because a smaller distance indicates closer meaning.

0 — No filtering (accept all matches)
0.3-0.5 — Moderate filtering
0.8+ — Permissive (also admits loosely related content)

Cosine distance ranges from 0 to 2, and LanceDB reports values above 1.0 for vectors that point in opposing directions, so this setting isn't capped at 1.
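The interaction between Top-K and the threshold can be sketched as follows, assuming each candidate already carries a cosine distance and that a threshold of 0 disables filtering. The function name `select_matches` and the sample data are hypothetical; this is an illustration of the idea, not Lumiverse's retrieval code.

```python
def select_matches(candidates, top_k=4, max_distance=0.0):
    """Keep the top_k nearest candidates, dropping any beyond max_distance.

    candidates: list of (entry_id, cosine_distance) pairs.
    max_distance: 0 means no filtering (assumed convention).
    """
    ranked = sorted(candidates, key=lambda pair: pair[1])  # nearest first
    if max_distance > 0:
        ranked = [pair for pair in ranked if pair[1] <= max_distance]
    return ranked[:top_k]

# Made-up distances for five entries.
candidates = [("a", 0.12), ("b", 0.45), ("c", 0.95), ("d", 1.30), ("e", 0.30)]

filtered = select_matches(candidates, top_k=4, max_distance=0.5)
unfiltered = select_matches(candidates, top_k=4, max_distance=0.0)
print(filtered)    # entries within distance 0.5
print(unfiltered)  # four nearest entries regardless of distance
```

With a threshold set, fewer than Top-K results may come back, which keeps weak matches from consuming context tokens.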

Rerank Cutoff¶

For world book vectors: minimum score required after boost/penalty adjustments. Helps filter out low-quality matches after post-processing.

Hybrid Weight¶

Controls the balance between traditional keyword matching and semantic vector search:

Mode	Behavior
Keyword First	Prioritize exact word matches; use vectors as a tiebreaker
Balanced	Weight both methods equally (recommended)
Vector First	Prioritize semantic similarity; keywords are secondary
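One simple way to picture these modes is a weighted blend of a keyword score and a vector score. The weights below (0.8, 0.5, 0.2) and the names `MODE_WEIGHTS`, `hybrid_score`, and `rank` are assumptions chosen for illustration; the application may combine the two signals differently, for example by using vectors strictly as a tiebreaker in Keyword First mode.

```python
# Hypothetical keyword weights per mode; the vector weight is the remainder.
MODE_WEIGHTS = {
    "keyword_first": 0.8,
    "balanced": 0.5,
    "vector_first": 0.2,
}

def hybrid_score(keyword_score, vector_score, mode="balanced"):
    """Blend two scores in [0, 1], where higher means more relevant."""
    weight = MODE_WEIGHTS[mode]
    return weight * keyword_score + (1 - weight) * vector_score

def rank(entries, mode):
    """Return entry names ordered from most to least relevant."""
    ordered = sorted(
        entries,
        key=lambda e: hybrid_score(e["keyword"], e["vector"], mode),
        reverse=True,
    )
    return [e["name"] for e in ordered]

entries = [
    {"name": "exact_keyword_hit", "keyword": 1.0, "vector": 0.2},
    {"name": "semantic_match", "keyword": 0.0, "vector": 0.9},
]

print(rank(entries, "keyword_first"))
print(rank(entries, "vector_first"))
```

Under these assumed weights, the exact keyword hit ranks first in Keyword First mode and the semantic match ranks first in Vector First mode.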

Batch Processing¶

Setting	Description
Batch Size	Entries per API request during reindexing (1-200, default 50)
Preferred Context Size	Recent messages used to build the search query (default 6)
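Both settings amount to simple slicing, which the sketch below illustrates. The helper names `batches` and `build_query` are hypothetical, and joining messages with newlines is an assumption about how a search query might be assembled.

```python
def batches(items, batch_size=50):
    """Yield consecutive slices of at most batch_size items."""
    if not 1 <= batch_size <= 200:
        raise ValueError("batch_size must be between 1 and 200")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def build_query(messages, context_size=6):
    """Join the most recent context_size messages into one query string."""
    if context_size <= 0:
        return ""
    return "\n".join(messages[-context_size:])

# 120 entries reindexed with the default batch size means three API requests.
batch_lengths = [len(batch) for batch in batches(list(range(120)), 50)]
print(batch_lengths)

messages = [f"m{i}" for i in range(1, 9)]  # m1 .. m8
query = build_query(messages, context_size=6)
print(query.split("\n"))
```

Larger batches mean fewer requests during reindexing, at the cost of bigger payloads per request; a larger context size gives the query more recent conversation to match against.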

Tips¶

Start with OpenAI's small model

`text-embedding-3-small` is cheap, fast, and effective. It's the best starting point for most users.

Enable world book vectorization first

Semantic world book search is the highest-impact use of embeddings. Long-term memory is valuable too, but world book vectorization gives immediate improvement with less configuration.

Test after setup

Always click Test API after configuration. This verifies that your credentials work and auto-detects the correct dimensions. A dimension value that doesn't match the model produces meaningless search results.