Skip to content

Embeddings & Vector Search

Embeddings power two features in Lumiverse: semantic world book activation (finding lorebook entries by meaning, not just keywords) and long-term chat memory (recalling relevant past moments). Both require an embedding provider to be configured.


What Are Embeddings?

An embedding is a numerical representation of text — a list of numbers that captures the meaning of a passage. Similar texts produce similar embeddings. This lets Lumiverse find relevant content based on what it means, not just whether exact keywords match.

Without embeddings: World book entries activate only on keyword matches. Chat history outside the context window is lost.

With embeddings: World book entries can activate on semantically similar concepts. Past conversation moments can be recalled based on relevance.


Setting Up

Open Settings > Embeddings and follow the setup checklist:

1. Enable Embeddings

Toggle the master switch on.

2. Select a Provider

Provider Notes
OpenAI Official OpenAI API (text-embedding-3-small recommended)
OpenAI Compatible Any service implementing the OpenAI embeddings API (local models, self-hosted)
OpenRouter Aggregation service
ElectronHub Model aggregator
Nano-GPT Pay-per-token aggregator

3. Configure the Connection

Field Description
API URL Base URL for the provider. Auto-appends /v1/embeddings if no path is specified.
Embedding Model Model name (e.g., text-embedding-3-small)
API Key Your provider's authentication key
Dimensions Vector size — auto-detected when you run a test
Send Dimensions Whether to include the dimension value in API requests (some providers require it, others reject it)

4. Test the API

Click Test API to verify your setup. A successful test auto-detects the model's native dimensions and applies them.


What Gets Vectorized

Enable vectorization for the content types you want:

Content Setting What It Does
World Book Entries vectorize_world_books Enables semantic search for lorebook entries — activates entries by meaning, not just keywords
Chat Messages vectorize_chat_messages Enables long-term memory — recalls relevant past messages during generation
Chat Documents vectorize_chat_documents Indexes documents attached to chats

Retrieval Settings

Vector Recall Size (Top-K)

How many vector matches to retrieve per query. Higher values cast a wider net but use more tokens.

  • 4 — Focused retrieval (default)
  • 8-12 — Broad retrieval for complex stories

Similarity Threshold

Maximum cosine distance for matches. Lower values = stricter matching.

  • 0 — No filtering (accept all matches)
  • 0.3-0.5 — Moderate filtering
  • 0.8+ — Very strict (only highly similar content)

Cosine distance can exceed 1.0 in LanceDB's implementation, so this isn't capped at 1.

Rerank Cutoff

For world book vectors: minimum score required after boost/penalty adjustments. Helps filter out low-quality matches after post-processing.


Hybrid Weight

Controls the balance between traditional keyword matching and semantic vector search:

Mode Behavior
Keyword First Prioritize exact word matches; use vectors as a tiebreaker
Balanced Weight both methods equally (recommended)
Vector First Prioritize semantic similarity; keywords are secondary

Batch Processing

Setting Description
Batch Size Entries per API request during reindexing (1-200, default 50)
Preferred Context Size Recent messages used to build the search query (default 6)

Tips

Start with OpenAI's small model

text-embedding-3-small is cheap, fast, and effective. It's the best starting point for most users.

Enable world book vectorization first

Semantic world book search is the highest-impact use of embeddings. Long-term memory is valuable too, but world book vectorization gives immediate improvement with less configuration.

Test after setup

Always click Test API after configuration. This verifies your credentials work and auto-detects the correct dimensions — getting dimensions wrong produces garbage results.