Memory Cortex¶
Memory Cortex is an advanced memory layer that sits on top of Long-Term Memory. While basic long-term memory retrieves relevant text chunks by similarity, the cortex understands what happened — tracking characters, relationships, emotional beats, and narrative arcs across your entire conversation.
What Does It Add?¶
| Feature | Long-Term Memory | Memory Cortex |
|---|---|---|
| Retrieve relevant past passages | Yes | Yes |
| Track named characters, locations, factions | - | Yes |
| Map relationships between entities | - | Yes |
| Score narrative importance (salience) | - | Yes |
| Detect emotional tones | - | Yes |
| Summarize story arcs | - | Yes |
| Attribute font colors to characters | - | Yes |
Memory Cortex doesn't replace long-term memory — it enhances it. With the cortex enabled, retrieved memories are ranked using narrative importance, emotional resonance, and entity relevance in addition to raw text similarity.
Getting Started¶
1. Enable the Cortex¶
Open Settings > Memory Cortex and flip the master toggle on.
2. Pick a Preset¶
Three presets configure the cortex for different use cases:
| Preset | Entity Tracking | Salience Scoring | Consolidation | Sidecar LLM | Best For |
|---|---|---|---|---|---|
| Simple | Heuristic | Heuristic | Off | Off | Casual chats, low overhead |
| Standard | Heuristic | Heuristic | On | Off | Most roleplay (recommended) |
| Advanced | Heuristic + LLM | Heuristic + LLM | On | On | Long epics, maximum accuracy |
Start with Standard
Standard gives you entity tracking, salience scoring, and consolidation using zero-cost heuristics — no extra API calls. You can always upgrade to Advanced later.
3. Rebuild Existing Chats¶
If you enable the cortex on a chat that already has history, click Rebuild in the Memory Cortex settings. This processes all existing chunks through the cortex pipeline. New messages are processed automatically going forward.
How It Works¶
Every time a message is sent, the cortex processes the corresponding chunk through several layers:
Salience Scoring¶
Each chunk gets a narrative importance score (0.0 to 1.0) based on:
- Emotional signals — grief, joy, tension, intimacy, betrayal, and more
- Narrative flags — first meetings, deaths, promises, confessions, departures
- Dialogue content — commitments, revelations, emotional declarations
- Character actions — named characters doing emotionally significant things
- Milestone markers — "for the first time", "nothing would be the same"
- Information density — scenes with many proper nouns and new facts
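To make the idea concrete, here is a minimal Python sketch of how keyword heuristics like these could combine into a single 0.0 to 1.0 score. The cue lists, the weights, and the `score_salience` name are all assumptions made for illustration; the cortex's actual rules are not documented here.

```python
# Hypothetical cue lists and weights, chosen only for illustration.
SIGNALS = {
    "emotion": (0.25, ["grief", "joy", "betray", "wept"]),
    "narrative_flag": (0.35, ["died", "promise", "confess", "farewell"]),
    "milestone": (0.30, ["for the first time", "nothing would be the same"]),
}


def score_salience(chunk: str) -> float:
    """Return an illustrative narrative-importance score in [0.0, 1.0]."""
    text = chunk.lower()
    score = 0.0
    for weight, cues in SIGNALS.values():
        # Each signal family contributes at most once per chunk.
        if any(cue in text for cue in cues):
            score += weight
    # Information density: capitalised words that do not start a sentence.
    words = chunk.split()
    proper_nouns = [
        w for prev, w in zip(words, words[1:])
        if w[0].isupper() and not prev.endswith((".", "!", "?"))
    ]
    score += min(0.10, 0.02 * len(proper_nouns))
    return min(1.0, round(score, 3))
```

A chunk that hits several signal families at once scores near the top of the range, while plain connective narration scores at or near zero.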
High-salience memories resist decay over time. Pivotal moments (score above 0.7 or carrying narrative flags like death or promise) are protected as core memories — they decay 5x slower and never drop below a 0.5 retrieval floor.
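The protection rules above can be sketched as a small decay function. This is an assumption-laden illustration: it reads "decay 5x slower" as a half-life five times longer, uses a plain exponential half-life curve, and the `retrieval_weight` name and its parameters are hypothetical.

```python
def retrieval_weight(age_turns, half_life, salience, flags=(),
                     core_threshold=0.7, core_flags=("death", "promise")):
    """Illustrative recency weight for a memory that is `age_turns` old."""
    # A memory is "core" if it is salient enough or carries a protected flag.
    is_core = salience > core_threshold or any(f in core_flags for f in flags)
    # Assumption: "5x slower" means the half-life is stretched five times.
    effective_half_life = half_life * 5 if is_core else half_life
    weight = 0.5 ** (age_turns / effective_half_life)
    # Core memories never fall below the 0.5 retrieval floor.
    return max(weight, 0.5) if is_core else weight
```

With a half-life of 50 turns, an ordinary memory that is 100 turns old drops to 0.25, while a core memory of the same age stays at roughly 0.76 and can never fall below 0.5.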
Entity Tracking¶
The cortex extracts and tracks named entities from your chat:
- Characters — detected by verb adjacency ("Melina sighed"), dialogue attribution, interaction patterns
- Locations — detected by suffixes ("Sixth Street"), locative phrases ("arrived at Dustwell")
- Factions — detected by collective nouns ("Sons of Calydon"), business suffixes ("PubSec")
- Items — detected by weapon/vehicle verbs ("wielding the Starblade")
Each entity accumulates facts, emotional associations, and a salience profile over time. The entity graph handles aliases automatically — if a character named "Pulchra Fellini" is sometimes called "Pulchra" or "Pul", those references are resolved to the same entity.
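As a rough picture of alias resolution, the sketch below keeps a lookup table from every known label to one canonical entity name. It only covers the lookup step; how the cortex discovers aliases on its own is not shown, and the `EntityGraph` class and its methods are hypothetical names.

```python
class EntityGraph:
    """Illustrative alias table: many labels, one canonical entity."""

    def __init__(self):
        self._canonical = {}  # lowercase label -> canonical name
        self.mentions = {}    # canonical name -> mention count

    def add_entity(self, name, aliases=()):
        for label in (name, *aliases):
            self._canonical[label.lower()] = name
        self.mentions.setdefault(name, 0)

    def resolve(self, label):
        """Return the canonical name, or None for an unknown label."""
        return self._canonical.get(label.lower())

    def record_mention(self, label):
        name = self.resolve(label)
        if name is not None:
            self.mentions[name] += 1
        return name
```

Mentions of "Pulchra Fellini", "Pulchra", and "Pul" would then all count toward the same entity rather than creating three separate ones.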
You can browse entities in the Memory panel (sidebar > Memory > Entities tab). Delete any that were incorrectly extracted.
Relationship Mapping¶
When two named entities appear in the same chunk, the cortex analyzes their interaction:
- Verb-mediated — "Melina protected Caesar" (ally, positive sentiment)
- Relational nouns — "Melina's brother" near "Caesar" (sibling)
- Coordinated action — "Melina and Caesar fought together" (ally)
- Terms of address — endearments or hostile language in dialogue
- Physical proximity — two characters described near each other
Relationships are reinforced each time they're observed, building a strength score over time.
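One simple way to model reinforcement is a saturating update, where each observation closes part of the remaining gap to 1.0. The sketch below is an assumption, not the cortex's formula: the `RelationStore` name, the step size, and the choice to treat edges as undirected are all illustrative.

```python
from collections import defaultdict


class RelationStore:
    """Illustrative relationship edges with a strength in [0.0, 1.0)."""

    def __init__(self, step=0.2):
        self.step = step
        self.strength = defaultdict(float)

    def observe(self, a, b, kind):
        # Assumption: edges are undirected, so (a, b) and (b, a) share a key.
        key = (*sorted((a, b)), kind)
        current = self.strength[key]
        # Each observation closes a fixed fraction of the gap to 1.0.
        self.strength[key] = current + self.step * (1.0 - current)
        return self.strength[key]
```

Under this model the first sighting of an ally relationship gives a strength of 0.2, the second 0.36, and repeated sightings approach but never reach 1.0.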
Consolidation¶
As your chat grows, older chunks are compressed into summaries:
- Scene summaries (Tier 1) — groups of chunks consolidated into a single paragraph capturing key events
- Story arcs (Tier 2) — groups of scene summaries compressed into high-level arc descriptions
Consolidation triggers automatically when enough unconsolidated chunks accumulate (configurable threshold). This keeps the memory footprint bounded while preserving narrative continuity.
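The threshold trigger can be sketched in a few lines of Python. The chunk dictionary shape, the `consolidate_if_needed` name, and the injected `summarize` callable are assumptions for illustration; only the "consolidate once enough unconsolidated chunks accumulate" behavior comes from the description above.

```python
def consolidate_if_needed(chunks, threshold, summarize):
    """Fold pending chunks into one scene summary once the threshold is met."""
    pending = [c for c in chunks if not c["consolidated"]]
    if len(pending) < threshold:
        return None  # not enough unconsolidated material yet
    summary = summarize([c["text"] for c in pending])
    for chunk in pending:
        chunk["consolidated"] = True
    return {"tier": 1, "summary": summary, "source_count": len(pending)}
```

The same pattern could be applied a second time to scene summaries to produce Tier 2 story arcs.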
You can view consolidations in the Memory panel (sidebar > Memory > Stats > Consolidations).
Emotional Recall¶
When you generate a new message, the cortex analyzes the emotional tone of recent messages and boosts retrieval of memories with matching emotions. A sad scene naturally surfaces memories of past grief and loss. A tense confrontation recalls previous conflicts.
This "Proustian recall" works alongside semantic similarity — memories that are both topically relevant and emotionally resonant score highest.
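A minimal sketch of how the two signals might be blended is shown below. The weighting, the use of Jaccard overlap between emotion tags, and the `blended_score` name are assumptions; the cortex may combine them differently.

```python
def blended_score(similarity, memory_emotions, scene_emotions, emotion_weight=0.3):
    """Blend semantic similarity with emotional overlap (both in [0, 1])."""
    mem, scene = set(memory_emotions), set(scene_emotions)
    union = mem | scene
    # Jaccard overlap: shared emotion tags divided by all tags seen.
    overlap = len(mem & scene) / len(union) if union else 0.0
    return (1 - emotion_weight) * similarity + emotion_weight * overlap
```

Two memories with the same text similarity then rank differently: the one that shares the current scene's grief scores higher than one with an unrelated tone.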
Sidecar LLM (Tier 2)¶
For maximum accuracy, you can assign a secondary LLM connection to assist the cortex. This sidecar model handles:
- Deeper entity extraction — catches entities the heuristic misses
- Better relationship detection — understands implied relationships
- Calibrated salience scoring — judges narrative importance by consequence, not just keyword presence
- Font color attribution — identifies which character owns each HTML color tag
- Key fact extraction — pulls concrete, memorable facts from each passage
- Generative consolidation — produces coherent narrative summaries instead of sentence extractions
Setting Up a Sidecar¶
1. In Memory Cortex settings, select a Connection Profile under the Sidecar section.
2. Choose a Model. Smaller, faster models work well here, because the sidecar doesn't need to be creative.
3. Adjust Temperature (0.1 recommended for factual extraction).
4. Set Parallel Requests to control how many concurrent LLM calls run during a rebuild.
Sidecar Costs
The sidecar makes one LLM call per chunk, both during live chat and during rebuilds. Rebuilding a chat with 200 chunks therefore makes 200 API calls. Choose an inexpensive model for the sidecar to keep costs reasonable.
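To show what a Parallel Requests limit means in practice, here is a sketch of a rebuild loop that issues one call per chunk while capping how many run at once. It uses Python's standard `asyncio.Semaphore`; the `rebuild` function and the `call_sidecar` callable are hypothetical stand-ins, not the application's real API.

```python
import asyncio


async def rebuild(chunks, call_sidecar, parallel_requests=4):
    """Process every chunk once, with at most `parallel_requests` in flight."""
    gate = asyncio.Semaphore(parallel_requests)

    async def process(chunk):
        async with gate:  # waits here while the limit is reached
            return await call_sidecar(chunk)

    # One sidecar call per chunk; results come back in chunk order.
    return await asyncio.gather(*(process(c) for c in chunks))
```

A higher limit finishes a rebuild sooner but is more likely to hit provider rate limits; the total number of calls stays the same either way.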
The sidecar results are merged with heuristic results — the heuristic always runs as a baseline, and the LLM supplements it. If the sidecar call fails for any reason, the heuristic result is used as a fallback.
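The baseline-plus-supplement behavior can be sketched as follows. The result shape, the merge policy (union of entities, sidecar salience preferred), and the `extract_with_fallback` name are assumptions; the "heuristic" and "sidecar" source labels mirror the ones shown in the Stats tab.

```python
def extract_with_fallback(chunk, heuristic, sidecar=None):
    """Run the heuristic first, then let the sidecar supplement it if it can."""
    result = dict(heuristic(chunk))  # the baseline always runs
    result["source"] = "heuristic"
    if sidecar is None:
        return result
    try:
        extra = sidecar(chunk)
    except Exception:
        return result  # any sidecar failure falls back to the heuristic
    # Assumed merge policy: union the entities, prefer the sidecar's salience.
    merged = set(result["entities"]) | set(extra.get("entities", []))
    result["entities"] = sorted(merged)
    result["salience"] = extra.get("salience", result["salience"])
    result["source"] = "sidecar"
    return result
```

Because the heuristic result is computed before the sidecar is called, a timeout or malformed response costs nothing beyond the missing refinement.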
Memory Panel¶
The sidebar's Memory tab gives you a live view of the cortex data for the current chat:
Entities Tab¶
Browse all tracked entities — characters, locations, items, factions. Each entity card shows:
- Type and status (active, inactive, deceased)
- Mention count and salience average
- Description (auto-populated from first appearance)
- Known facts
- Emotional profile (top emotional associations)
- Aliases
You can delete incorrectly extracted entities directly from this panel.
Colors Tab¶
Shows font color attributions — which hex color belongs to which character, with confidence scores. Useful for chats where characters use distinct colors for speech, thought, or narration.
Stats Tab¶
Overview of the cortex data:
- Memory chunks (total and vectorized)
- Entities (active and archived)
- Relations between entities
- Consolidations (scene summaries and arcs)
- Salience records
Click any stat card to drill down into the raw records.
Macros¶
Memory Cortex data is available in your presets through macros. Add these via Add Prompt > Memory Cortex in the preset editor:
| Macro | Returns |
|---|---|
| {{entities}} | Active entity snapshots with facts and relationships |
| {{entityFacts::Name}} | Facts about a specific entity (e.g., {{entityFacts::Melina}}) |
| {{relationships}} | Active relationship edges between entities |
| {{arc}} | Current narrative arc summary |
| {{memorySalience}} | The single highest-importance memory from retrieval |
| {{cortexActive}} | "yes" or "no" for conditional blocks |
| {{entityCount}} | Number of entities in the current context |
| {{characterColors}} | Character speech / thought / narration color guidance derived from cortex state |
The standard memory macros ({{memories}}, {{memoriesRaw}}, etc.) continue to work alongside cortex macros. When the cortex is enabled, {{memories}} returns cortex-enhanced results formatted in shadow-prompt style.
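For readers curious how macros of the form {{name}} and {{name::Argument}} can be expanded in general, the sketch below does it with a regular expression. It is a generic illustration, not the application's macro engine: the `expand` function, the resolver table, and the sample fact text are all hypothetical.

```python
import re

# Matches {{name}} and {{name::argument}}.
MACRO = re.compile(r"\{\{(\w+)(?:::([^}]+))?\}\}")


def expand(template, resolvers):
    """Replace each known macro with the text its resolver returns."""
    def replace(match):
        name, arg = match.group(1), match.group(2)
        resolver = resolvers.get(name)
        if resolver is None:
            return match.group(0)  # leave unknown macros untouched
        return resolver(arg)
    return MACRO.sub(replace, template)
```

Given resolvers for `cortexActive` and `entityFacts`, a template such as "Cortex: {{cortexActive}}. {{entityFacts::Melina}}" would have both placeholders filled in, while any unrecognized macro is passed through unchanged.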
Configuration Reference¶
Formatter Mode¶
Controls how retrieved memories are formatted for the prompt:
| Mode | Style |
|---|---|
| Shadow | Prose-register context with "do not recite" instructions (default) |
| Attributed | Each memory labeled with source and salience |
| Clinical | Bullet-point factual summaries |
| Minimal | Raw content, minimal formatting |
Decay Settings¶
Memories lose relevance over time through a decay function:
| Setting | Description |
|---|---|
| Half-Life (turns) | After this many messages, a memory's recency score halves |
| Reinforcement Weight | How much retrieval boosts a memory's score (prevents useful memories from decaying) |
| Core Memory Threshold | Salience score above which a memory becomes a protected "core memory" |
| Core Memory Flags | Narrative flags that automatically mark a memory as core (e.g., death, promise) |
Entity Pruning¶
Keeps the entity graph bounded:
| Setting | Description |
|---|---|
| Enabled | Toggle automatic pruning |
| Stale After (messages) | Entities not seen for this many messages get archived |
| Min Confidence | Minimum extraction confidence to create new entities |
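The two numeric settings can be pictured with a short sketch. The entity dictionary shape and the `should_create` and `prune` names are assumptions for illustration; whether the real comparison is strict or inclusive is not documented here.

```python
def should_create(confidence, min_confidence):
    """Only create a new entity when extraction confidence is high enough."""
    return confidence >= min_confidence


def prune(entities, current_message, stale_after):
    """Archive active entities not seen for more than `stale_after` messages."""
    archived = []
    for entity in entities:
        unseen_for = current_message - entity["last_seen"]
        if entity["status"] == "active" and unseen_for > stale_after:
            entity["status"] = "archived"
            archived.append(entity["name"])
    return archived
```

Archiving rather than deleting keeps the entity's history available, which matches the Stats tab's separate counts for active and archived entities.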
Protected Terms¶
The Protected Terms list is an entity whitelist: proper nouns you add to it are always recognized as entities, even if they look like common words. This is useful for fantasy names that would otherwise be filtered out.
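As a sketch of why a whitelist helps, the function below applies a common-word filter but lets protected terms bypass it. The filter word list and the `is_entity_candidate` name are hypothetical; the cortex's real filtering rules are not documented here.

```python
# Hypothetical filter: capitalised words that are usually not names.
COMMON_WORDS = {"hope", "ash", "will", "grace"}


def is_entity_candidate(token, protected_terms=frozenset()):
    """Decide whether a capitalised token should be tracked as an entity."""
    if token in protected_terms:
        return True  # protected terms always pass the filter
    return token[:1].isupper() and token.lower() not in COMMON_WORDS
```

Without protection, a character named "Ash" would be discarded as a common word; adding "Ash" to the protected set makes it pass.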
Tips¶
Rebuild after changing settings
If you change chunking parameters or enable the sidecar, click Rebuild to reprocess all chunks. The cortex detects stale data automatically on the next generation, but a manual rebuild brings everything up to date immediately.
Delete bad entities early
If you spot an incorrectly extracted entity (like a common word being tracked as a character), delete it from the Memory panel. This prevents it from accumulating relationships and facts that pollute the graph.
Pair with Loom Summary
The cortex excels at granular recall (specific moments, entity facts, relationships). Loom Summary provides a structured overview of the whole story. Use both for comprehensive long-term coherence.
Sidecar model selection
For the sidecar, prioritize speed and cost over creativity. Models like Gemini Flash, Haiku, or GPT-4o-mini work well — the sidecar does structured extraction, not creative writing.
Check the Stats tab
The Stats tab shows whether salience records are sourced from "heuristic" or "sidecar". After a rebuild with a sidecar configured, you should see "sidecar" entries. If everything still shows "heuristic", check that your sidecar connection is configured correctly.