Memory System

Persistent Memory · Per-Character · Full-Text Search

Overview

TokForge remembers things about you across conversations. You don't have to repeat yourself: your AI remembers your preferences, goals, constraints, and the knowledge you share with it. All memory stays on your device; nothing is sent to the cloud. The system automatically extracts new memories from conversations, deduplicates similar facts, and uses semantic embeddings to find relevant memories even when you phrase things differently.

The memory system stores three types of information per character:

Memory Facts

Individual pieces of information like "prefers formal language", "is a software engineer", or "lives in Austin". Facts can be manually added by you, automatically extracted from conversations, or imported. Important facts are always included in the AI's context window, so they're never forgotten.

Knowledge Graph

Structured relationships between entities. When you say "I like coffee", TokForge creates a connection: You → prefers → coffee. These relationships build up over time into a web of knowledge about you. The more you mention something, the stronger the connection becomes.

Reference Documents

Import text documents as background knowledge for a character. This is useful for giving a character context about a D&D campaign setting, a project brief, your personal notes, or anything else that should be part of their understanding of you.

Memory Facts

Facts are the core of TokForge's memory. They're individual snippets of information about you, your preferences, and your context.

What Gets Stored

Examples of facts TokForge might learn or store:

  • Preferences: "prefers coffee over tea", "likes structured agendas"
  • Identity: "is a software engineer", "lives in Austin, Texas"
  • Goals: "learning Japanese", "wants to build a side project"
  • Constraints: "avoids dairy", "cannot work past 6pm"
  • Relationships: "close friends with Alex", "works with Sarah"
  • Issues: "experienced a bug with model loading", "having trouble with authentication"

How Facts Are Created

Automatic extraction: When you chat naturally with a character, TokForge listens for memorable information and extracts facts automatically. No work required.

Manual addition: Open the Memory screen and add facts directly via the "+" button. These facts are automatically pinned so they won't be removed.

Import: Bring in facts from external sources if needed.

Deduplication

TokForge automatically detects and removes duplicate facts to keep your memory clean and focused. The system uses multi-layer deduplication: keyword-based matching catches obvious duplicates, while semantic similarity (using MiniLM embeddings) catches subtle near-duplicates like "enjoys hiking" vs. "likes mountain trails". This prevents memory bloat from redundant information.
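
As a rough illustration of the two layers described above, the sketch below checks a cheap keyword overlap first and falls back to cosine similarity between embedding vectors. It is not TokForge's implementation: the function names, the Jaccard measure, and both thresholds are assumptions chosen for the example, and the vectors are toy values standing in for MiniLM embeddings.

```python
import math
import re
from typing import Sequence


def tokens(text: str) -> set:
    """Lowercase word tokens used by the keyword layer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of tokens the two texts have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def is_duplicate(
    new_text: str,
    new_vec: Sequence[float],
    old_text: str,
    old_vec: Sequence[float],
    keyword_threshold: float = 0.8,   # assumed value, not from TokForge
    semantic_threshold: float = 0.9,  # assumed value, not from TokForge
) -> bool:
    """Layer 1: keyword overlap. Layer 2: embedding similarity."""
    if jaccard(new_text, old_text) >= keyword_threshold:
        return True
    return cosine(new_vec, old_vec) >= semantic_threshold
```

With these assumed thresholds, "Prefers coffee over tea" and "prefers coffee over tea." are caught by the keyword layer, while a pair like "enjoys hiking" and "likes mountain trails" shares no words and is only caught if the two embeddings are close.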

Importance Scoring

TokForge automatically scores each fact by importance on a 0-10 scale. Facts about critical preferences, constraints, or goals score higher, and the most important facts are always kept in the AI's context window so they're never forgotten. Less important facts that haven't been used in a while are cleaned up automatically after 90 days; pin any you want to protect.

Managing Facts

  • Create facts: Manually add facts for a character via the Memory screen or API
  • Pin facts: Protect important facts from auto-cleanup. Pinned facts always stay in memory.
  • Archive facts: Hide old facts instead of deleting them. Archived facts won't be used in conversations but are kept for your records.
  • Search facts: Use full-text search to find any fact by keyword.
  • Edit facts: Update fact wording, importance score, or category anytime.
  • Hard delete facts: Permanently remove facts you no longer need.

Knowledge Graph

The knowledge graph is a structured map of relationships extracted from your facts and conversations. It captures who knows whom, what you prefer, where you live, what you own, and more.

How It Works

When you tell a character something like:

  • "I live in New York" → You → lives_in → New York
  • "I'm a software engineer" → You → works_as → software engineer
  • "I like coffee" → You → prefers → coffee
  • "Sarah is my colleague" → You → works_with → Sarah

These relationships strengthen over time. If you mention coffee five times, the "prefers coffee" connection gets stronger. The AI sees these patterns and uses them to better understand your preferences and context.
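
One way to picture this is a store of (subject, relation, object) triples with a counter per triple. The sketch below is only a mental model under that assumption; the class name, the methods, and the idea that strength is a plain mention count are hypothetical, not TokForge's internal design.

```python
from collections import Counter


class KnowledgeGraph:
    """Toy triple store: each repeated mention strengthens an edge."""

    def __init__(self):
        # Maps (subject, relation, object) to the number of mentions seen.
        self._edges = Counter()

    def observe(self, subject: str, relation: str, obj: str) -> int:
        """Record one mention of a relationship and return its new strength."""
        key = (subject, relation, obj.lower())
        self._edges[key] += 1
        return self._edges[key]

    def strength(self, subject: str, relation: str, obj: str) -> int:
        """Return how many times this relationship has been observed."""
        return self._edges[(subject, relation, obj.lower())]

    def edges_for(self, subject: str) -> list:
        """All edges for a subject, strongest first."""
        matches = [(k, w) for k, w in self._edges.items() if k[0] == subject]
        return sorted(matches, key=lambda kv: kv[1], reverse=True)


graph = KnowledgeGraph()
for _ in range(5):
    graph.observe("You", "prefers", "coffee")
graph.observe("You", "lives_in", "New York")
```

After the five mentions, the "prefers coffee" edge is the strongest one for "You", which is the kind of signal retrieval could use to rank related facts.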

Viewing the Knowledge Graph

The Memory screen includes a Knowledge tab where you can:

  • Browse relationships in list view: See all "You prefers X", "You lives in Y", etc., organized by subject and entity
  • Delete relationships: Remove individual edges from the knowledge graph via the API or UI
  • Explore connections: See which relationships boost fact relevance during retrieval

Document Import (RAG)

Import text documents as reference material for a character. Documents are automatically chunked, embedded, and indexed for instant semantic retrieval during conversations. This is a form of Retrieval-Augmented Generation (RAG) — the AI retrieves relevant sections from your documents and uses them as context.

Use Cases

  • D&D campaigns: Import campaign setting descriptions, lore, NPC backstories, world rules
  • Project briefs: Share project requirements, design docs, architecture notes, specifications
  • Personal knowledge: Import family history, travel notes, personal reference materials
  • Technical docs: Add API documentation, code libraries, software guides, tutorials

How to Import

  1. Open the Memory screen for a character
  2. Go to the Documents tab
  3. Tap "Import Document" and select a text file
  4. The document is automatically chunked into searchable segments and embedded for semantic retrieval
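
To make the chunking step concrete, here is a minimal sketch of splitting a document into overlapping character windows. TokForge's actual chunking strategy and sizes are not documented here, so `chunk_text`, the 500-character window, and the 50-character overlap are all assumptions for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    """Split text into fixed-size windows that overlap by `overlap` characters.

    chunk_size and overlap are illustrative defaults, not TokForge settings.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():  # skip windows that contain only whitespace
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice. Each chunk would then be embedded and indexed separately.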

How Documents Are Retrieved

When you chat, TokForge automatically searches your imported documents for relevant content using the same hybrid retrieval approach as memory facts. Matching document chunks are included in the AI's context, giving it access to your reference material without needing you to manually paste it into each conversation.

Managing Documents

  • View document title, chunk count, and total characters
  • Delete documents anytime without affecting other memory
  • Each character has its own document library — documents don't appear in other characters' conversations

Memory Screen UI

The dedicated Memory screen is where you browse, search, and manage all memory for a character.

Three Tabs

Facts Tab: Browse all your memory facts. Pinned facts appear at the top. Search by keyword, add new facts with the floating "+" button, or tap a fact to edit it. Long-press any fact to toggle pin status. Archive old facts when you're done with them.

Knowledge Tab: View the relationship graph. Switch between list view (organized by subject/entity) and graph view (visual network). In graph view, tap nodes to see connected relationships. All relationships are automatically extracted from facts and conversations.

Documents Tab: Manage imported reference documents. See title, chunk count, and size. Delete documents individually when no longer needed.

Character Selection

At the top of the Memory screen, select which character's memory you want to view. Each character has its own separate memory space — facts, knowledge graphs, and documents are completely isolated.

Search

Use the search field to find facts by keyword. Search uses a full-text index across all your facts, so results appear instantly. It is case-insensitive and matches on any part of the fact text.

How Memory Works in Chat

Memory is automatic. When you chat, TokForge is always working in the background:

  1. Relevant memories are retrieved: When you send a message, TokForge automatically finds memories most relevant to your conversation using a blend of keyword search and semantic similarity
  2. Memories are included in context: The AI sees your top facts, knowledge relationships, and relevant documents in its context window
  3. Conversations feel continuous: Instead of starting fresh each time, it feels like you're picking up where you left off. Multi-turn conversations also load faster, because TokForge processes only new messages instead of replaying your entire history
  4. New memories are extracted: As you chat, TokForge continuously listens for new memorable information to store

All of this happens locally on your device. No memory data ever leaves your phone or tablet.

Hybrid Retrieval

TokForge uses a hybrid retrieval system that combines full-text keyword search with meaning-based semantic similarity to find the most relevant memories for your conversation.

How It Works

Instead of relying on keywords alone, TokForge blends two approaches:

  • Full-text keyword search: Uses FTS4 indexing to instantly find facts containing the exact words you're talking about
  • Semantic similarity search: Uses the MiniLM embedding model to find facts that are conceptually related to your message, even if they don't share the same words

Knowledge graph relationships provide an additional relevance boost when entities mentioned in your message are found in the graph.

For example, if you ask "Can I have coffee at night?", the system finds facts containing "coffee" and also related facts like "prefers to sleep early" or "caffeine sensitive", which are semantically relevant to your question.

Automatic: This hybrid approach works seamlessly in the background. The most relevant facts (regardless of whether they matched via keywords or semantic similarity) are always selected for your AI's context, resulting in more accurate and contextually appropriate responses.
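
The blend described in this section can be sketched as a weighted sum of a keyword score and a semantic score, plus a small bonus when a graph entity from the message also appears in the fact. This is a simplified model, not TokForge's scoring: the 50/50 weighting, the boost size, and the function names are assumptions, the keyword score is a plain token overlap rather than an FTS4 query, and the vectors are toy values standing in for MiniLM embeddings.

```python
import math
import re
from typing import Iterable, Sequence, Tuple


def tokens(text: str) -> set:
    """Lowercase word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_facts(
    query: str,
    query_vec: Sequence[float],
    facts: Iterable[Tuple[str, Sequence[float]]],
    graph_entities: Iterable[str] = (),
    keyword_weight: float = 0.5,  # assumed blend, not a TokForge setting
    graph_boost: float = 0.1,     # assumed bonus, not a TokForge setting
    limit: int = 10,
) -> list:
    """Return up to `limit` fact texts, most relevant first."""
    q_tokens = tokens(query)
    # Graph entities (assumed lowercase, single-word) named in the message.
    mentioned = q_tokens & set(graph_entities)

    scored = []
    for text, vec in facts:
        f_tokens = tokens(text)
        keyword = len(q_tokens & f_tokens) / len(q_tokens) if q_tokens else 0.0
        score = keyword_weight * keyword + (1 - keyword_weight) * cosine(query_vec, vec)
        if mentioned & f_tokens:
            score += graph_boost
        scored.append((score, text))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:limit]]
```

For the coffee question above, a fact such as "caffeine sensitive" shares no words with the message but can still rank highly on embedding similarity alone, while "prefers coffee over tea" gains from both the keyword match and the graph boost.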

Background Memory Extraction ("Reflect")

TokForge can automatically extract facts, summaries, and character impressions from your conversations while your app is idle, without requiring you to do anything.

How It Works

When your device is idle and has sufficient battery, TokForge processes your recent conversations in the background, extracting memorable facts, conversation summaries, and updated character impressions. This "reflection" process happens quietly without impacting your device's performance or battery life. Reflection relies on the MiniLM embedding model to understand meaning; the model is downloaded automatically on first use.

Model Requirements

Background reflection requires a model with at least 1.7B parameters. Smaller models may not extract meaningful facts reliably, so reflection is automatically skipped for them. This keeps the quality of extracted information high.

Reflected Facts

Facts extracted through background reflection are marked with a "Reflected" badge in the Memory screen, so you can see which facts came from this automated process. You can edit, delete, or archive these facts just like any other fact. Reflected facts also include conversation summaries that capture the essence of each conversation for temporal context.

Enable/Disable Reflection

You control whether background extraction is enabled. Go to Settings → Optimize and toggle "Background Memory Extraction" on or off. By default it's enabled to help you build richer memory without extra effort.

Battery & Device Constraints

Background reflection only runs when your device is idle and has adequate battery (20% or higher). This ensures the process doesn't drain your battery or interfere with your active work.
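
Taken together, the conditions in this section amount to a simple gate. The sketch below combines the documented thresholds (idle device, battery at 20% or higher, a model of at least 1.7B parameters, and the setting turned on); the `DeviceState` type and `should_reflect` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

MIN_BATTERY_PERCENT = 20         # "20% or higher" from the docs
MIN_MODEL_PARAMS_BILLIONS = 1.7  # minimum model size from the docs


@dataclass
class DeviceState:
    """Hypothetical snapshot of the device conditions that matter here."""
    is_idle: bool
    battery_percent: int


def should_reflect(
    device: DeviceState,
    model_params_billions: float,
    enabled: bool = True,  # "Background Memory Extraction" toggle, on by default
) -> bool:
    """Return True only when every documented condition is met."""
    return (
        enabled
        and device.is_idle
        and device.battery_percent >= MIN_BATTERY_PERCENT
        and model_params_billions >= MIN_MODEL_PARAMS_BILLIONS
    )
```

For example, an idle device at 35% battery with a 3B model would pass the gate, while the same device with a 1B model, or at 15% battery, would not.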

Character Impressions

As you chat with a character over time, it forms an evolving impression of who you are and what you care about. These impressions are displayed on the character's detail screen, giving you insight into how the character understands you.

What Are Impressions?

Impressions are the character's synthesized understanding of you based on all your conversations and memory facts. They go beyond individual facts to capture broader patterns and themes — how the character "views" you through their lens. For example:

  • "I think you value clear communication and structure"
  • "I see you're interested in technology and problem-solving"
  • "You seem to prefer efficient, direct interactions"
  • "I notice you're creative and enjoy exploring new ideas"

How They Update

Impressions are periodically updated based on your conversations, allowing the character's perception of you to evolve over time. The more you interact with a character, the more refined and accurate their impressions become. This creates a deeper, more personalized relationship as the character learns what matters to you.

View Impressions

Open any character's detail screen to see their current impression of you. These impressions help explain why the character responds to you the way they do, because they directly reflect what the character has learned about you through your interactions.

Managing Your Memory

Configurable Limits

You can customize how much memory is injected into conversations. Go to Settings and adjust the "Max Memory Facts" slider (0-30, default 10). This controls the maximum number of facts included in the AI's context for each turn.

Pinning Facts

Pin the facts that matter most. Pinned facts:

  • Are always included in the AI's context
  • Never get auto-deleted during cleanup
  • Appear at the top of the Facts list

Manually added facts are automatically pinned by default.

Archiving Facts

Archive facts instead of deleting them. Archived facts:

  • Won't be used in conversations
  • Are hidden from the main Facts list
  • Are kept for your records
  • Can be unarchived anytime

Automatic Cleanup

TokForge automatically cleans up stale, low-importance facts that are older than 90 days. Cleanup protects you from:

  • Memory bloat from forgotten snippets
  • Outdated information cluttering context
  • Inconsistent AI behavior from conflicting facts

Protected facts never get deleted: Pinned facts and high-importance facts are always protected, no matter how old they are.
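
The cleanup rule can be sketched as a filter over facts. The 90-day window and the protection of pinned and high-importance facts come from this page; the `Fact` type, measuring age from last use, and the cutoff of 7 for "high importance" are assumptions, since the exact threshold is not stated.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=90)  # from the docs
HIGH_IMPORTANCE = 7               # assumed cutoff on the 0-10 scale


@dataclass
class Fact:
    """Hypothetical fact record used only for this example."""
    text: str
    importance: int       # 0-10
    last_used: datetime   # assumption: staleness is measured from last use
    pinned: bool = False


def is_stale(fact: Fact, now: datetime) -> bool:
    """A fact is stale only if it is unprotected and unused for over 90 days."""
    if fact.pinned or fact.importance >= HIGH_IMPORTANCE:
        return False
    return now - fact.last_used > STALE_AFTER


def cleanup(facts: list, now: datetime) -> list:
    """Return the facts that survive cleanup."""
    return [fact for fact in facts if not is_stale(fact, now)]
```

Under these assumptions, an unpinned fact with importance 3 that was last used 120 days ago is removed, while the same fact survives if it is pinned or scored 9.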

Per-Character Isolation

Each character has completely separate memory. You can have very different personas with different facts, preferences, and knowledge. Switch characters in the Memory screen to manage each one independently.

Privacy & Storage

Local Storage

All memory data — facts, knowledge edges, documents — is stored locally in TokForge's encrypted database on your device. Nothing is sent to TokForge servers or the cloud.

Persistence

Memory survives app restarts, device reboots, and everything in between. Once a fact is stored, it stays in memory until you delete it or automatic cleanup removes it.

Data Control

You have complete control over your memory:

  • Delete individual facts anytime
  • Clear all memory for a character at once
  • Export your memory data for backup or analysis
  • Archive instead of delete if you want to keep records

No Cloud Sync

Memory is entirely device-local. There's no cloud sync, cloud backup, or any transmission of memory data outside your device. If you switch devices, you'll need to rebuild memory on the new device, but your original device data remains untouched.

API Access

For advanced users and integrations, memory can be managed programmatically via REST endpoints. TokForge v3.4.7 provides 13+ dedicated memory API endpoints covering facts, knowledge graphs, documents, search, and statistics. Document RAG, introduced in v3.4.0, adds RAPTOR tree summarization, BGE-small embeddings, and semantic chunk search for attached documents.

Available Operations

Memory Facts (CRUD):

  • Create facts for a character
  • Retrieve all facts for a character
  • Search facts by keyword or relevance
  • Update fact content, importance, or category
  • Pin and unpin facts
  • Archive and unarchive facts
  • Delete facts individually or in bulk

Knowledge Graph:

  • Retrieve all knowledge edges (relationship triples) for a character
  • Delete individual edges from the knowledge graph

Reference Documents:

  • Import and list documents for a character
  • Delete documents

Search & Retrieval:

  • Trigger hybrid memory retrieval for a user message

Statistics & Maintenance:

  • Get memory statistics (fact count, storage usage)
  • Trigger manual stale fact cleanup

Authentication

All memory API endpoints require Bearer token authentication. Get your token from Settings → Advanced → Metrics Server → Metrics Auth Token.

Example: list all facts for a character via the API:
curl -H "Authorization: Bearer YOUR_TOKEN" \
     "http://localhost:8088/memory/facts?character_id=1"
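
The same request can be made from a script. The sketch below uses only Python's standard library and mirrors the curl example: the host, port, path, and `character_id` parameter are taken from that example, while the helper names are hypothetical. The response format is not described on this page, so the body is returned as raw text rather than parsed.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8088"  # host and port from the curl example


def build_list_facts_request(token: str, character_id: int) -> urllib.request.Request:
    """Build a GET request for /memory/facts with Bearer authentication."""
    query = urllib.parse.urlencode({"character_id": character_id})
    return urllib.request.Request(
        f"{BASE_URL}/memory/facts?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


def list_facts_raw(token: str, character_id: int) -> str:
    """Send the request and return the response body as text.

    The response schema is not documented here, so no parsing is attempted.
    """
    request = build_list_facts_request(token, character_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling `list_facts_raw("YOUR_TOKEN", 1)` requires the TokForge server to be reachable at the address above and a valid token from the settings screen.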

See the API documentation for complete endpoint reference, request/response formats, and code examples.