---
summary: "How Moltbot memory works (workspace files + automatic memory flush)"
read_when:
- You want the memory file layout and workflow
- You want to tune the automatic pre-compaction memory flush
---
# Memory
Moltbot memory is **plain Markdown in the agent workspace**. The files are the
source of truth; the model only "remembers" what gets written to disk.

Memory search tools are provided by the active memory plugin (default:
`memory-core`). Disable memory plugins with `plugins.slots.memory = "none"`.

## Memory files (Markdown)
The default workspace layout uses two memory layers:
- `memory/YYYY-MM-DD.md`
- Daily log (append-only).
- Read today + yesterday at session start.
- `MEMORY.md` (optional)
- Curated long-term memory.
- **Only load in the main, private session** (never in group contexts).
These files live under the workspace (`agents.defaults.workspace`, default
`~/clawd`). See [Agent workspace](/concepts/agent-workspace) for the full layout.
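
For orientation, the default layout looks roughly like this (dates are illustrative):

```
~/clawd/
├── MEMORY.md            # curated long-term memory (optional)
└── memory/
    ├── 2026-01-11.md    # yesterday's daily log
    └── 2026-01-12.md    # today's daily log
```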
## When to write memory
- Decisions, preferences, and durable facts go to `MEMORY.md`.
- Day-to-day notes and running context go to `memory/YYYY-MM-DD.md`.
- If someone says "remember this," write it down (do not keep it in RAM).
- This area is still evolving. It helps to remind the model to store memories; it will know what to do.
- If you want something to stick, **ask the bot to write it** into memory.

## Automatic memory flush (pre-compaction ping)
When a session is **close to auto-compaction**, Moltbot triggers a **silent,
agentic turn** that reminds the model to write durable memory **before** the
context is compacted. The default prompts explicitly say the model *may reply*,
but usually `NO_REPLY` is the correct response so the user never sees this turn.

This is controlled by `agents.defaults.compaction.memoryFlush` :
```json5
{
agents: {
defaults: {
compaction: {
reserveTokensFloor: 20000,
memoryFlush: {
enabled: true,
softThresholdTokens: 4000,
systemPrompt: "Session nearing compaction. Store durable memories now.",
prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store."
}
}
}
}
}
```
Details:
- **Soft threshold**: flush triggers when the session token estimate crosses
  `contextWindow - reserveTokensFloor - softThresholdTokens`.
- **Silent** by default: prompts include `NO_REPLY` so nothing is delivered.
- **Two prompts**: a user prompt plus a system-prompt append carry the reminder.
- **One flush per compaction cycle** (tracked in `sessions.json`).
- **Workspace must be writable**: if the session runs sandboxed with
  `workspaceAccess: "ro"` or `"none"`, the flush is skipped.
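
As a quick sanity check, the soft-threshold rule can be sketched as a tiny helper (hypothetical function, not Moltbot's internal API; defaults taken from the config example above):

```typescript
// Flush fires once the session token estimate crosses
// contextWindow - reserveTokensFloor - softThresholdTokens.
function shouldFlush(
  sessionTokens: number,
  contextWindow: number,
  reserveTokensFloor = 20000,
  softThresholdTokens = 4000,
): boolean {
  return sessionTokens >= contextWindow - reserveTokensFloor - softThresholdTokens;
}
```

With a 200k context window and the defaults above, the flush fires once the estimate reaches 176,000 tokens.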
For the full compaction lifecycle, see
[Session management + compaction](/reference/session-management-compaction).

## Vector memory search

Moltbot can build a small vector index over `MEMORY.md` and `memory/*.md` so
semantic queries can find related notes even when wording differs.

Defaults:
- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings by default. If `memorySearch.provider` is not set, Moltbot auto-selects:
1. `local` if a `memorySearch.local.modelPath` is configured and the file exists.
2. `openai` if an OpenAI key can be resolved.
3. `gemini` if a Gemini key can be resolved.
4. Otherwise memory search stays disabled until configured.
- Local mode uses node-llama-cpp and may require `pnpm approve-builds`.
- Uses sqlite-vec (when available) to accelerate vector search inside SQLite.
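
The auto-selection order described above can be sketched like this (hypothetical helper and shapes; the real resolver also consults auth profiles and provider config, not just environment variables):

```typescript
import { existsSync } from "node:fs";

// Pick an embedding provider in the documented order:
// local model file → OpenAI key → Gemini key → disabled.
function autoSelectProvider(
  cfg: { local?: { modelPath?: string } },
  env: Record<string, string | undefined>,
): "local" | "openai" | "gemini" | null {
  if (cfg.local?.modelPath && existsSync(cfg.local.modelPath)) return "local";
  if (env.OPENAI_API_KEY) return "openai";
  if (env.GEMINI_API_KEY) return "gemini";
  return null; // memory search stays disabled until configured
}
```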

Remote embeddings **require** an API key for the embedding provider. Moltbot
resolves keys from auth profiles, `models.providers.*.apiKey`, or environment
variables. Codex OAuth only covers chat/completions and does **not** satisfy
embeddings for memory search. For Gemini, use `GEMINI_API_KEY` or
`models.providers.google.apiKey`. When using a custom OpenAI-compatible endpoint,
set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
### QMD backend (experimental)
Set `memory.backend = "qmd"` to swap the built-in SQLite indexer for
[QMD](https://github.com/tobi/qmd): a local-first search sidecar that combines
BM25 + vectors + reranking. Markdown stays the source of truth; Moltbot shells
out to QMD for retrieval. Key points:
**Prereqs**
- Disabled by default. Opt in per-config (`memory.backend = "qmd"`).
- Install the QMD CLI separately (`bun install -g github.com/tobi/qmd` or grab
  a release) and make sure the `qmd` binary is on the gateway’s `PATH`.
- QMD needs an SQLite build that allows extensions (`brew install sqlite` on
  macOS). The gateway sets `INDEX_PATH`/`QMD_CONFIG_DIR` automatically.
**How the sidecar runs**
- The gateway writes a self-contained QMD home under
  `~/.clawdbot/agents/<agentId>/qmd/` (config + cache + sqlite DB).
- Collections are rewritten from `memory.qmd.paths` (plus default workspace
  memory files) into `index.yml`, then `qmd update` + `qmd embed` run on boot and
  on a configurable interval (`memory.qmd.update.interval`, default 5m).
- Searches run via `qmd query --json`. If QMD fails or the binary is missing,
  Moltbot automatically falls back to the builtin SQLite manager so memory tools
  keep working.
**Config surface (`memory.qmd.*`)**
- `command` (default `qmd`): override the executable path.
- `includeDefaultMemory` (default `true`): auto-index `MEMORY.md` + `memory/**/*.md`.
- `paths[]`: add extra directories/files (`path`, optional `pattern`, optional
  stable `name`).
- `sessions`: opt into session JSONL indexing (`enabled`, `retentionDays`,
  `exportDir`).
- `update`: controls refresh cadence (`interval`, `debounceMs`, `onBoot`).
- `limits`: clamp recall payload (`maxResults`, `maxSnippetChars`,
  `maxInjectedChars`, `timeoutMs`).
- `scope`: same schema as [`session.sendPolicy`](/reference/configuration#session-sendpolicy).
  Default is DM-only (`deny` all, `allow` direct chats); loosen it to surface QMD
  hits in groups/channels.
- Snippets sourced outside the workspace show up as
  `qmd/<collection>/<relative-path>` in `memory_search` results; `memory_get`
  understands that prefix and reads from the configured QMD collection root.
- When `memory.qmd.sessions.enabled = true`, Moltbot exports sanitized session
  transcripts (User/Assistant turns) into a dedicated QMD collection under
  `~/.clawdbot/agents/<id>/qmd/sessions/`, so `memory_search` can recall recent
  conversations without touching the builtin SQLite index.
- `memory_search` snippets include a `Source: <path#line>` footer when
  `memory.citations` is `auto`/`on`; set `memory.citations = "off"` to keep
  the path metadata internal (the agent still receives the path for
  `memory_get`, but the snippet text omits the footer and the system prompt
  warns the agent not to cite it).
**Example**
```json5
memory: {
backend: "qmd",
citations: "auto",
qmd: {
includeDefaultMemory: true,
update: { interval: "5m", debounceMs: 15000 },
limits: { maxResults: 6, timeoutMs: 4000 },
scope: {
default: "deny",
rules: [{ action: "allow", match: { chatType: "direct" } }]
},
paths: [
{ name: "docs", path: "~/notes", pattern: "**/*.md" }
]
}
}
```
**Citations & fallback**
- `memory.citations` applies regardless of backend (`auto`/`on`/`off`).
- When `qmd` runs, we tag `status().backend = "qmd"` so diagnostics show which
  engine served the results. If the QMD subprocess exits or JSON output can’t be
  parsed, the search manager logs a warning and falls back to the builtin provider
  (existing Markdown embeddings) until QMD recovers.
### Additional memory paths
If you want to index Markdown files outside the default workspace layout, add
explicit paths:
```json5
agents: {
defaults: {
memorySearch: {
extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"]
}
}
}
```
Notes:
- Paths can be absolute or workspace-relative.
- Directories are scanned recursively for `.md` files.
- Only Markdown files are indexed.
- Symlinks are ignored (files or directories).
### Gemini embeddings (native)
Set the provider to `gemini` to use the Gemini embeddings API directly:
```json5
agents: {
defaults: {
memorySearch: {
provider: "gemini",
model: "gemini-embedding-001",
remote: {
apiKey: "YOUR_GEMINI_API_KEY"
}
}
}
}
```
Notes:
- `remote.baseUrl` is optional (defaults to the Gemini API base URL).
- `remote.headers` lets you add extra headers if needed.
- Default model: `gemini-embedding-001`.
If you want to use a **custom OpenAI-compatible endpoint** (OpenRouter, vLLM, or a proxy),
you can use the `remote` configuration with the OpenAI provider:
```json5
agents: {
defaults: {
memorySearch: {
provider: "openai",
model: "text-embedding-3-small",
remote: {
baseUrl: "https://api.example.com/v1/",
apiKey: "YOUR_OPENAI_COMPAT_API_KEY",
headers: { "X-Custom-Header": "value" }
}
}
}
}
```
If you don't want to set an API key, use `memorySearch.provider = "local"` or set
`memorySearch.fallback = "none"`.

Fallbacks:
- `memorySearch.fallback` can be `openai`, `gemini`, `local`, or `none`.
- The fallback provider is only used when the primary embedding provider fails.
Batch indexing (OpenAI + Gemini):
- Enabled by default for OpenAI and Gemini embeddings. Set `agents.defaults.memorySearch.remote.batch.enabled = false` to disable.
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
- Batch mode applies when `memorySearch.provider = "openai"` or `"gemini"` and uses the corresponding API key.
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.
Why OpenAI batch is fast + cheap:
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.
- OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously.
- See the OpenAI Batch API docs and pricing for details:
- https://platform.openai.com/docs/api-reference/batch
- https://platform.openai.com/pricing
Config example:
```json5
agents: {
defaults: {
memorySearch: {
provider: "openai",
model: "text-embedding-3-small",
fallback: "openai",
remote: {
batch: { enabled: true, concurrency: 2 }
},
sync: { watch: true }
}
}
}
```
Tools:
- `memory_search` — returns snippets with file + line ranges.
- `memory_get` — read memory file content by path.
Local mode:
- Set `agents.defaults.memorySearch.provider = "local"`.
- Provide `agents.defaults.memorySearch.local.modelPath` (GGUF or `hf:` URI).
- Optional: set `agents.defaults.memorySearch.fallback = "none"` to avoid remote fallback.
### How the memory tools work
- `memory_search` semantically searches Markdown chunks (~400 token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped ~700 chars), file path, line range, score, provider/model, and whether we fell back from local → remote embeddings. No full file payload is returned.
- `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md`/`memory/` are rejected.
- Both tools are enabled only when `memorySearch.enabled` resolves true for the agent.
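
For intuition, the overlapping chunking that `memory_search` indexes over can be approximated like this (a word-based stand-in for tokens; hypothetical helper, not the real token-aware chunker):

```typescript
// Split text into windows of `size` words that overlap by `overlap` words,
// mimicking the ~400-token / 80-token-overlap scheme described above.
function chunkWords(text: string, size = 400, overlap = 80): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const step = size - overlap; // each window starts `step` words after the last
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // last window reached the end
  }
  return chunks;
}
```

The overlap means a sentence near a chunk boundary still appears whole in at least one chunk, which keeps boundary text searchable.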
### What gets indexed (and when)
- File type: Markdown only (`MEMORY.md`, `memory/**/*.md`).
- Index storage: per-agent SQLite at `~/.clawdbot/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports `{agentId}` token).
- Freshness: watcher on `MEMORY.md` + `memory/` marks the index dirty (debounce 1.5s). Sync is scheduled on session start, on search, or on an interval and runs asynchronously. Session transcripts use delta thresholds to trigger background sync.
- Reindex triggers: the index stores the embedding **provider/model + endpoint fingerprint + chunking params**. If any of those change, Moltbot automatically resets and reindexes the entire store.
### Hybrid search (BM25 + vector)
When enabled, Moltbot combines:
- **Vector similarity** (semantic match, wording can differ)
- **BM25 keyword relevance** (exact tokens like IDs, env vars, code symbols)
If full-text search is unavailable on your platform, Moltbot falls back to vector-only search.
#### Why hybrid?
Vector search is great at “this means the same thing”:
- “Mac Studio gateway host” vs “the machine running the gateway”
- “debounce file updates” vs “avoid indexing on every write”
But it can be weak at exact, high-signal tokens:
- IDs (`a828e60`, `b3b9895a…`)
- code symbols (`memorySearch.query.hybrid`)
- error strings (“sqlite-vec unavailable”)
BM25 (full-text) is the opposite: strong at exact tokens, weaker at paraphrases.
Hybrid search is the pragmatic middle ground: **use both retrieval signals** so you get
good results for both “natural language” queries and “needle in a haystack” queries.
#### How we merge results (the current design)
Implementation sketch:
1) Retrieve a candidate pool from both sides:
- **Vector**: top `maxResults * candidateMultiplier` by cosine similarity.
- **BM25**: top `maxResults * candidateMultiplier` by FTS5 BM25 rank (lower is better).
2) Convert BM25 rank into a 0..1-ish score:
- `textScore = 1 / (1 + max(0, bm25Rank))`
3) Union candidates by chunk id and compute a weighted score:
- `finalScore = vectorWeight * vectorScore + textWeight * textScore`
Notes:
- `vectorWeight` + `textWeight` is normalized to 1.0 in config resolution, so weights behave as percentages.
- If embeddings are unavailable (or the provider returns a zero-vector), we still run BM25 and return keyword matches.
- If FTS5 can’t be created, we keep vector-only search (no hard failure).
This isn’t “IR-theory perfect”, but it’s simple, fast, and tends to improve recall/precision on real notes.
If we want to get fancier later, common next steps are Reciprocal Rank Fusion (RRF) or score normalization
(min/max or z-score) before mixing.
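
The merge steps above can be sketched as follows (assumed candidate shapes, not Moltbot's actual internals):

```typescript
type Candidate = { id: string; vectorScore?: number; bm25Rank?: number };

// Union vector and BM25 candidates by chunk id, convert BM25 rank into a
// 0..1-ish score, then mix with weights normalized to sum to 1.0.
function mergeHybrid(
  vector: Candidate[],
  bm25: Candidate[],
  vectorWeight = 0.7,
  textWeight = 0.3,
  maxResults = 6,
): { id: string; score: number }[] {
  const total = vectorWeight + textWeight;
  const vw = vectorWeight / total;
  const tw = textWeight / total;

  const byId = new Map<string, { vectorScore: number; textScore: number }>();
  for (const c of vector) {
    byId.set(c.id, { vectorScore: c.vectorScore ?? 0, textScore: 0 });
  }
  for (const c of bm25) {
    const entry = byId.get(c.id) ?? { vectorScore: 0, textScore: 0 };
    // textScore = 1 / (1 + max(0, bm25Rank)); lower rank => higher score.
    entry.textScore = 1 / (1 + Math.max(0, c.bm25Rank ?? 0));
    byId.set(c.id, entry);
  }

  return [...byId.entries()]
    .map(([id, s]) => ({ id, score: vw * s.vectorScore + tw * s.textScore }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```

A chunk that appears in only one candidate pool simply scores zero on the other signal, so exact-token hits and semantic hits can both surface.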
Config:
```json5
agents: {
defaults: {
memorySearch: {
query: {
hybrid: {
enabled: true,
vectorWeight: 0.7,
textWeight: 0.3,
candidateMultiplier: 4
}
}
}
}
}
```
### Embedding cache
Moltbot can cache **chunk embeddings** in SQLite so reindexing and frequent updates (especially session transcripts) don't re-embed unchanged text.
Config:
```json5
agents: {
defaults: {
memorySearch: {
cache: {
enabled: true,
maxEntries: 50000
}
}
}
}
```
### Session memory search (experimental)
You can optionally index **session transcripts** and surface them via `memory_search`.
This is gated behind an experimental flag.
```json5
agents: {
defaults: {
memorySearch: {
experimental: { sessionMemory: true },
sources: ["memory", "sessions"]
}
}
}
```
Notes:
- Session indexing is **opt-in** (off by default).
- Session updates are debounced and **indexed asynchronously** once they cross delta thresholds (best-effort).
- `memory_search` never blocks on indexing; results can be slightly stale until background sync finishes.
- Results still include snippets only; `memory_get` remains limited to memory files.
- Session indexing is isolated per agent (only that agent’s session logs are indexed).
- Session logs live on disk (`~/.clawdbot/agents/<agentId>/sessions/*.jsonl`). Any process/user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts.
Delta thresholds (defaults shown):
```json5
agents: {
defaults: {
memorySearch: {
sync: {
sessions: {
deltaBytes: 100000, // ~100 KB
deltaMessages: 50 // JSONL lines
}
}
}
}
}
```
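
The threshold check itself is simple; a hypothetical sketch using the defaults above (not Moltbot's actual code):

```typescript
// A session is due for background sync once either delta is crossed.
function crossesDelta(
  bytesSinceSync: number,
  messagesSinceSync: number,
  deltaBytes = 100_000,
  deltaMessages = 50,
): boolean {
  return bytesSinceSync >= deltaBytes || messagesSinceSync >= deltaMessages;
}
```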
### SQLite vector acceleration (sqlite-vec)
When the sqlite-vec extension is available, Moltbot stores embeddings in a
SQLite virtual table (`vec0`) and performs vector distance queries in the
database. This keeps search fast without loading every embedding into JS.
Configuration (optional):
```json5
agents: {
defaults: {
memorySearch: {
store: {
vector: {
enabled: true,
extensionPath: "/path/to/sqlite-vec"
}
}
}
}
}
```
Notes:
- `enabled` defaults to true; when disabled, search falls back to in-process
  cosine similarity over stored embeddings.
- If the sqlite-vec extension is missing or fails to load, Moltbot logs the
  error and continues with the JS fallback (no vector table).
- `extensionPath` overrides the bundled sqlite-vec path (useful for custom builds
  or non-standard install locations).
### Local embedding auto-download
- Default local embedding model: `hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf` (~0.6 GB).
- When `memorySearch.provider = "local"`, `node-llama-cpp` resolves `modelPath`; if the GGUF is missing it **auto-downloads** to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry.
- Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`.
- Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason.
### Custom OpenAI-compatible endpoint example
```json5
agents: {
defaults: {
memorySearch: {
provider: "openai",
model: "text-embedding-3-small",
remote: {
baseUrl: "https://api.example.com/v1/",
apiKey: "YOUR_REMOTE_API_KEY",
headers: {
"X-Organization": "org-id",
"X-Project": "project-id"
}
}
}
}
}
```
Notes:
- `remote.*` takes precedence over `models.providers.openai.*`.
- `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.