The UsageAccumulator sums cacheRead/cacheWrite across all API calls
within a single turn. With Anthropic prompt caching, each call reports
cacheRead ≈ current_context_size, so after N tool-call round-trips the
accumulated total becomes N × actual_context, which gets clamped to
contextWindow (200k) by deriveSessionTotalTokens().
Fix: track the most recent API call's cache fields separately and use
them in toNormalizedUsage() for context-size reporting. This makes
/status Context display accurate while preserving accumulated output
token counts.
Fixes #13698
Fixes #13782
Co-authored-by: akari-musubi <259925157+akari-musubi@users.noreply.github.com>
* fix: prevent FD leaks in child process cleanup
- Destroy stdio streams (stdin/stdout/stderr) after process exit
- Remove event listeners to prevent memory leaks
- Clean up child process reference in moveToFinished()
- Also fixes model override handling in agent.ts
Fixes EBADF errors caused by accumulating file descriptors
from sub-agent spawns.
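Roughly the cleanup shape, as a sketch (the helper name is illustrative; the
real code lives in the process registry):

```ts
import type { ChildProcess } from "node:child_process";

// Sketch: release stdio FDs and listeners once the child has exited.
function cleanupChild(child: ChildProcess): void {
  for (const stream of [child.stdin, child.stdout, child.stderr]) {
    stream?.destroy(); // destroying a stream releases its file descriptor
  }
  child.removeAllListeners(); // avoid leaking listeners on long-lived registries
}
```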
* Fix: allow stdin destroy in process registry cleanup
---------
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
Two fixes for Google Antigravity (Cloud Code Assist) reliability:
1. Forward-compat model fallback: pi-ai's model registry doesn't include
claude-opus-4-6-thinking. Add resolveAntigravityOpus46ForwardCompatModel()
that clones the opus-4-5 template so the correct api ("google-gemini-cli")
and baseUrl are preserved. Fixes #13765.
2. Fix thinking.signature rejection: The API returns Claude thinking blocks
without signatures, then rejects them on replay. The existing sanitizer
strips unsigned blocks, but the orphaned-user-message path in attempt.ts
bypassed it by reading directly from disk. That path now applies
sanitizeAntigravityThinkingBlocks as well.
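Sketch of the second fix at the previously-bypassed call site
(loadMessagesFromDisk and AgentMessage are illustrative stand-ins):

```ts
type AgentMessage = { role: string; content: unknown };

// Illustrative stand-ins for the real helpers named above.
declare function loadMessagesFromDisk(sessionPath: string): Promise<AgentMessage[]>;
declare function sanitizeAntigravityThinkingBlocks(msgs: AgentMessage[]): AgentMessage[];

// attempt.ts (sketch): the orphaned-user-message path used to replay disk
// contents as-is; running the sanitizer here strips unsigned thinking
// blocks before they reach the API, matching the main path.
async function loadSanitizedMessages(sessionPath: string): Promise<AgentMessage[]> {
  return sanitizeAntigravityThinkingBlocks(await loadMessagesFromDisk(sessionPath));
}
```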
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Move appendCacheTtlTimestamp() to after prompt + compaction retry
completes instead of before. The previous placement inserted a custom
entry (openclaw.cache-ttl) between compaction and the next prompt,
which broke pi-coding-agent's prepareCompaction() guard — the guard
only checks if the last entry is type 'compaction', and the cache-ttl
custom entry made it type 'custom', allowing an immediate second
compaction at very low token counts (e.g. 5,545 tokens) that nuked
all preserved context.
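Sketch of the reordering (the surrounding call site is an assumption; only
appendCacheTtlTimestamp is named above):

```ts
type Session = object;

// Illustrative stand-ins for the surrounding flow.
declare function runPromptWithCompactionRetry(session: Session): Promise<void>;
declare function appendCacheTtlTimestamp(session: Session): void;

async function promptTurn(session: Session): Promise<void> {
  // Before: appendCacheTtlTimestamp(session) ran here, leaving a 'custom'
  // entry after compaction and defeating the last-entry guard.
  await runPromptWithCompactionRetry(session);
  appendCacheTtlTimestamp(session); // after: 'compaction' stays last for the guard
}
```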
Fixes #9282
Relates to #12170
* feat: add --localTime option to show log times in the local time zone
fix #12447
* fix: prep logs local-time option and docs (#13818) (thanks @xialonglee)
---------
Co-authored-by: xialonglee <li.xialong@xydigit.com>
Co-authored-by: Sebastian <19554889+sebslight@users.noreply.github.com>
The xAI /v1/responses API returns content in a structured format with
typed output blocks (type: 'message') containing typed content blocks
(type: 'output_text') and url_citation annotations. The previous code
only checked output[0].content[0].text without filtering by type,
which could miss content in responses with multiple output entries.
Changes:
- Update GrokSearchResponse type to include annotations on content blocks
- Filter output blocks by type='message' and content by type='output_text'
- Extract url_citation annotations as fallback citations when the
  top-level citations array is empty
- Deduplicate annotation-derived citation URLs
- Update tests for the new structured return type
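Sketch of the type-filtered extraction and annotation fallback (field names
follow the response shape described above; the function itself is illustrative):

```ts
interface UrlCitation { type: "url_citation"; url: string }
interface ContentBlock {
  type: string;            // only 'output_text' blocks are read
  text?: string;
  annotations?: UrlCitation[];
}
interface OutputBlock { type: string; content?: ContentBlock[] }

function extractGrokResult(output: OutputBlock[], topLevelCitations: string[]) {
  const textBlocks = output
    .filter((o) => o.type === "message")
    .flatMap((o) => o.content ?? [])
    .filter((c) => c.type === "output_text");

  const text = textBlocks.map((c) => c.text ?? "").join("");

  // Fall back to annotation-derived citations, deduplicated by URL,
  // only when the top-level citations array is empty.
  const citations = topLevelCitations.length > 0
    ? topLevelCitations
    : [...new Set(textBlocks.flatMap((c) => c.annotations ?? []).map((a) => a.url))];

  return { text, citations };
}
```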
Closes #13520
* feat: add LiteLLM provider types, env var, credentials, and auth choice
Add litellm-api-key auth choice, LITELLM_API_KEY env var mapping,
setLitellmApiKey() credential storage, and LITELLM_DEFAULT_MODEL_REF.
* feat: add LiteLLM onboarding handler and provider config
Add applyLitellmProviderConfig which properly registers
models.providers.litellm with baseUrl, api type, and model definitions.
This fixes the critical bug from PR #6488 where the provider entry was
never created, causing model resolution to fail at runtime.
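A sketch of the registration, assuming a simplified config schema (the api
value and model list below are placeholders, not the shipped defaults):

```ts
interface ModelDefinition { id: string }
interface ProviderConfig {
  baseUrl: string;
  api: string;
  models: ModelDefinition[];
}
interface AppConfig {
  models: { providers: Record<string, ProviderConfig> };
}

function applyLitellmProviderConfig(config: AppConfig, baseUrl: string): void {
  // This entry was missing in PR #6488, so litellm model refs had no
  // provider to resolve against at runtime.
  config.models.providers.litellm = {
    baseUrl,                            // e.g. a local LiteLLM proxy
    api: "openai-completions",          // assumption: LiteLLM speaks the OpenAI API
    models: [{ id: "litellm-default" }], // illustrative model definition
  };
}
```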
* docs: add LiteLLM provider documentation
Add setup guide covering onboarding, manual config, virtual keys,
model routing, and usage tracking. Link from provider index.
* docs: add LiteLLM to sidebar navigation in docs.json
Add providers/litellm to both English and Chinese provider page lists
so the docs page appears in the sidebar navigation.
* test: add LiteLLM non-interactive onboarding test
Wire up litellmApiKey flag inference and auth-choice handler for the
non-interactive onboarding path, and add an integration test covering
profile, model default, and credential storage.
* fix: register --litellm-api-key CLI flag and add preferred provider mapping
Wire up the missing Commander CLI option, action handler mapping, and
help text for --litellm-api-key. Add litellm-api-key to the preferred
provider map for consistency with other providers.
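The Commander wiring, sketched (help text and handler body are illustrative;
the real handler calls setLitellmApiKey rather than setting the env var):

```ts
import { Command } from "commander";

const program = new Command();
program
  .option("--litellm-api-key <key>", "LiteLLM API key (maps to LITELLM_API_KEY)")
  .action((opts: { litellmApiKey?: string }) => {
    if (opts.litellmApiKey) {
      // hand off to the auth-choice handler, as described above
      process.env.LITELLM_API_KEY = opts.litellmApiKey;
    }
  });
program.parse(process.argv);
```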
* fix: remove zh-CN sidebar entry for litellm (no localized page yet)
* style: format buildLitellmModelDefinition return type
* fix(onboarding): harden LiteLLM provider setup (#12823)
* refactor(onboarding): keep auth-choice provider dispatcher under size limit
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* Heartbeat: inject cron-style current time into prompts
* Tests: fix type for web heartbeat timestamp test
* Infra: inline heartbeat current-time injection
xAI's /v1/responses endpoint does not support the 'include' parameter,
returning 400 'Argument not supported: include'. Inline citations are
returned automatically when available — no explicit request needed.
Closes #12910
Co-authored-by: Luna AI <luna@coredirection.ai>
- Add [Historical context: ...] marker pattern to stripDowngradedToolCallText
- Apply stripDowngradedToolCallText in emitBlockChunk streaming path
- Previously only stripBlockTags ran during streaming, leaking [Tool Call: ...] markers to users
- Add 7 test cases for the new pattern stripping
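Sketch of the extended stripping (the real patterns may be stricter):

```ts
// stripDowngradedToolCallText (sketch): remove both downgraded-tool-call
// markers before text reaches the user; exact regexes may differ.
function stripDowngradedToolCallText(text: string): string {
  return text
    .replace(/\[Tool Call: [^\]]*\]/g, "")
    .replace(/\[Historical context: [^\]]*\]/g, "");
}
```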
* fix(tools): correct Grok response parsing for xAI Responses API
The xAI Responses API returns content in output[0].content[0].text,
not in an output_text field. Updated the GrokSearchResponse type and
runGrokSearch to extract content from the correct path.
Fixes the 'No response' issue when using Grok web search.
* fix(tools): harden Grok web_search parsing (#13049) (thanks @ereid7)
---------
Co-authored-by: erai <erai@erais-Mac-mini.local>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
* fix: suggest /reset in context overflow error message
When the context window overflows, the error message now suggests
using /reset to clear session history, giving users an actionable
recovery path instead of a dead-end error.
Closes #12940
Co-authored-by: Claude <noreply@anthropic.com>
* fix: suggest /reset in context overflow error message (#12973) (thanks @RamiNoodle733)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Rami Abdelrazzaq <RamiNoodle733@users.noreply.github.com>
* fix(web_search): Fix invalid model name sent to Perplexity
* chore: Only apply fix to direct Perplexity calls
* fix(web_search): normalize direct Perplexity model IDs
* fix: add changelog note for perplexity model normalization (#12795) (thanks @cdorsey)
* fix: align tests and fetch type for gate stability (#12795) (thanks @cdorsey)
* chore: keep #12795 scoped to web_search changes
---------
Co-authored-by: Sebastian <19554889+sebslight@users.noreply.github.com>
- Create shared PNG encoder module (src/media/png-encode.ts)
- Refactor qr-image.ts and live-image-probe.ts to use shared encoder
- Add safeParseJson to utils.ts and plugin-sdk exports
- Update msteams and pairing-store to use centralized safeParseJson
* refactor: consolidate duplicate utility functions
- Add escapeRegExp to src/utils.ts and remove 10 local duplicates
- Rename bash-tools clampNumber to clampWithDefault (different signature)
- Centralize formatError calls to use formatErrorMessage from infra/errors.ts
- Re-export formatErrorMessage from cli/cli-utils.ts to preserve API
* refactor: consolidate remaining escapeRegExp duplicates
* refactor: consolidate sleep, stripAnsi, and clamp duplicates
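For reference, the consolidated escapeRegExp is the standard escape (a sketch
of what src/utils.ts now exports):

```ts
// src/utils.ts (sketch): one shared implementation instead of 10+ copies.
export function escapeRegExp(value: string): string {
  // Escape every character with special meaning in a RegExp pattern.
  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
```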
* Memory/QMD: symlink default model cache into custom XDG_CACHE_HOME
QmdMemoryManager overrides XDG_CACHE_HOME to isolate the qmd index
per-agent, but this also moves where qmd looks for its ML models
(~2.1GB). Since models are installed at the default location
(~/.cache/qmd/models/), every qmd invocation would attempt to
re-download them from HuggingFace and time out.
Fix: on initialization, symlink ~/.cache/qmd/models/ into the custom
XDG_CACHE_HOME path so the index stays isolated per-agent while the
shared models are reused. The symlink is only created when the default
models directory exists and the target path does not already exist.
Includes tests for the three key scenarios: symlink creation, existing
directory preservation, and graceful skip when no default models exist.
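Sketch of the guarded symlink (paths as described above; the function name is
illustrative):

```ts
import { existsSync, mkdirSync, symlinkSync } from "node:fs";
import { homedir } from "node:os";
import { dirname, join } from "node:path";

function linkSharedQmdModels(customCacheHome: string): void {
  const defaultModels = join(homedir(), ".cache", "qmd", "models");
  const target = join(customCacheHome, "qmd", "models");
  // Only link when the shared models exist and nothing occupies the target,
  // so an agent that already has its own models directory is left alone.
  if (existsSync(defaultModels) && !existsSync(target)) {
    mkdirSync(dirname(target), { recursive: true });
    symlinkSync(defaultModels, target);
  }
}
```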
* Memory/QMD: skip model symlink warning on ENOENT
* test: stabilize warning-filter visibility assertion (#12114) (thanks @tyler6204)
* fix: add changelog entry for QMD cache reuse (#12114) (thanks @tyler6204)
* fix: handle plain context-overflow strings in compaction detection (#12114) (thanks @tyler6204)