* fix(cache): inject cache_control into system prompt for OpenRouter Anthropic
Add onPayload wrapper that injects cache_control: { type: "ephemeral" }
into the system/developer message content for OpenRouter requests routed
to Anthropic models. The system prompt is typically ~18k tokens and was
being re-processed on every request without caching.
Fixes#15151
* Changelog: add OpenRouter note for #17473
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
extraParams.provider was silently dropped by createStreamFnWithExtraParams().
This change injects it into model.compat.openRouterRouting so pi-ai's
buildParams includes params.provider in the API request body.
Enables OpenRouter provider routing options (only, order, allow_fallbacks,
data_collection, ignore, sort, quantizations) via model config:
```jsonc
"openrouter/model-name": {
"params": {
"provider": {
"only": ["deepinfra", "fireworks"],
"allow_fallbacks": false
}
}
}
```
Closes#10869✍️ Author: Claude Code with @carrotRakko (AI-written, human-approved)
Add support for Z.AI's native tool_stream parameter to enable real-time
visibility into model reasoning and tool call execution.
- Automatically inject tool_stream=true for zai/z-ai providers
- Allow disabling via params.tool_stream: false in model config
- Follows existing pattern of OpenRouter and OpenAI wrappers
This enables Z.AI API features described in:
https://docs.z.ai/api-reference#streaming
AI-assisted: Claude (OpenClaw agent) helped write this implementation.
Testing: lightly tested (code review + pattern matching existing wrappers)
Closes#18135
- Update @mariozechner/pi-ai and pi-agent-core to 0.50.9
- Rename cacheControlTtl to cacheRetention with values none/short/long
- Add backwards compatibility mapping: 5m->short, 1h->long
- Remove dead OpenRouter check (uses openai-completions API)
- Default new configs to cacheRetention: short