* feat(cron): add failure destination support with webhook mode and bestEffort handling
Extends PR #24789 failure alerts with features from PR #29145:
- Add webhook delivery mode for failure alerts (mode: 'webhook')
- Add accountId support for multi-account channel configurations
- Add bestEffort handling to skip alerts when job has bestEffort=true
- Add separate failureDestination config (global + per-job in delivery)
- Add duplicate prevention (skip the failure alert when its target matches the primary delivery target)
- Add CLI flags: --failure-alert-mode, --failure-alert-account-id
- Add UI fields for new options in web cron editor
* fix(cron): merge failureAlert mode/accountId and preserve failureDestination on updates
- Fix mergeCronFailureAlert to merge mode and accountId fields
- Fix mergeCronDelivery to preserve failureDestination on updates
- Fix isSameDeliveryTarget to use 'announce' as default instead of 'none'
to properly detect duplicates when delivery.mode is undefined
* fix(cron): validate webhook mode requires URL in resolveFailureDestination
When mode is 'webhook' but no 'to' URL is provided, return null
instead of creating an invalid plan that silently fails later.
* fix(cron): fail closed on webhook mode without URL and make failureDestination fields clearable
- sendCronFailureAlert: fail closed when mode is webhook but URL is missing
- mergeCronDelivery: use per-key presence checks so callers can clear
nested failureDestination fields via cron.update
Note: protocol:check shows missing internalEvents in Swift models - this is
a pre-existing issue unrelated to these changes (upstream sync needed).
* fix(cron): use separate schema for failureDestination and fix type cast
- Create CronFailureDestinationSchema excluding after/cooldownMs fields
- Fix type cast in sendFailureNotificationAnnounce to use CronMessageChannel
* fix(cron): merge global failureDestination with partial job overrides
When job has partial failureDestination config, fall back to global
config for unset fields instead of treating it as a full override.
* fix(cron): avoid forcing announce mode and clear inherited to on mode change
- UI: only include mode in patch if explicitly set to non-default
- delivery.ts: clear inherited 'to' when job overrides mode, since URL
semantics differ between announce and webhook modes
* fix(cron): preserve explicit to on mode override and always include mode in UI patches
- delivery.ts: preserve job-level explicit 'to' when overriding mode
- UI: always include mode in failureAlert patch so users can switch between announce/webhook
* fix(cron): allow clearing accountId and treat undefined global mode as announce
- UI: always include accountId in patch so users can clear it
- delivery.ts: treat undefined global mode as announce when comparing for clearing inherited 'to'
* Cron: harden failure destination routing and add regression coverage
* Cron: resolve failure destination review feedback
* Cron: drop unrelated timeout assertions from conflict resolution
* Cron: format cron CLI regression test
* Cron: align gateway cron test mock types
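
For reviewers, the resolution rules described in the failureDestination fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in delivery.ts: the `FailureDestination` shape and the function name `resolveFailureDestinationSketch` are hypothetical, and only the rules stated in the commit bodies are modelled (partial job overrides fall back to global config, an undefined global mode counts as announce, an inherited `to` is dropped when the job changes mode, an explicit job-level `to` is kept, and webhook mode without a URL resolves to null).

```typescript
// Hypothetical shape for illustration; the real schema may carry more fields.
type FailureMode = "announce" | "webhook";

interface FailureDestination {
  mode?: FailureMode;
  channel?: string;
  to?: string;
  accountId?: string;
}

// Sketch: merge a global failureDestination with a partial per-job override.
function resolveFailureDestinationSketch(
  globalCfg?: FailureDestination,
  jobCfg?: FailureDestination,
): FailureDestination | null {
  if (!globalCfg && !jobCfg) return null;

  // An undefined global mode is treated as "announce".
  const globalMode: FailureMode = globalCfg?.mode ?? "announce";
  const mode: FailureMode = jobCfg?.mode ?? globalMode;

  // `to` means different things per mode (recipient vs. URL), so an inherited
  // value is dropped when the job switches mode. An explicit job-level `to`
  // is always preserved.
  const modeChanged = jobCfg?.mode !== undefined && jobCfg.mode !== globalMode;
  const inheritedTo = modeChanged ? undefined : globalCfg?.to;
  const to = jobCfg?.to ?? inheritedTo;

  // Webhook mode without a URL yields no plan instead of a plan that fails later.
  if (mode === "webhook" && !to) return null;

  return {
    mode,
    channel: jobCfg?.channel ?? globalCfg?.channel,
    to,
    accountId: jobCfg?.accountId ?? globalCfg?.accountId,
  };
}
```

Under these assumptions, a job that sets only `mode: "webhook"` over a global announce destination resolves to null, because the inherited recipient is not a valid webhook URL.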
---------
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
Gateway browser.request only read profile from query.profile before invoking
browser.proxy on nodes. Calls that passed profile in POST body silently fell
back to the default profile, which could switch users into chrome extension
mode even when they explicitly requested the openclaw profile.
Use query profile first, then fall back to body.profile when present.
Closes #28687
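
The lookup order described above (query first, then body) could look roughly like this. It is a sketch under assumptions: `BrowserRequestParams`, `readProfile`, and `resolveRequestedProfile` are hypothetical names, and the real browser.request handler may validate its inputs differently.

```typescript
// Hypothetical request shape; not the gateway's actual types.
interface BrowserRequestParams {
  query?: Record<string, unknown>;
  body?: unknown;
}

// Accept only non-empty strings as a profile name.
function readProfile(value: unknown): string | undefined {
  if (typeof value !== "string") return undefined;
  const trimmed = value.trim();
  return trimmed.length > 0 ? trimmed : undefined;
}

// Prefer query.profile, then fall back to body.profile when present.
function resolveRequestedProfile(params: BrowserRequestParams): string | undefined {
  const fromQuery = readProfile(params.query?.profile);
  if (fromQuery !== undefined) return fromQuery;

  const body = params.body;
  if (body !== null && typeof body === "object" && !Array.isArray(body)) {
    return readProfile((body as Record<string, unknown>).profile);
  }
  return undefined; // caller falls back to the default profile
}
```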
* fix(gateway): skip device pairing for local backend self-connections
When gateway.tls is enabled, sessions_spawn (and other internal
callGateway operations) creates a new WebSocket to the gateway.
The gateway treated this self-connection like any external client
and enforced device pairing, rejecting it with "pairing required"
(close code 1008). This made sub-agent spawning impossible when
TLS was enabled in Docker with bind: "lan".
Skip pairing for connections that are gateway-client self-connections
from localhost with valid shared auth (token/password). These are
internal backend calls (e.g. sessions_spawn, subagent-announce) that
already have valid credentials and connect from the same host.
Closes #30740
* gateway: tighten backend self-pair bypass guard
* tests: cover backend self-pairing local-vs-remote auth path
* changelog: add gateway tls pairing fix credit
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(agents): honor per-model thinking defaults
* fix(agents): preserve thinking fallback with model defaults
---------
Co-authored-by: Mark L <73659136+markliuyuxiang@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
When gateway.controlUi.allowedOrigins is set to ["*"], the Control UI
WebSocket was still rejected with "origin not allowed" for any non-
loopback origin (e.g. Tailscale IPs, LAN addresses).
Root cause: checkBrowserOrigin() compared each allowedOrigins entry
against the parsed request origin via a literal Array#includes(). The
entry "*" never equals an actual origin string, so the wildcard was
silently ignored and all remote connections were blocked.
Fix: check for the literal "*" entry before the per-origin comparison
and return ok:true immediately when found.
Closes #30990
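
A minimal sketch of the fixed comparison, assuming the request origin has already been parsed and normalized to a string. `checkOriginAgainstAllowlist` and the `OriginCheck` type are hypothetical stand-ins; the real checkBrowserOrigin() also handles loopback and host-header cases that are omitted here.

```typescript
type OriginCheck = { ok: true } | { ok: false; reason: string };

// Sketch: honor a literal "*" entry before comparing individual origins.
function checkOriginAgainstAllowlist(
  requestOrigin: string,
  allowedOrigins: string[],
): OriginCheck {
  const allowlist = allowedOrigins.map((entry) => entry.trim().toLowerCase());

  // Wildcard: "*" never equals a real origin string, so it needs its own check.
  if (allowlist.includes("*")) return { ok: true };

  if (allowlist.includes(requestOrigin.trim().toLowerCase())) return { ok: true };
  return { ok: false, reason: "origin not allowed" };
}
```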
* fix(gateway): support wildcard in controlUi.allowedOrigins for remote access
* build: regenerate host env security policy swift
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* feat: detect stale Slack sockets and auto-restart
Slack Socket Mode connections can silently stop delivering events while
still appearing connected (health checks pass, WebSocket stays open).
This "half-dead socket" problem causes messages to go unanswered.
This commit adds two layers of protection:
1. **Event liveness tracking**: Every inbound Slack event (messages,
reactions, member joins/leaves, channel events, pins) now calls
`setStatus({ lastEventAt, lastInboundAt })` to update the channel
account snapshot with the timestamp of the last received event.
2. **Health monitor stale socket detection**: The channel health monitor
now checks `lastEventAt` against a configurable threshold (default
30 minutes). If a channel has been running longer than the threshold
and hasn't received any events in that window, it is flagged as
unhealthy and automatically restarted — the same way disconnected
or crashed channels are already handled.
The restart reason is logged as "stale-socket" for observability, and
the existing cooldown/rate-limit logic (3 restarts/hour max) prevents
restart storms.
* Slack: gate liveness tracking to accepted events
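
The stale-socket check described above might be sketched as follows. The snapshot field `lastStartAt` and the function name are assumptions for illustration, as is the choice to treat a missing `lastEventAt` as "no events received"; the health monitor's real logic and its restart rate limiting are not reproduced here.

```typescript
// Hypothetical subset of a channel account snapshot.
interface ChannelSnapshot {
  running: boolean;
  lastStartAt?: number; // epoch ms when the channel last started (assumed field)
  lastEventAt?: number; // epoch ms of the last accepted inbound event
}

const DEFAULT_STALE_EVENT_THRESHOLD_MS = 30 * 60 * 1000; // 30 minutes

// A channel is stale when it has been up longer than the threshold and no
// event arrived within the most recent threshold window.
function isStaleSocket(
  snapshot: ChannelSnapshot,
  now: number,
  thresholdMs: number = DEFAULT_STALE_EVENT_THRESHOLD_MS,
): boolean {
  if (!snapshot.running || snapshot.lastStartAt === undefined) return false;

  // Freshly started channels get a grace period equal to the threshold.
  if (now - snapshot.lastStartAt <= thresholdMs) return false;

  // Assumption: a channel that never saw an event counts as silent since start.
  const lastEventAt = snapshot.lastEventAt ?? snapshot.lastStartAt;
  return now - lastEventAt > thresholdMs;
}
```

The uptime check matters for quiet workspaces: without it, a channel restarted a minute ago would be flagged again immediately.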
---------
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
* fix(cron): pass heartbeat target=last for main-session cron jobs
When a cron job with sessionTarget=main and wakeMode=now fires, it
triggers a heartbeat via runHeartbeatOnce. Since e2362d35 changed the
default heartbeat target from "last" to "none", these cron-triggered
heartbeats silently discard their responses instead of delivering them
to the last active channel (e.g. Telegram).
Fix: pass heartbeat: { target: "last" } from the cron timer to
runHeartbeatOnce for main-session jobs, and wire the override through
the gateway cron service builder. This restores delivery for
sessionTarget=main cron jobs without reverting the intentional default
change for regular heartbeats.
Regression introduced in: e2362d35 (2026-02-25)
Fixes #28508
* Cron: align server-cron wake routing expectations for main-target jobs
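
To make the override concrete, here is a small sketch of how the cron timer might build its heartbeat options. The types and the helper `buildCronHeartbeatOptions` are hypothetical, including the `reason` field; only the rule itself (main-session jobs pass target "last", everything else keeps the default) comes from the description above.

```typescript
type HeartbeatTarget = "last" | "none";

// Hypothetical subset of a cron job; the real type has more fields.
interface CronJobLike {
  sessionTarget: string; // e.g. "main"
  wakeMode: string;      // e.g. "now"
}

// Hypothetical options bag passed to runHeartbeatOnce.
interface HeartbeatRunOptions {
  reason: string;
  heartbeat?: { target: HeartbeatTarget };
}

// Main-session cron jobs explicitly request delivery to the last active
// channel; other callers keep the default heartbeat target.
function buildCronHeartbeatOptions(job: CronJobLike): HeartbeatRunOptions {
  const options: HeartbeatRunOptions = { reason: "cron" };
  if (job.sessionTarget === "main") {
    options.heartbeat = { target: "last" };
  }
  return options;
}
```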
---------
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
* Gateway: allow Google Fonts stylesheet and font CDN in Control UI CSP
* Tests: assert Control UI CSP allows required Google Fonts origins
* Gateway: fix CSP comment for Google Fonts allowlist intent
* Tests: split dedicated Google Fonts CSP assertion