flushBuffer/flushKey now return whether messages were actually flushed,
so flushAll only increments flushedBufferCount for non-empty buffers.
Prevents idle registered debouncers from triggering unnecessary followup
queue drain waits during SIGUSR1 restart. Also wraps per-key flush in
try/catch so one onError throw cannot strand later buffered messages.
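The contract above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Debouncer` class shape and the `onFlush`/`onError` callback names are assumptions; only the `flushKey`/`flushAll` semantics (boolean return, non-empty count, per-key try/catch) come from the change description.

```typescript
type Message = { key: string; text: string };

// Hypothetical debouncer internals illustrating the flush contract.
class Debouncer {
  private buffers = new Map<string, Message[]>();
  constructor(
    private onFlush: (key: string, msgs: Message[]) => void,
    private onError: (err: unknown) => void = () => {},
  ) {}

  push(msg: Message): void {
    const buf = this.buffers.get(msg.key) ?? [];
    buf.push(msg);
    this.buffers.set(msg.key, buf);
  }

  // Returns true only when buffered messages were actually flushed.
  flushKey(key: string): boolean {
    const buf = this.buffers.get(key);
    if (!buf || buf.length === 0) return false;
    this.buffers.delete(key); // clear before delivery: a throw cannot re-flush
    try {
      this.onFlush(key, buf);
    } catch (err) {
      // One throwing handler must not strand the remaining keys.
      this.onError(err);
    }
    return true;
  }

  // Counts only keys that actually had buffered messages.
  flushAll(): number {
    let flushed = 0;
    for (const key of [...this.buffers.keys()]) {
      if (this.flushKey(key)) flushed++;
    }
    return flushed;
  }
}
```

Because `flushKey` reports whether anything was delivered, an idle debouncer contributes zero to the count, which is what lets the restart path skip the followup drain wait.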
When config.patch triggered a SIGUSR1 restart, two in-memory message
buffers were silently wiped:
1. Per-channel inbound debounce buffers (closure-local Map + setTimeout)
2. Followup queues (global Map of pending session messages)
This caused inbound messages received during the debounce window to be
permanently lost on config-triggered gateway restarts.
Fix:
- Add a global registry of inbound debouncers so they can be flushed
collectively during restart. Each createInboundDebouncer() call now
auto-registers in a shared Symbol.for() map, with a new flushAll()
method that immediately processes all buffered items.
- Add flushAllInboundDebouncers() which iterates the global registry
and forces all debounce timers to fire immediately.
- Add waitForFollowupQueueDrain() which polls the FOLLOWUP_QUEUES map
until all queues finish processing (or timeout).
- Hook both into the SIGUSR1 restart flow in run-loop.ts: flush all
  debouncers (pushing buffered messages into the followup queues) before
  markGatewayDraining(), then wait up to 5s for the followup drain loops
  to process them.
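A rough sketch of the shared-registry mechanism described above, under stated assumptions: the Symbol.for() key string, the registry being a Set, and the `RegisteredDebouncer` interface are illustrative; the source only specifies that registration is global via Symbol.for() and that debouncers are deregistered after a collective flush.

```typescript
interface RegisteredDebouncer {
  // Returns the number of non-empty buffers that were flushed.
  flushAll(): number;
}

// Symbol.for() keys resolve to the same symbol across module instances,
// so every createInboundDebouncer() call lands in one shared registry.
const REGISTRY_KEY = Symbol.for("inbound-debouncer-registry"); // assumed key name
const g = globalThis as unknown as Record<symbol, unknown>;

function getRegistry(): Set<RegisteredDebouncer> {
  if (!g[REGISTRY_KEY]) g[REGISTRY_KEY] = new Set<RegisteredDebouncer>();
  return g[REGISTRY_KEY] as Set<RegisteredDebouncer>;
}

function registerInboundDebouncer(d: RegisteredDebouncer): void {
  getRegistry().add(d);
}

// Forces every registered debouncer to flush immediately; returns how
// many debouncers actually had buffered messages.
function flushAllInboundDebouncers(): number {
  const registry = getRegistry();
  let count = 0;
  for (const d of [...registry]) {
    if (d.flushAll() > 0) count++;
    registry.delete(d); // deregister so a later restart does not re-flush
  }
  return count;
}
```

Returning a count of debouncers that held messages is what lets the restart flow skip the followup drain wait entirely when everything was idle.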
The ordering is critical: flush debouncers → wait for followup drain →
then mark draining. This ensures messages that were mid-debounce get
delivered to sessions before the gateway reinitializes.
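The drain wait can be sketched as a simple poll loop. The queue record shape (`pending`/`draining`), the 25 ms poll interval, and the map layout are assumptions for illustration; only the FOLLOWUP_QUEUES name, the polling behavior, and the timeout come from the description above.

```typescript
type FollowupQueue = { pending: unknown[]; draining: boolean }; // assumed shape
const FOLLOWUP_QUEUES = new Map<string, FollowupQueue>();

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Polls until every followup queue is empty and idle, or the timeout
// elapses. Resolves true when fully drained, false on timeout.
async function waitForFollowupQueueDrain(timeoutMs = 5000): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const busy = [...FOLLOWUP_QUEUES.values()].some(
      (q) => q.draining || q.pending.length > 0,
    );
    if (!busy) return true;
    if (Date.now() >= deadline) return false;
    await sleep(25);
  }
}
```

An empty map resolves true on the first iteration, matching the "immediate return when empty" test case, while a stuck queue resolves false after the deadline so the restart can proceed with a logged warning.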
Tests:
- flushAllInboundDebouncers: flushes multiple registered debouncers,
returns count, deregisters after flush
- createInboundDebouncer.flushAll: flushes all keys in a single debouncer
- waitForFollowupQueueDrain: immediate return when empty, waits for
drain, returns not-drained on timeout, counts draining queues
- run-loop: SIGUSR1 calls flush before markGatewayDraining, skips
followup wait when no debouncers had buffered messages, logs warning
on followup drain timeout
* test: align extension runtime mocks with plugin-sdk
Update stale extension tests to mock the plugin-sdk runtime barrels that production code now imports, and harden the Signal tool-result harness around system-event assertions so the channels lane matches current extension boundaries.
Regeneration-Prompt: |
Verify the failing channels-lane tests against current origin/main in an isolated worktree before changing anything. If the failures reproduce on main, keep the fix test-only unless production behavior is clearly wrong. Recent extension refactors moved Telegram, WhatsApp, and Signal code onto plugin-sdk runtime barrels, so update stale tests that still mock old core module paths to intercept the seams production code now uses. For Signal reaction notifications, avoid brittle assertions that depend on shared queued system-event state when a direct harness spy on enqueue behavior is sufficient. Preserve scope: only touch the failing tests and their local harness, then rerun the reproduced targeted tests plus the full channels lane and repo check gate.
* test: fix extension test drift on main
* fix: lazy-load bundled web search plugin registry
* test: make matrix sweeper failure injection portable
* fix: split heavy matrix runtime-api seams
* fix: simplify bundled web search id lookup
* test: tolerate Windows env key casing
Reuse pi-ai's Anthropic client injection seam for streaming, and add
the OpenClaw-side provider discovery, auth, model catalog, and tests
needed to expose anthropic-vertex cleanly.
Signed-off-by: sallyom <somalley@redhat.com>