fix: drain inbound debounce buffer and followup queues before SIGUSR1 reload

When config.patch triggers a SIGUSR1 restart, two in-memory message buffers were silently wiped:

1. Per-channel inbound debounce buffers (closure-local Map + setTimeout)
2. Followup queues (global Map of pending session messages)

This caused inbound messages received during the debounce window to be permanently lost on config-triggered gateway restarts.

Fix:

- Add a global registry of inbound debouncers so they can be flushed collectively during restart. Each createInboundDebouncer() call now auto-registers in a shared Symbol.for() map, with a new flushAll() method that immediately processes all buffered items.
- Add flushAllInboundDebouncers() which iterates the global registry and forces all debounce timers to fire immediately.
- Add waitForFollowupQueueDrain() which polls the FOLLOWUP_QUEUES map until all queues finish processing (or timeout).
- Hook both into the SIGUSR1 restart flow in run-loop.ts: before markGatewayDraining(), flush all debouncers first (pushing buffered messages into the followup queues), then wait up to 5s for the followup drain loops to process them.

The ordering is critical: flush debouncers → wait for followup drain → then mark draining. This ensures messages that were mid-debounce get delivered to sessions before the gateway reinitializes.

Tests:

- flushAllInboundDebouncers: flushes multiple registered debouncers, returns count, deregisters after flush
- createInboundDebouncer.flushAll: flushes all keys in a single debouncer
- waitForFollowupQueueDrain: immediate return when empty, waits for drain, returns not-drained on timeout, counts draining queues
- run-loop: SIGUSR1 calls flush before markGatewayDraining, skips followup wait when no debouncers had buffered messages, logs warning on followup drain timeout
2026-03-14 11:54:01 -04:00
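The ordering described in the commit message (flush debouncers, wait for the followup drain, then mark draining) can be sketched as below. This is an illustration only, not the code in run-loop.ts: the `RestartSteps` interface, `prepareForReload`, `makeRecordingSteps`, the `warn` callback, and the assumption that the flush resolves with a count of flushed items are all hypothetical. Only the three step names, their order, the 5s limit, and the skip-when-nothing-flushed behavior come from the commit message.

```typescript
// Hypothetical shapes: only the step names and their order come from the commit message.
type DrainResult = { drained: boolean; remaining: number };

interface RestartSteps {
  // Assumed to resolve with how many buffered items were flushed.
  flushAllInboundDebouncers: () => Promise<number>;
  waitForFollowupQueueDrain: (timeoutMs: number) => Promise<DrainResult>;
  markGatewayDraining: () => void;
  warn: (message: string) => void;
}

// "wait up to 5s" per the commit message.
const FOLLOWUP_DRAIN_TIMEOUT_MS = 5000;

async function prepareForReload(steps: RestartSteps): Promise<void> {
  // 1. Push mid-debounce messages into the followup queues.
  const flushed = await steps.flushAllInboundDebouncers();

  // 2. Wait for the drain loops only if the flush produced work.
  if (flushed > 0) {
    const result = await steps.waitForFollowupQueueDrain(FOLLOWUP_DRAIN_TIMEOUT_MS);
    if (!result.drained) {
      steps.warn(`followup drain timed out with ${result.remaining} pending`);
    }
  }

  // 3. Only now mark the gateway as draining.
  steps.markGatewayDraining();
}

// Recording stubs that make the call order observable.
function makeRecordingSteps(flushed: number, result: DrainResult) {
  const log: string[] = [];
  const steps: RestartSteps = {
    flushAllInboundDebouncers: async () => {
      log.push("flush");
      return flushed;
    },
    waitForFollowupQueueDrain: async (timeoutMs) => {
      log.push(`wait:${timeoutMs}`);
      return result;
    },
    markGatewayDraining: () => {
      log.push("mark");
    },
    warn: (message) => {
      log.push(`warn:${message}`);
    },
  };
  return { steps, log };
}
```

Passing the steps in as an object keeps the sketch self-contained and lets the order be checked with stubs; the real restart flow presumably calls the module functions directly.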
import { FOLLOWUP_QUEUES } from "./state.js";
/**
 * Wait for all followup queues to finish draining, up to `timeoutMs`.
 * Returns `{ drained: true, remaining: 0 }` once every queue is empty and no
 * drain loop is active, or `{ drained: false, remaining }` if the timeout was
 * reached with items still pending (`remaining` is the most recent count).
 *
 * Called during SIGUSR1 restart after flushing inbound debouncers, so the
 * newly enqueued items have time to be processed before the server tears down.
 */
export async function waitForFollowupQueueDrain(
  timeoutMs: number,
): Promise<{ drained: boolean; remaining: number }> {
  const deadline = Date.now() + timeoutMs;
  const POLL_INTERVAL_MS = 50;
  const getPendingCount = (): number => {
    let total = 0;
    for (const queue of FOLLOWUP_QUEUES.values()) {
      // Add 1 for the in-flight item owned by an active drain loop.
      const queuePending = queue.items.length + (queue.draining ? 1 : 0);
      total += queuePending;
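    }
    return total;
  };
  // Fast path: nothing queued and no drain loop running.
  let remaining = getPendingCount();
  if (remaining === 0) {
    return { drained: true, remaining: 0 };
  }
  while (Date.now() < deadline) {
    // Sleep for one poll interval, or for the time left until the deadline if shorter.
    await new Promise<void>((resolve) => {
      const timer = setTimeout(resolve, Math.min(POLL_INTERVAL_MS, deadline - Date.now()));
      // Where supported, do not let the poll timer keep the process alive.
      timer.unref?.();
    });
    remaining = getPendingCount();
    if (remaining === 0) {
      return { drained: true, remaining: 0 };
    }
  }
  // Deadline reached with work still pending; report the last observed count.
  return { drained: false, remaining };
}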
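
The counting rule inside `getPendingCount` is what makes the result trustworthy: a queue whose `items` array is empty can still be processing a message that its drain loop has already dequeued. The sketch below reproduces that rule against a stand-in map so it runs without `./state.js`. The `FollowupQueue` type, `countPending`, and `demoQueues` are hypothetical names; only the `items` and `draining` fields are taken from the function above.

```typescript
// Hypothetical queue shape: only `items` and `draining` come from the code above.
type FollowupQueue = { items: string[]; draining: boolean };

// Same rule as getPendingCount(): queued items plus one per active drain loop.
function countPending(queues: Map<string, FollowupQueue>): number {
  let total = 0;
  for (const queue of queues.values()) {
    total += queue.items.length + (queue.draining ? 1 : 0);
  }
  return total;
}

// Stand-in for FOLLOWUP_QUEUES with one busy session and one idle session.
const demoQueues = new Map<string, FollowupQueue>([
  ["session-a", { items: ["m1", "m2"], draining: true }],
  ["session-b", { items: [], draining: false }],
]);

// Two queued items plus one in-flight item for session-a, nothing for session-b.
const pendingBefore = countPending(demoQueues);

// Simulate the drain loop for session-a dequeuing its last items:
// the queue is empty, but the loop is still working on a message.
const sessionA = demoQueues.get("session-a")!;
sessionA.items.length = 0;
const pendingMidDrain = countPending(demoQueues);

// Simulate the drain loop finishing.
sessionA.draining = false;
const pendingAfter = countPending(demoQueues);
```

Here `pendingMidDrain` stays above zero even though every `items` array is empty, which is the case the `+ 1` term exists to cover.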