P2-1 (agent-runner.ts): Restrict direct completion notice to
block-streaming runs. The condition now checks blockStreamingEnabled
in addition to opts?.onBlockReply, preventing duplicate completion
notices in non-streaming sessions where verboseNotices already handles
the compaction-complete text.
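A minimal sketch of the tightened guard, with the surrounding types assumed (only `blockStreamingEnabled` and `opts?.onBlockReply` are named in the change above):

```typescript
// Hypothetical reduction of the guard; Opts is an assumed shape.
type Opts = { onBlockReply?: (text: string) => Promise<void> };

// Before, only opts?.onBlockReply was checked, so non-streaming runs
// (where verboseNotices already carries the completion text) sent the
// notice twice. Now block streaming must also be enabled.
function shouldSendDirectNotice(
  blockStreamingEnabled: boolean,
  opts?: Opts,
): boolean {
  return blockStreamingEnabled && Boolean(opts?.onBlockReply);
}
```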
P2-2 (agent-runner-execution.ts): Emit compaction start notice when
streaming is off. blockReplyHandler is a no-op for non-streaming runs,
so add a direct fallback path: when blockStreamingEnabled is false and
opts.onBlockReply is present, send the start notice directly with
applyReplyToMode threading applied.
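A sketch of that fallback path under assumed names (`BlockReply` and the `applyReplyToMode` stand-in are illustrative, not the real signatures):

```typescript
// Assumed reply shape; the real one carries more fields.
type BlockReply = { text: string; replyToId?: string };

// Stand-in for the real applyReplyToMode threading helper.
const applyReplyToMode = (
  reply: BlockReply,
  currentMessageId?: string,
): BlockReply =>
  currentMessageId ? { ...reply, replyToId: currentMessageId } : reply;

async function sendStartNotice(
  blockStreamingEnabled: boolean,
  opts: { onBlockReply?: (r: BlockReply) => Promise<void> },
  currentMessageId?: string,
): Promise<void> {
  // blockReplyHandler is a no-op in this mode, so deliver directly.
  if (!blockStreamingEnabled && opts.onBlockReply) {
    await opts.onBlockReply(
      applyReplyToMode({ text: "🧹 Compacting context..." }, currentMessageId),
    );
  }
}
```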
Enqueueing the completion notice into blockReplyPipeline before flush
caused didStream() to return true even when no assistant content was
streamed. buildReplyPayloads drops all finalPayloads when didStream()
is true, so the real assistant reply could be silently discarded on
non-streaming model paths (e.g. pi-embedded-subscribe) that fill
assistantTexts without emitting block replies.
Fix: move the completion notice send to *after* pipeline flush+stop,
using a fire-and-forget Promise.race with blockReplyTimeoutMs. This
keeps the timeout guarantee (satisfying the previous P1) while not
touching didStream() at all.
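A sketch of that post-flush send, assuming names from the description (`blockReplyTimeoutMs`, a `sendNotice` callback); the caller ignores the returned promise, which is what makes it fire-and-forget:

```typescript
// Fire-and-forget delivery with a hard cap: whichever of the send or
// the timeout settles first wins, and delivery errors never fail the
// run. Because this happens after pipeline flush+stop, it cannot
// affect didStream().
function sendCompletionNotice(
  sendNotice: () => Promise<void>,
  blockReplyTimeoutMs: number,
): Promise<void> {
  const timeout = new Promise<void>((resolve) =>
    setTimeout(resolve, blockReplyTimeoutMs),
  );
  return Promise.race([sendNotice(), timeout]).catch(() => {});
}
```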
Non-streaming fallback (verboseNotices) is unchanged.
Addresses P1 review comment on PR #38805.
Previously the completion notice bypassed the block-reply pipeline by
calling opts.onBlockReply directly after the pipeline had already been
flushed and stopped. This meant timeout/abort handling and serial
delivery guarantees did not apply to the notice, risking stalls or
out-of-order delivery in streaming/routed runs.
Fix: enqueue the completion notice into blockReplyPipeline *before*
flush so it is delivered through the same path as every other block
reply. The non-streaming fallback (verboseNotices) is preserved for
runs where no pipeline exists.
Also removes the now-unnecessary direct opts.onBlockReply call and
cleans up the redundant suffix in the pre-flush path (count suffix is
still included in the verboseNotices fallback path where count is
available).
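A sketch of the enqueue-before-flush ordering this commit introduced, with the pipeline interface assumed:

```typescript
// Assumed pipeline surface; the real blockReplyPipeline has more to it.
interface BlockReplyPipeline {
  enqueue(text: string): void;
  flush(): Promise<void>;
  stop(): Promise<void>;
}

async function finishWithNotice(
  pipeline: BlockReplyPipeline,
  completionText: string,
): Promise<void> {
  // Enqueued before flush, so timeout/abort handling and serial
  // delivery apply to the notice like any other block reply.
  pipeline.enqueue(completionText);
  await pipeline.flush();
  await pipeline.stop();
}
```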
Addresses P1 review comment on PR #38805.
Compaction start and completion notices were sent via raw
opts.onBlockReply, bypassing createBlockReplyDeliveryHandler and the
applyReplyToMode pipeline. In channels configured with
replyToMode=all|first, this caused compaction notices to be delivered
as unthreaded top-level messages while all other replies stayed
threaded, which was inconsistent and noisy.
Fix agent-runner-execution.ts: extract createBlockReplyDeliveryHandler
result into blockReplyHandler and share it between onBlockReply and the
compaction start notice in onAgentEvent. Both now use the same handler.
Fix agent-runner.ts: inject currentMessageId + replyToCurrent into the
completion notice payload before passing through applyReplyToMode, so
threading directives are honoured consistently with normal replies.
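A sketch of the shared-handler idea with assumed, simplified semantics (the real `createBlockReplyDeliveryHandler` and `replyToMode` handling are richer than this):

```typescript
type Reply = { text: string; replyToId?: string };
type ReplyToMode = "off" | "first" | "all";

// One handler instance is created and shared by onBlockReply and the
// compaction notices, so every delivery follows the channel's
// replyToMode identically.
function createBlockReplyDeliveryHandler(
  mode: ReplyToMode,
  currentMessageId: string | undefined,
  deliver: (r: Reply) => void,
): (r: Reply) => void {
  let isFirst = true;
  return (reply) => {
    const thread =
      currentMessageId !== undefined &&
      (mode === "all" || (mode === "first" && isFirst));
    isFirst = false;
    deliver(thread ? { ...reply, replyToId: currentMessageId } : reply);
  };
}
```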
Closes the P2 review comment on PR #38805 (agent-runner.ts:701).
In block-streaming mode, the reply pipeline bypasses buildReplyPayloads,
so notices pushed only to verboseNotices were never delivered to the user.
The start notice ("🧹 Compacting context...") was already sent via
opts.onBlockReply directly in agent-runner-execution.ts; mirror the same
path for the completion notice.
- If opts.onBlockReply is present (streaming mode): await onBlockReply
with the completion text directly, so it reaches the user immediately.
- Otherwise (non-streaming): push to verboseNotices as before so it gets
prepended to the final payload batch.
Also consolidate the verbose vs. non-verbose text selection into a single
completionText variable, removing the redundant pop/push pattern.
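A sketch of the consolidated delivery, with the surrounding signature assumed (the notice texts are taken from the wording quoted elsewhere in this log):

```typescript
async function deliverCompletionNotice(
  verboseEnabled: boolean,
  count: number,
  opts: { onBlockReply?: (text: string) => Promise<void> },
  verboseNotices: string[],
): Promise<void> {
  // Single completionText instead of the old pop/push pattern.
  const completionText = verboseEnabled
    ? `🧹 Auto-compaction complete (count ${count}).`
    : `✅ Context compacted (count ${count}).`;
  if (opts.onBlockReply) {
    await opts.onBlockReply(completionText); // streaming: user sees it now
  } else {
    verboseNotices.push(completionText); // prepended to the final batch
  }
}
```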
During auto-compaction the agent goes silent for several seconds while
the context is summarised. Users on every channel (Discord, Feishu,
Telegram, webchat, …) had no indication that anything was happening,
leading to confusion and duplicate messages.
Changes:
- agent-runner-execution.ts: listen for compaction phase='start' event
and immediately deliver a "🧹 Compacting context..." notice via the
existing onBlockReply callback. This fires for every channel because
onBlockReply is the universal in-run delivery path.
- agent-runner.ts: make the completion notice unconditional (was
previously guarded behind verboseEnabled). Non-verbose users now see
"✅ Context compacted (count N)."; verbose users continue to see the
legacy "🧹 Auto-compaction complete (count N)." wording.
Why onBlockReply for start?
onBlockReply is already wired to every channel adapter and fires during
the live run, so the notice arrives in-band with zero new plumbing.
Using verboseNotices (appended after the run) would be too late and
would miss the start signal entirely.
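A sketch of the wiring under an assumed event shape (the real agent-event payload is richer; only `phase='start'` is named above):

```typescript
// Assumed event shape for illustration.
type AgentEvent = { type: string; phase?: "start" | "end" };

function makeOnAgentEvent(
  onBlockReply: (text: string) => Promise<void>,
): (ev: AgentEvent) => Promise<void> {
  return async (ev) => {
    if (ev.type === "compaction" && ev.phase === "start") {
      // In-band notice: reaches every channel adapter immediately,
      // because onBlockReply is the universal in-run delivery path.
      await onBlockReply("🧹 Compacting context...");
    }
  };
}
```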
Fixes: users seeing silent pauses of 5-15 s with no feedback during
compaction on any channel.
* fix: make cleanup "keep" persist subagent sessions indefinitely
* feat: expose subagent session metadata in sessions list
* fix: include status and timing in sessions_list tool
* fix: hide injected timestamp prefixes in chat ui
* feat: push session list updates over websocket
* feat: expose child subagent sessions in subagents list
* feat: add admin http endpoint to kill sessions
* Emit session.message websocket events for transcript updates
* Estimate session costs in sessions list
* Add direct session history HTTP and SSE endpoints
* Harden dashboard session events and history APIs
* Add session lifecycle gateway methods
* Add dashboard session API improvements
* Add dashboard session model and parent linkage support
* fix: tighten dashboard session API metadata
* Fix dashboard session cost metadata
* Persist accumulated session cost
* fix: stop followup queue drain cfg crash
* Fix dashboard session create and model metadata
* fix: stop guessing session model costs
* Gateway: cache OpenRouter pricing for configured models
* Gateway: add timeout session status
* Fix subagent spawn test config loading
* Gateway: preserve operator scopes without device identity
* Emit user message transcript events and deduplicate plugin warnings
* feat: emit sessions.changed lifecycle event on subagent spawn
Adds a session-lifecycle-events module (similar to transcript-events)
that emits create events when subagents are spawned. The gateway
server.impl.ts listens for these events and broadcasts sessions.changed
with reason=create to SSE subscribers, so dashboards can pick up new
subagent sessions without polling.
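A sketch of what such a module could look like; the event name `sessions.changed` and `reason=create` come from the description above, everything else (identifiers, payload shape) is assumed:

```typescript
import { EventEmitter } from "node:events";

type SessionsChanged = { sessionId: string; reason: "create" };

// Module-level emitter, modelled on the transcript-events pattern.
const sessionLifecycleEvents = new EventEmitter();

function emitSubagentSpawned(sessionId: string): void {
  const event: SessionsChanged = { sessionId, reason: "create" };
  sessionLifecycleEvents.emit("sessions.changed", event);
}

function onSessionsChanged(listener: (ev: SessionsChanged) => void): void {
  // The gateway server subscribes here and rebroadcasts the event to
  // SSE subscribers so dashboards pick up new sessions without polling.
  sessionLifecycleEvents.on("sessions.changed", listener);
}
```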
* Gateway: allow persistent dashboard orchestrator sessions
* fix: preserve operator scopes for token-authenticated backend clients
Backend clients (like agent-dashboard) that authenticate with a valid gateway
token but don't present a device identity were getting their scopes stripped.
The scope-clearing logic ran before checking the device identity decision,
so even when evaluateMissingDeviceIdentity returned 'allow' (because
roleCanSkipDeviceIdentity passed for token-authed operators), scopes were
already cleared.
Fix: also check decision.kind before clearing scopes, so token-authenticated
operators keep their requested scopes.
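A sketch of the reordered check with assumed names (only `decision.kind`, `evaluateMissingDeviceIdentity`'s 'allow' result, and the scope stripping are named above):

```typescript
type IdentityDecision = { kind: "allow" | "deny" };

function resolveOperatorScopes(
  requestedScopes: string[],
  hasDeviceIdentity: boolean,
  decision: IdentityDecision,
): string[] {
  // Check the decision before clearing: previously scopes were
  // stripped for any missing device identity, even when the decision
  // was "allow".
  if (!hasDeviceIdentity && decision.kind !== "allow") {
    return []; // identity required but missing: strip scopes
  }
  // Token-authenticated operators allowed to skip device identity
  // keep their requested scopes.
  return requestedScopes;
}
```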
* Gateway: allow operator-token session kills
* Fix stale active subagent status after follow-up runs
* Fix dashboard image attachments in sessions send
* Fix completed session follow-up status updates
* feat: stream session tool events to operator UIs
* Add sessions.steer gateway coverage
* Persist subagent timing in session store
* Fix subagent session transcript event keys
* Fix active subagent session status in gateway
* bump session label max to 512
* Fix gateway send session reactivation
* fix: publish terminal session lifecycle state
* feat: change default session reset to effectively never
- Change DEFAULT_RESET_MODE from "daily" to "idle"
- Change DEFAULT_IDLE_MINUTES from 60 to 0 (0 = disabled/never)
- Allow idleMinutes=0 through normalization (don't clamp to 1)
- Treat idleMinutes=0 as "no idle expiry" in evaluateSessionFreshness
- Default behavior: mode "idle" + idleMinutes 0 = sessions never auto-reset
- Update test assertion for new default mode
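A sketch of the idleMinutes=0 semantics described above; the constant names mirror the bullets, while the freshness function is an assumed simplification of evaluateSessionFreshness:

```typescript
const DEFAULT_RESET_MODE = "idle";
const DEFAULT_IDLE_MINUTES = 0; // 0 = disabled, i.e. never auto-reset

function isSessionFresh(
  idleMinutes: number,
  lastActiveMs: number,
  nowMs: number,
): boolean {
  if (idleMinutes === 0) return true; // 0 means no idle expiry at all
  return nowMs - lastActiveMs < idleMinutes * 60_000;
}
```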
* fix: prep session management followups (#50101) (thanks @clay-datacurve)
---------
Co-authored-by: Tyler Yust <TYTYYUST@YAHOO.COM>
* MiniMax: add M2.7 models and update default to M2.7
- Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to provider catalog and model definitions
- Update default model from MiniMax-M2.5 to MiniMax-M2.7 across onboard, portal, and provider configs
- Update isModernMiniMaxModel to recognize M2.7 prefix
- Update all test fixtures to reflect M2.7 as default
Made-with: Cursor
* MiniMax: add extension test for model definitions
* update 2.7
* feat: add MiniMax M2.7 models and update default (#49691) (thanks @liyuan97)
---------
Co-authored-by: George Zhang <georgezhangtj97@gmail.com>