zidongdesign 1e381c6c8c
fix: don't consume replyToMode=first slot for compaction notices
Compaction start/end notices are transient status messages that should
be threaded (appear in-context) but must not advance the hasThreaded
flag inside createReplyToModeFilter when mode=first.

Before this fix, the compaction start notice was the "first" threaded
message, so all real assistant reply chunks that followed had replyToId
stripped and were sent as unthreaded top-level messages.

Fix: skip advancing hasThreaded when payload.isCompactionNotice is true.
The notice still receives replyToId (so it appears in the thread), but
the filter's stateful "first" slot is preserved for the actual assistant
reply that follows.
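
The behavior described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the `ReplyPayload` shape, the `ReplyToMode` values other than `first`, and the `stripReplyTo` helper are assumptions made for the example. Only `createReplyToModeFilter`, `hasThreaded`, `replyToId`, and `isCompactionNotice` are named in the commit message.

```typescript
// Hypothetical types: the real payload and mode definitions may differ.
type ReplyToMode = "off" | "first" | "all";

interface ReplyPayload {
  text: string;
  replyToId?: string;
  isCompactionNotice?: boolean;
}

// Hypothetical helper: returns a copy of the payload without replyToId.
function stripReplyTo(payload: ReplyPayload): ReplyPayload {
  const copy: ReplyPayload = { ...payload };
  delete copy.replyToId;
  return copy;
}

function createReplyToModeFilter(mode: ReplyToMode) {
  // Stateful flag: becomes true once the single "first" slot has been used.
  let hasThreaded = false;

  return (payload: ReplyPayload): ReplyPayload => {
    if (!payload.replyToId) return payload;
    if (mode === "off") return stripReplyTo(payload);
    if (mode === "all") return payload;

    // mode === "first": only one real reply keeps its replyToId.
    // Assumption: a notice arriving after the slot is used is stripped
    // like any other payload; the commit message does not cover that case.
    if (hasThreaded) return stripReplyTo(payload);

    // The fix: a compaction notice keeps replyToId (so it is threaded)
    // but does not consume the "first" slot.
    if (!payload.isCompactionNotice) {
      hasThreaded = true;
    }
    return payload;
  };
}

// Usage: the notice and the first real chunk are both threaded;
// the second real chunk is sent unthreaded.
const filter = createReplyToModeFilter("first");
const notice = filter({ text: "Compacting...", replyToId: "m1", isCompactionNotice: true });
const firstChunk = filter({ text: "Hello", replyToId: "m1" });
const secondChunk = filter({ text: "world", replyToId: "m1" });
```

Without the `isCompactionNotice` check, `hasThreaded` would flip to true on the notice, and `firstChunk` would lose its `replyToId`, which is the bug this commit describes.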
2026-03-20 19:52:27 -07:00