fix: don't consume replyToMode=first slot for compaction notices

Compaction start/end notices are transient status messages that should be threaded (appear in-context) but must not advance the hasThreaded flag inside createReplyToModeFilter when mode=first.

Before this fix, the compaction start notice consumed the "first" slot, so every real assistant reply chunk that followed, including the first one, had replyToId stripped and was sent as an unthreaded top-level message.

Fix: skip advancing hasThreaded when payload.isCompactionNotice is true. The notice still keeps its replyToId (so it appears in the thread), but the filter's stateful "first" slot is preserved for the actual assistant reply that follows.
parent e7fd0a7b21
commit 1e381c6c8c
@@ -44,7 +44,13 @@ export function createReplyToModeFilter(
     if (hasThreaded) {
       return { ...payload, replyToId: undefined };
     }
-    hasThreaded = true;
+    // Compaction notices are transient status messages — they should be
+    // threaded (so they appear in-context), but they must not consume the
+    // "first" slot of the replyToMode=first filter. Skip advancing
+    // hasThreaded so the real assistant reply still gets replyToId.
+    if (!payload.isCompactionNotice) {
+      hasThreaded = true;
+    }
     return payload;
   };
 }
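The behavior this change aims for can be illustrated with a minimal, self-contained sketch. It is not the project's actual createReplyToModeFilter: the ReplyPayload shape, the createFirstOnlyFilter name, and the sample messages are assumptions made for illustration, and only the mode=first branch is modeled.

```typescript
// Hypothetical payload shape: only replyToId and isCompactionNotice
// appear in the diff; the text field is an assumption for this sketch.
interface ReplyPayload {
  text: string;
  replyToId?: string;
  isCompactionNotice?: boolean;
}

// Simplified stand-in for the mode=first branch of the filter.
function createFirstOnlyFilter(): (payload: ReplyPayload) => ReplyPayload {
  let hasThreaded = false;
  return (payload) => {
    if (hasThreaded) {
      // The "first" slot is already used: send as a top-level message.
      return { ...payload, replyToId: undefined };
    }
    // Compaction notices stay threaded but do not consume the slot.
    if (!payload.isCompactionNotice) {
      hasThreaded = true;
    }
    return payload;
  };
}

const filter = createFirstOnlyFilter();

const messages: ReplyPayload[] = [
  { text: "Compacting context", replyToId: "m1", isCompactionNotice: true },
  { text: "Here is the answer", replyToId: "m1" },
  { text: "(continued)", replyToId: "m1" },
];

const results = messages.map(filter);

// Notice and first real reply keep replyToId; the later chunk loses it.
console.log(results.map((r) => r.replyToId)); // [ 'm1', 'm1', undefined ]
```

With the pre-fix logic (an unconditional hasThreaded = true), the same input would thread only the compaction notice and strip replyToId from both real reply chunks.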