5 Commits

Author             SHA1        Message                                               Date
Peter Steinberger  f00038b383  fix(testing): stabilize live model runs               2026-01-11 04:22:35 +00:00
Peter Steinberger  20b4e2b859  fix: stabilize live probes and docs                   2026-01-11 02:26:39 +00:00
Peter Steinberger  aa30995aa1  test(live): add provider filters + google skip rules  2026-01-10 21:16:59 +00:00
Peter Steinberger  212b13b099  fix: repair tool-use history for anthropic            2026-01-10 19:15:57 +00:00
Peter Steinberger  cb10682d3e  fix(openai): avoid invalid reasoning replay           2026-01-10 00:45:10 +00:00