Unconditionally unlinking the lock file after LOCK_TIMEOUT_MS is unsafe:
the holder may legitimately still be running (slow disk, large usage file),
so removing its lock breaks mutual exclusion and allows concurrent
read-modify-write cycles to overwrite each other's entries.
Remove the stale-lock-removal path entirely and throw ERR_LOCK_TIMEOUT
instead. Callers already swallow the error via .catch() in the write queue,
so the only effect is that the write is skipped rather than risking data
loss through a race.
Previously, once the retry loop timed out, withFileLock unconditionally
deleted the lock file and called fn() without reacquiring the lock. If multiple
waiters timed out concurrently they would all enter the critical section
together, defeating the serialisation guarantee and allowing concurrent
read-modify-write cycles to overwrite each other's records.
Fix: after unlinking the stale lock, attempt one final O_EXCL open so
that exactly one concurrent waiter wins the lock and the rest receive
ERR_LOCK_TIMEOUT. The unlocked fast-path is removed entirely.
readJsonArray treated any valid JSON that is not an array as [], causing
appendRecord to overwrite the file with only the new entry — silently
deleting all prior data. This is the same data-loss mode the
malformed-JSON fix was trying to prevent.
Fix: throw ERR_UNEXPECTED_TOKEN_LOG_SHAPE when parsed JSON is not an
array so appendRecord aborts and the existing file is preserved.
The in-memory writeQueues Map serialises writes within one Node process
but two concurrent OpenClaw processes sharing the same workspaceDir
(e.g. parallel CLI runs) can still race: both read the same snapshot
before either writes, and the later writer silently overwrites the
earlier entry.
Add withFileLock() — an O_EXCL advisory lock on <file>.lock — to
coordinate across processes. The per-file in-memory queue is kept to
reduce lock contention within the same process. On lock-acquire failure
the helper retries every 50 ms up to a 5 s timeout; on timeout it
removes a potentially stale lock file and makes one final attempt to
prevent permanent blocking after a crash.
pre-commit: guard the sourcing of resolve-node.sh with a file-existence
check so the hook works in test environments that stub only the files
they care about (the integration test creates run-node-tool.sh but not
resolve-node.sh; node is provided via a fake binary in PATH so the
nvm fallback is never needed in that context).
usage-log: replace Math.random() in makeId() with crypto.randomBytes()
to satisfy the temp-path-guard security lint rule that rejects weak
randomness in source files.
readJsonArray previously caught all errors and returned [], so a
malformed token-usage.json (e.g. from an interrupted writeFile) caused
the next recordTokenUsage call to overwrite the file with only the new
entry, permanently erasing all prior records.
Fix: only suppress ENOENT (file not yet created). Any other error
(SyntaxError, EACCES, …) is re-thrown so appendRecord aborts and the
existing file is left intact. The write-queue slot still absorbs the
rejection via .catch() so future writes are not stalled; callers that
need to observe the failure (e.g. attempt.ts) can attach their own
.catch() handler.
taskId was set to params.runId, the same value already stored in the
runId field, giving downstream consumers two identical fields with
different names. Remove taskId from the type and the entry constructor
to avoid confusion.
Fire-and-forget callers (attempt.ts) can trigger two concurrent
recordTokenUsage() calls for the same workspaceDir. The previous
read-modify-write pattern had no locking, so the last writer silently
overwrote the first, losing that run's entry.
Fix: keep a Map<file, Promise<void>> write queue so each write awaits
the previous one. The queue slot is replaced with a no-throw wrapper so
a failed write does not stall future writes.
Added a concurrent-write test (20 parallel calls) that asserts no
record is lost.
The recordTokenUsage function previously only persisted the aggregate tokensUsed
total, discarding the input/output breakdown that was already available via
getUsageTotals(). This meant token-usage.json had no per-record IO split,
making it impossible to analyse input vs output token costs in dashboards.
Changes:
- Add inputTokens, outputTokens, cacheReadTokens, cacheWriteTokens optional
fields to TokenUsageRecord type in usage-log.ts (new file)
- Write these fields (when non-zero) into each usage entry
- Fields are omitted (not null) when unavailable, keeping existing records valid
- Wire up recordTokenUsage() call in attempt.ts after llm_output hook
This is a purely additive change; existing consumers that only read tokensUsed
are unaffected.