diff --git a/CHANGELOG.md b/CHANGELOG.md index 9343a5d34a4..2b33d3f4947 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -40,6 +40,7 @@ Docs: https://docs.openclaw.ai - Browser/Chrome MCP: remove the legacy Chrome extension relay path, bundled extension assets, `driver: "extension"`, and `browser.relayBindHost`. Run `openclaw doctor --fix` to migrate host-local browser config to `existing-session` / `user`; Docker, headless, sandbox, and remote browser flows still use raw CDP. (#47893) Thanks @vincentkoc. - Plugins/runtime: remove the public `openclaw/extension-api` surface with no compatibility shim. Bundled plugins must use injected runtime for host-side operations (for example `api.runtime.agent.runEmbeddedPiAgent`) and any remaining direct imports must come from narrow `openclaw/plugin-sdk/*` subpaths instead of the monolithic SDK root. +- Tools/image generation: standardize the stock image create/edit path on the core `image_generate` tool. The old `nano-banana-pro` docs/examples are gone; if you previously copied that sample-skill config, switch to `agents.defaults.imageGenerationModel` for built-in image generation or install a separate third-party skill explicitly. ### Fixes diff --git a/docs/gateway/configuration-examples.md b/docs/gateway/configuration-examples.md index 9767f2db674..5627f93395d 100644 --- a/docs/gateway/configuration-examples.md +++ b/docs/gateway/configuration-examples.md @@ -434,7 +434,7 @@ Save to `~/.openclaw/openclaw.json` and you can DM the bot from that number. nodeManager: "npm", }, entries: { - "nano-banana-pro": { + "image-lab": { enabled: true, apiKey: "GEMINI_KEY_HERE", env: { GEMINI_API_KEY: "GEMINI_KEY_HERE" }, diff --git a/docs/gateway/configuration-reference.md b/docs/gateway/configuration-reference.md index 9085c9c35f5..0913040949b 100644 --- a/docs/gateway/configuration-reference.md +++ b/docs/gateway/configuration-reference.md @@ -2371,7 +2371,7 @@ See [Local Models](/gateway/local-models). TL;DR: run MiniMax M2.5 via LM Studio nodeManager: "npm", // npm | pnpm | yarn }, entries: { - "nano-banana-pro": { + "image-lab": { apiKey: { source: "env", provider: "default", id: "GEMINI_API_KEY" }, // or plaintext string env: { GEMINI_API_KEY: "GEMINI_KEY_HERE" }, }, diff --git a/docs/gateway/configuration.md b/docs/gateway/configuration.md index 3ead49f6817..d15efb3384b 100644 --- a/docs/gateway/configuration.md +++ b/docs/gateway/configuration.md @@ -597,11 +597,11 @@ Rules: }, skills: { entries: { - "nano-banana-pro": { + "image-lab": { apiKey: { source: "file", provider: "filemain", - id: "/skills/entries/nano-banana-pro/apiKey", + id: "/skills/entries/image-lab/apiKey", }, }, }, diff --git a/docs/help/testing.md b/docs/help/testing.md index f3315fa6faa..2055db4373f 100644 --- a/docs/help/testing.md +++ b/docs/help/testing.md @@ -360,14 +360,29 @@ If you want to rely on env keys (e.g. exported in your `~/.profile`), run local - Enable: `BYTEPLUS_API_KEY=... BYTEPLUS_LIVE_TEST=1 pnpm test:live src/agents/byteplus.live.test.ts` - Optional model override: `BYTEPLUS_CODING_MODEL=ark-code-latest` -## Google image generation live +## Image generation live -- Test: `src/image-generation/providers/google.live.test.ts` -- Enable: `GOOGLE_LIVE_TEST=1 pnpm test:live src/image-generation/providers/google.live.test.ts` -- Key source: `GEMINI_API_KEY` or `GOOGLE_API_KEY` -- Optional overrides: - - `GOOGLE_IMAGE_GENERATION_MODEL=gemini-3.1-flash-image-preview` - - `GOOGLE_IMAGE_BASE_URL=https://generativelanguage.googleapis.com/v1beta` +- Test: `src/image-generation/runtime.live.test.ts` +- Command: `pnpm test:live src/image-generation/runtime.live.test.ts` +- Scope: + - Enumerates every registered image-generation provider plugin + - Loads missing provider env vars from your login shell (`~/.profile`) before probing + - Uses live/env API keys ahead of stored auth profiles by default, so stale test keys in `auth-profiles.json` do not mask real shell credentials + - Skips providers with no usable auth/profile/model + - Runs the stock image-generation variants through the shared runtime capability: + - `google:flash-generate` + - `google:pro-generate` + - `google:pro-edit` + - `openai:default-generate` +- Current bundled providers covered: + - `openai` + - `google` +- Optional narrowing: + - `OPENCLAW_LIVE_IMAGE_GENERATION_PROVIDERS="openai,google"` + - `OPENCLAW_LIVE_IMAGE_GENERATION_MODELS="openai/gpt-image-1,google/gemini-3.1-flash-image-preview"` + - `OPENCLAW_LIVE_IMAGE_GENERATION_CASES="google:flash-generate,google:pro-edit"` +- Optional auth behavior: + - `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides ## Docker runners (optional “works in Linux” checks) diff --git a/docs/tools/index.md b/docs/tools/index.md index deb42b0d76a..1dfe2b87703 100644 --- a/docs/tools/index.md +++ b/docs/tools/index.md @@ -400,6 +400,30 @@ Notes: - Only available when `agents.defaults.imageModel` is configured (primary or fallbacks), or when an implicit image model can be inferred from your default model + configured auth (best-effort pairing). - Uses the image model directly (independent of the main chat model). +### `image_generate` + +Generate one or more images with the configured image-generation model. + +Core parameters: + +- `action` (optional: `generate` or `list`; default `generate`) +- `prompt` (required) +- `image` or `images` (optional reference image path/URL for edit mode) +- `model` (optional provider/model override) +- `size` (optional size hint) +- `resolution` (optional `1K|2K|4K` hint) +- `count` (optional, `1-4`, default `1`) + +Notes: + +- Only available when `agents.defaults.imageGenerationModel` is configured. +- Use `action: "list"` to inspect registered providers, default models, supported model ids, sizes, resolutions, and edit support. +- Returns local `MEDIA:` lines so channels can deliver the generated files directly. +- Uses the image-generation model directly (independent of the main chat model). +- Google-backed flows support reference-image edits plus explicit `1K|2K|4K` resolution hints. +- When editing and `resolution` is omitted, OpenClaw infers a draft/final resolution from the input image size. +- This is the built-in replacement for the old sample `nano-banana-pro` skill workflow. Use `agents.defaults.imageGenerationModel`, not `skills.entries`, for stock image generation. + ### `pdf` Analyze one or more PDF documents. diff --git a/docs/tools/skills-config.md b/docs/tools/skills-config.md index 589d464bb13..697cb46dad6 100644 --- a/docs/tools/skills-config.md +++ b/docs/tools/skills-config.md @@ -24,7 +24,7 @@ All skills-related configuration lives under `skills` in `~/.openclaw/openclaw.j nodeManager: "npm", // npm | pnpm | yarn | bun (Gateway runtime still Node; bun not recommended) }, entries: { - "nano-banana-pro": { + "image-lab": { enabled: true, apiKey: { source: "env", provider: "default", id: "GEMINI_API_KEY" }, // or plaintext string env: { @@ -38,6 +38,10 @@ All skills-related configuration lives under `skills` in `~/.openclaw/openclaw.j } ``` +For built-in image generation/editing, prefer `agents.defaults.imageGenerationModel` +plus the core `image_generate` tool. `skills.entries.*` is only for custom or +third-party skill workflows. + ## Fields - `allowBundled`: optional allowlist for **bundled** skills only. When set, only diff --git a/docs/tools/skills.md b/docs/tools/skills.md index 05369677b89..5b91d79af59 100644 --- a/docs/tools/skills.md +++ b/docs/tools/skills.md @@ -81,8 +81,8 @@ that up as `/skills` on the next session. ```markdown --- -name: nano-banana-pro -description: Generate or edit images via Gemini 3 Pro Image +name: image-lab +description: Generate or edit images via a provider-backed image workflow --- ``` @@ -109,8 +109,8 @@ OpenClaw **filters skills at load time** using `metadata` (single-line JSON): ```markdown --- -name: nano-banana-pro -description: Generate or edit images via Gemini 3 Pro Image +name: image-lab +description: Generate or edit images via a provider-backed image workflow metadata: { "openclaw": @@ -194,7 +194,7 @@ Bundled/managed skills can be toggled and supplied with env values: { skills: { entries: { - "nano-banana-pro": { + "image-lab": { enabled: true, apiKey: { source: "env", provider: "default", id: "GEMINI_API_KEY" }, // or plaintext string env: { @@ -214,6 +214,10 @@ Bundled/managed skills can be toggled and supplied with env values: Note: if the skill name contains hyphens, quote the key (JSON5 allows quoted keys). +If you want stock image generation/editing inside OpenClaw itself, use the core +`image_generate` tool with `agents.defaults.imageGenerationModel` instead of a +bundled skill. Skill examples here are for custom or third-party workflows. + Config keys match the **skill name** by default. If a skill defines `metadata.openclaw.skillKey`, use that key under `skills.entries`. diff --git a/src/agents/skills.build-workspace-skills-prompt.syncs-merged-skills-into-target-workspace.test.ts b/src/agents/skills.build-workspace-skills-prompt.syncs-merged-skills-into-target-workspace.test.ts index 1f4da5163e1..b09571f540f 100644 --- a/src/agents/skills.build-workspace-skills-prompt.syncs-merged-skills-into-target-workspace.test.ts +++ b/src/agents/skills.build-workspace-skills-prompt.syncs-merged-skills-into-target-workspace.test.ts @@ -200,30 +200,30 @@ describe("buildWorkspaceSkillsPrompt", () => { }); it("filters skills based on env/config gates", async () => { const workspaceDir = await createCaseDir("workspace"); - const skillDir = path.join(workspaceDir, "skills", "nano-banana-pro"); + const skillDir = path.join(workspaceDir, "skills", "image-lab"); await writeSkill({ dir: skillDir, - name: "nano-banana-pro", + name: "image-lab", description: "Generates images", metadata: '{"openclaw":{"requires":{"env":["GEMINI_API_KEY"]},"primaryEnv":"GEMINI_API_KEY"}}', - body: "# Nano Banana\n", + body: "# Image Lab\n", }); withEnv({ GEMINI_API_KEY: undefined }, () => { const missingPrompt = buildPrompt(workspaceDir, { managedSkillsDir: path.join(workspaceDir, ".managed"), - config: { skills: { entries: { "nano-banana-pro": { apiKey: "" } } } }, + config: { skills: { entries: { "image-lab": { apiKey: "" } } } }, }); - expect(missingPrompt).not.toContain("nano-banana-pro"); + expect(missingPrompt).not.toContain("image-lab"); const enabledPrompt = buildPrompt(workspaceDir, { managedSkillsDir: path.join(workspaceDir, ".managed"), config: { - skills: { entries: { "nano-banana-pro": { apiKey: "test-key" } } }, // pragma: allowlist secret + skills: { entries: { "image-lab": { apiKey: "test-key" } } }, // pragma: allowlist secret }, }); - expect(enabledPrompt).toContain("nano-banana-pro"); + expect(enabledPrompt).toContain("image-lab"); }); }); it("applies skill filters, including empty lists", async () => {