openclaw/docs/providers/huggingface.md

---
summary: "Hugging Face Inference setup (auth + model selection)"
read_when:
  - You want to use Hugging Face Inference with OpenClaw
  - You need the HF token env var or CLI auth choice
title: "Hugging Face (Inference)"
---

# Hugging Face (Inference)

[Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers) offer OpenAI-compatible chat completions through a single router API. You get access to many models (DeepSeek, Llama, and more) with one token. OpenClaw uses the **OpenAI-compatible endpoint** (chat completions only); for text-to-image, embeddings, or speech use the [HF inference clients](https://huggingface.co/docs/api-inference/quicktour) directly.

- Provider: `huggingface`
- Auth: `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` (fine-grained token with **Make calls to Inference Providers**)
- API: OpenAI-compatible (`https://router.huggingface.co/v1`)
- Billing: Single HF token; [pricing](https://huggingface.co/docs/inference-providers/pricing) follows provider rates with a free tier.

## Quick start

1. Create a fine-grained token at [Hugging Face → Settings → Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained) with the **Make calls to Inference Providers** permission.
2. Run onboarding and choose **Hugging Face** in the provider dropdown, then enter your API key when prompted:

```bash
openclaw onboard --auth-choice huggingface-api-key
```

3. In the **Default Hugging Face model** dropdown, pick the model you want (the list is loaded from the Inference API when you have a valid token; otherwise a built-in list is shown). Your choice is saved as the default model.
4. You can also set or change the default model later in config:

```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/deepseek-ai/DeepSeek-R1" },
    },
  },
}
```

## Non-interactive example

```bash
openclaw onboard --non-interactive \
  --mode local \
  --auth-choice huggingface-api-key \
  --huggingface-api-key "$HF_TOKEN"
```

This will set `huggingface/deepseek-ai/DeepSeek-R1` as the default model.

## Environment note

If the Gateway runs as a daemon (launchd/systemd), make sure `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN`
is available to that process (for example, in `~/.openclaw/.env` or via
`env.shellEnv`).

## Model discovery and onboarding dropdown

OpenClaw discovers models by calling the **Inference endpoint directly**:

```bash
GET https://router.huggingface.co/v1/models
```

(Optional: send `Authorization: Bearer $HUGGINGFACE_HUB_TOKEN` or `$HF_TOKEN` for the full list; some endpoints return a subset without auth.) The response is OpenAI-style `{ "object": "list", "data": [ { "id": "Qwen/Qwen3-8B", "owned_by": "Qwen", ... }, ... ] }`.

When you configure a Hugging Face API key (via onboarding, `HUGGINGFACE_HUB_TOKEN`, or `HF_TOKEN`), OpenClaw uses this GET to discover available chat-completion models. During **interactive setup**, after you enter your token you see a **Default Hugging Face model** dropdown populated from that list (or the built-in catalog if the request fails). At runtime (e.g. Gateway startup), when a key is present, OpenClaw again calls **GET** `https://router.huggingface.co/v1/models` to refresh the catalog. The list is merged with a built-in catalog (for metadata like context window and cost). If the request fails or no key is set, only the built-in catalog is used.

## Model names and editable options

- **Name from API:** The model display name is **hydrated from GET /v1/models** when the API returns `name`, `title`, or `display_name`; otherwise it is derived from the model id (e.g. `deepseek-ai/DeepSeek-R1` → “DeepSeek R1”).
- **Override display name:** You can set a custom label per model in config so it appears the way you want in the CLI and UI:

```json5
{
  agents: {
    defaults: {
      models: {
        "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1 (fast)" },
        "huggingface/deepseek-ai/DeepSeek-R1:cheapest": { alias: "DeepSeek R1 (cheap)" },
      },
    },
  },
}
```

- **Provider / policy selection:** Append a suffix to the **model id** to choose how the router picks the backend:
  - **`:fastest`** — highest throughput (router picks; provider choice is **locked** — no interactive backend picker).
  - **`:cheapest`** — lowest cost per output token (router picks; provider choice is **locked**).
  - **`:provider`** — force a specific backend (e.g. `:sambanova`, `:together`).

  When you select **:cheapest** or **:fastest** (e.g. in the onboarding model dropdown), the provider is locked: the router decides by cost or speed and no optional “prefer specific backend” step is shown. You can add these as separate entries in `models.providers.huggingface.models` or set `model.primary` with the suffix. You can also set your default order in [Inference Provider settings](https://hf.co/settings/inference-providers) (no suffix = use that order).

- **Config merge:** Existing entries in `models.providers.huggingface.models` (e.g. in `models.json`) are kept when config is merged. So any custom `name`, `alias`, or model options you set there are preserved.

## Model IDs and configuration examples

Model refs use the form `huggingface/<org>/<model>` (Hub-style IDs). The list below is from **GET** `https://router.huggingface.co/v1/models`; your catalog may include more.

**Example IDs (from the inference endpoint):**

| Model                  | Ref (prefix with `huggingface/`)    |
| ---------------------- | ----------------------------------- |
| DeepSeek R1            | `deepseek-ai/DeepSeek-R1`           |
| DeepSeek V3.2          | `deepseek-ai/DeepSeek-V3.2`         |
| Qwen3 8B               | `Qwen/Qwen3-8B`                     |
| Qwen2.5 7B Instruct    | `Qwen/Qwen2.5-7B-Instruct`          |
| Qwen3 32B              | `Qwen/Qwen3-32B`                    |
| Llama 3.3 70B Instruct | `meta-llama/Llama-3.3-70B-Instruct` |
| Llama 3.1 8B Instruct  | `meta-llama/Llama-3.1-8B-Instruct`  |
| GPT-OSS 120B           | `openai/gpt-oss-120b`               |
| GLM 4.7                | `zai-org/GLM-4.7`                   |
| Kimi K2.5              | `moonshotai/Kimi-K2.5`              |

You can append `:fastest`, `:cheapest`, or `:provider` (e.g. `:together`, `:sambanova`) to the model id. Set your default order in [Inference Provider settings](https://hf.co/settings/inference-providers); see [Inference Providers](https://huggingface.co/docs/inference-providers) and **GET** `https://router.huggingface.co/v1/models` for the full list.

### Complete configuration examples

**Primary DeepSeek R1 with Qwen fallback:**

```json5
{
  agents: {
    defaults: {
      model: {
        primary: "huggingface/deepseek-ai/DeepSeek-R1",
        fallbacks: ["huggingface/Qwen/Qwen3-8B"],
      },
      models: {
        "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1" },
        "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" },
      },
    },
  },
}
```

**Qwen as default, with :cheapest and :fastest variants:**

```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/Qwen/Qwen3-8B" },
      models: {
        "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" },
        "huggingface/Qwen/Qwen3-8B:cheapest": { alias: "Qwen3 8B (cheapest)" },
        "huggingface/Qwen/Qwen3-8B:fastest": { alias: "Qwen3 8B (fastest)" },
      },
    },
  },
}
```

**DeepSeek + Llama + GPT-OSS with aliases:**

```json5
{
  agents: {
    defaults: {
      model: {
        primary: "huggingface/deepseek-ai/DeepSeek-V3.2",
        fallbacks: [
          "huggingface/meta-llama/Llama-3.3-70B-Instruct",
          "huggingface/openai/gpt-oss-120b",
        ],
      },
      models: {
        "huggingface/deepseek-ai/DeepSeek-V3.2": { alias: "DeepSeek V3.2" },
        "huggingface/meta-llama/Llama-3.3-70B-Instruct": { alias: "Llama 3.3 70B" },
        "huggingface/openai/gpt-oss-120b": { alias: "GPT-OSS 120B" },
      },
    },
  },
}
```

**Force a specific backend with :provider:**

```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/deepseek-ai/DeepSeek-R1:together" },
      models: {
        "huggingface/deepseek-ai/DeepSeek-R1:together": { alias: "DeepSeek R1 (Together)" },
      },
    },
  },
}
```

**Multiple Qwen and DeepSeek models with policy suffixes:**

```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest" },
      models: {
        "huggingface/Qwen/Qwen2.5-7B-Instruct": { alias: "Qwen2.5 7B" },
        "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest": { alias: "Qwen2.5 7B (cheap)" },
        "huggingface/deepseek-ai/DeepSeek-R1:fastest": { alias: "DeepSeek R1 (fast)" },
        "huggingface/meta-llama/Llama-3.1-8B-Instruct": { alias: "Llama 3.1 8B" },
      },
    },
  },
}
```
feat(agents) : Hugging Face Inference provider first-class support and Together API fix and Direct Injection Refactor Auths [AI-assisted] (#13472) * initial commit * removes assesment from docs * resolves automated review comments * resolves lint , type , tests , refactors , and submits * solves : why do we have to lint the tests xD * adds greptile fixes * solves a type error * solves a ci error * refactors auths * solves a failing test after i pulled from main lol * solves a failing test after i pulled from main lol * resolves token naming issue to comply with better practices when using hf / huggingface * fixes curly lints ! * fixes failing tests for google api from main * solve merge conflicts * solve failing tests with a defensive check 'undefined' openrouterapi key * fix: preserve Hugging Face auth-choice intent and token behavior (#13472) (thanks @Josephrp) * test: resolve auth-choice cherry-pick conflict cleanup (#13472) --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Peter Steinberger <steipete@gmail.com> 2026-02-13 16:18:16 +01:00			`---`
			`summary: "Hugging Face Inference setup (auth + model selection)"`
			`read_when:`
			`- You want to use Hugging Face Inference with OpenClaw`
			`- You need the HF token env var or CLI auth choice`
			`title: "Hugging Face (Inference)"`
			`---`

			`# Hugging Face (Inference)`

			`[Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers) offer OpenAI-compatible chat completions through a single router API. You get access to many models (DeepSeek, Llama, and more) with one token. OpenClaw uses the OpenAI-compatible endpoint (chat completions only); for text-to-image, embeddings, or speech use the [HF inference clients](https://huggingface.co/docs/api-inference/quicktour) directly.`

			- Provider: `huggingface`
			- Auth: `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` (fine-grained token with Make calls to Inference Providers)
			- API: OpenAI-compatible (`https://router.huggingface.co/v1`)
			`- Billing: Single HF token; [pricing](https://huggingface.co/docs/inference-providers/pricing) follows provider rates with a free tier.`

			`## Quick start`

			`1. Create a fine-grained token at [Hugging Face → Settings → Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained) with the Make calls to Inference Providers permission.`
			`2. Run onboarding and choose Hugging Face in the provider dropdown, then enter your API key when prompted:`

			```bash
docs: restore onboard docs references 2026-03-16 05:50:48 +00:00			`openclaw onboard --auth-choice huggingface-api-key`
feat(agents) : Hugging Face Inference provider first-class support and Together API fix and Direct Injection Refactor Auths [AI-assisted] (#13472) * initial commit * removes assesment from docs * resolves automated review comments * resolves lint , type , tests , refactors , and submits * solves : why do we have to lint the tests xD * adds greptile fixes * solves a type error * solves a ci error * refactors auths * solves a failing test after i pulled from main lol * solves a failing test after i pulled from main lol * resolves token naming issue to comply with better practices when using hf / huggingface * fixes curly lints ! * fixes failing tests for google api from main * solve merge conflicts * solve failing tests with a defensive check 'undefined' openrouterapi key * fix: preserve Hugging Face auth-choice intent and token behavior (#13472) (thanks @Josephrp) * test: resolve auth-choice cherry-pick conflict cleanup (#13472) --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Peter Steinberger <steipete@gmail.com> 2026-02-13 16:18:16 +01:00			```

			`3. In the Default Hugging Face model dropdown, pick the model you want (the list is loaded from the Inference API when you have a valid token; otherwise a built-in list is shown). Your choice is saved as the default model.`
			`4. You can also set or change the default model later in config:`

			```json5
			`{`
			`agents: {`
			`defaults: {`
			`model: { primary: "huggingface/deepseek-ai/DeepSeek-R1" },`
			`},`
			`},`
			`}`
			```

			`## Non-interactive example`

			```bash
docs: restore onboard docs references 2026-03-16 05:50:48 +00:00			`openclaw onboard --non-interactive \`
feat(agents) : Hugging Face Inference provider first-class support and Together API fix and Direct Injection Refactor Auths [AI-assisted] (#13472) * initial commit * removes assesment from docs * resolves automated review comments * resolves lint , type , tests , refactors , and submits * solves : why do we have to lint the tests xD * adds greptile fixes * solves a type error * solves a ci error * refactors auths * solves a failing test after i pulled from main lol * solves a failing test after i pulled from main lol * resolves token naming issue to comply with better practices when using hf / huggingface * fixes curly lints ! * fixes failing tests for google api from main * solve merge conflicts * solve failing tests with a defensive check 'undefined' openrouterapi key * fix: preserve Hugging Face auth-choice intent and token behavior (#13472) (thanks @Josephrp) * test: resolve auth-choice cherry-pick conflict cleanup (#13472) --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Peter Steinberger <steipete@gmail.com> 2026-02-13 16:18:16 +01:00			`--mode local \`
			`--auth-choice huggingface-api-key \`
			`--huggingface-api-key "$HF_TOKEN"`
			```

			This will set `huggingface/deepseek-ai/DeepSeek-R1` as the default model.

			`## Environment note`

			If the Gateway runs as a daemon (launchd/systemd), make sure `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN`
			is available to that process (for example, in `~/.openclaw/.env` or via
			`env.shellEnv`).

			`## Model discovery and onboarding dropdown`

			`OpenClaw discovers models by calling the Inference endpoint directly:`

			```bash
			`GET https://router.huggingface.co/v1/models`
			```

			(Optional: send `Authorization: Bearer $HUGGINGFACE_HUB_TOKEN` or `$HF_TOKEN` for the full list; some endpoints return a subset without auth.) The response is OpenAI-style `{ "object": "list", "data": [ { "id": "Qwen/Qwen3-8B", "owned_by": "Qwen", ... }, ... ] }`.

docs: update setup wizard wording 2026-03-15 21:39:36 -07:00			When you configure a Hugging Face API key (via onboarding, `HUGGINGFACE_HUB_TOKEN`, or `HF_TOKEN`), OpenClaw uses this GET to discover available chat-completion models. During interactive setup, after you enter your token you see a Default Hugging Face model dropdown populated from that list (or the built-in catalog if the request fails). At runtime (e.g. Gateway startup), when a key is present, OpenClaw again calls GET `https://router.huggingface.co/v1/models` to refresh the catalog. The list is merged with a built-in catalog (for metadata like context window and cost). If the request fails or no key is set, only the built-in catalog is used.
feat(agents) : Hugging Face Inference provider first-class support and Together API fix and Direct Injection Refactor Auths [AI-assisted] (#13472) * initial commit * removes assesment from docs * resolves automated review comments * resolves lint , type , tests , refactors , and submits * solves : why do we have to lint the tests xD * adds greptile fixes * solves a type error * solves a ci error * refactors auths * solves a failing test after i pulled from main lol * solves a failing test after i pulled from main lol * resolves token naming issue to comply with better practices when using hf / huggingface * fixes curly lints ! * fixes failing tests for google api from main * solve merge conflicts * solve failing tests with a defensive check 'undefined' openrouterapi key * fix: preserve Hugging Face auth-choice intent and token behavior (#13472) (thanks @Josephrp) * test: resolve auth-choice cherry-pick conflict cleanup (#13472) --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Peter Steinberger <steipete@gmail.com> 2026-02-13 16:18:16 +01:00
			`## Model names and editable options`

			- Name from API: The model display name is hydrated from GET /v1/models when the API returns `name`, `title`, or `display_name`; otherwise it is derived from the model id (e.g. `deepseek-ai/DeepSeek-R1` → “DeepSeek R1”).
			`- Override display name: You can set a custom label per model in config so it appears the way you want in the CLI and UI:`

			```json5
			`{`
			`agents: {`
			`defaults: {`
			`models: {`
			`"huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1 (fast)" },`
			`"huggingface/deepseek-ai/DeepSeek-R1:cheapest": { alias: "DeepSeek R1 (cheap)" },`
			`},`
			`},`
			`},`
			`}`
			```

			`- Provider / policy selection: Append a suffix to the model id to choose how the router picks the backend:`
			- `:fastest` — highest throughput (router picks; provider choice is locked — no interactive backend picker).
			- `:cheapest` — lowest cost per output token (router picks; provider choice is locked).
			- `:provider` — force a specific backend (e.g. `:sambanova`, `:together`).

			When you select :cheapest or :fastest (e.g. in the onboarding model dropdown), the provider is locked: the router decides by cost or speed and no optional “prefer specific backend” step is shown. You can add these as separate entries in `models.providers.huggingface.models` or set `model.primary` with the suffix. You can also set your default order in [Inference Provider settings](https://hf.co/settings/inference-providers) (no suffix = use that order).

			- Config merge: Existing entries in `models.providers.huggingface.models` (e.g. in `models.json`) are kept when config is merged. So any custom `name`, `alias`, or model options you set there are preserved.

			`## Model IDs and configuration examples`

			Model refs use the form `huggingface/<org>/<model>` (Hub-style IDs). The list below is from GET `https://router.huggingface.co/v1/models`; your catalog may include more.

			`Example IDs (from the inference endpoint):`

			\| Model \| Ref (prefix with `huggingface/`) \|
			`\| ---------------------- \| ----------------------------------- \|`
			\| DeepSeek R1 \| `deepseek-ai/DeepSeek-R1` \|
			\| DeepSeek V3.2 \| `deepseek-ai/DeepSeek-V3.2` \|
			\| Qwen3 8B \| `Qwen/Qwen3-8B` \|
			\| Qwen2.5 7B Instruct \| `Qwen/Qwen2.5-7B-Instruct` \|
			\| Qwen3 32B \| `Qwen/Qwen3-32B` \|
			\| Llama 3.3 70B Instruct \| `meta-llama/Llama-3.3-70B-Instruct` \|
			\| Llama 3.1 8B Instruct \| `meta-llama/Llama-3.1-8B-Instruct` \|
			\| GPT-OSS 120B \| `openai/gpt-oss-120b` \|
			\| GLM 4.7 \| `zai-org/GLM-4.7` \|
			\| Kimi K2.5 \| `moonshotai/Kimi-K2.5` \|

			You can append `:fastest`, `:cheapest`, or `:provider` (e.g. `:together`, `:sambanova`) to the model id. Set your default order in [Inference Provider settings](https://hf.co/settings/inference-providers); see [Inference Providers](https://huggingface.co/docs/inference-providers) and GET `https://router.huggingface.co/v1/models` for the full list.

			`### Complete configuration examples`

			`Primary DeepSeek R1 with Qwen fallback:`

			```json5
			`{`
			`agents: {`
			`defaults: {`
			`model: {`
			`primary: "huggingface/deepseek-ai/DeepSeek-R1",`
			`fallbacks: ["huggingface/Qwen/Qwen3-8B"],`
			`},`
			`models: {`
			`"huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1" },`
			`"huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" },`
			`},`
			`},`
			`},`
			`}`
			```

			`Qwen as default, with :cheapest and :fastest variants:`

			```json5
			`{`
			`agents: {`
			`defaults: {`
			`model: { primary: "huggingface/Qwen/Qwen3-8B" },`
			`models: {`
			`"huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" },`
			`"huggingface/Qwen/Qwen3-8B:cheapest": { alias: "Qwen3 8B (cheapest)" },`
			`"huggingface/Qwen/Qwen3-8B:fastest": { alias: "Qwen3 8B (fastest)" },`
			`},`
			`},`
			`},`
			`}`
			```

			`DeepSeek + Llama + GPT-OSS with aliases:`

			```json5
			`{`
			`agents: {`
			`defaults: {`
			`model: {`
			`primary: "huggingface/deepseek-ai/DeepSeek-V3.2",`
			`fallbacks: [`
			`"huggingface/meta-llama/Llama-3.3-70B-Instruct",`
			`"huggingface/openai/gpt-oss-120b",`
			`],`
			`},`
			`models: {`
			`"huggingface/deepseek-ai/DeepSeek-V3.2": { alias: "DeepSeek V3.2" },`
			`"huggingface/meta-llama/Llama-3.3-70B-Instruct": { alias: "Llama 3.3 70B" },`
			`"huggingface/openai/gpt-oss-120b": { alias: "GPT-OSS 120B" },`
			`},`
			`},`
			`},`
			`}`
			```

			`Force a specific backend with :provider:`

			```json5
			`{`
			`agents: {`
			`defaults: {`
			`model: { primary: "huggingface/deepseek-ai/DeepSeek-R1:together" },`
			`models: {`
			`"huggingface/deepseek-ai/DeepSeek-R1:together": { alias: "DeepSeek R1 (Together)" },`
			`},`
			`},`
			`},`
			`}`
			```

			`Multiple Qwen and DeepSeek models with policy suffixes:`

			```json5
			`{`
			`agents: {`
			`defaults: {`
			`model: { primary: "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest" },`
			`models: {`
			`"huggingface/Qwen/Qwen2.5-7B-Instruct": { alias: "Qwen2.5 7B" },`
			`"huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest": { alias: "Qwen2.5 7B (cheap)" },`
			`"huggingface/deepseek-ai/DeepSeek-R1:fastest": { alias: "DeepSeek R1 (fast)" },`
			`"huggingface/meta-llama/Llama-3.1-8B-Instruct": { alias: "Llama 3.1 8B" },`
			`},`
			`},`
			`},`
			`}`
			```