Does Hermes require a built-in Chutes provider?

No. Use the custom OpenAI-compatible provider shape from the toolkit.

Can Hermes use default:latency with no setup?

No. default:latency requires a saved Model Routing pool. Use a concrete model ID or inline pool for zero setup.

Hermes

Run Hermes through Chutes.

Hermes can call Chutes as a named OpenAI-compatible provider. Keep CHUTES_API_KEY in the environment, discover models live, and use routing aliases only after the dashboard pool exists.

Open Hermes guide Connect page

Hermes draft Basic config source Cheap routing source Hermes recipes

Quick config

The minimum to point this client at Chutes. Every value below is rendered from the live catalog or is a stable endpoint fact — copy it and go.

Zero-setup Hermes config

yaml

providers:
  chutes:
    name: Chutes
    base_url: https://llm.chutes.ai/v1
    key_env: CHUTES_API_KEY
    transport: chat_completions
    discover_models: true
    models:
      "Qwen/Qwen3-32B-TEE": {}

model:
  provider: custom:chutes
  default: "Qwen/Qwen3-32B-TEE"

Provider URL

https://llm.chutes.ai/v1

Auth env

CHUTES_API_KEY=cpk_...

Live model example

Qwen/Qwen3-32B-TEE

Catalog checked

Jun 25, 2026

60-second setup

Store the key

Put CHUTES_API_KEY in the Hermes environment or secret store, not in checked-in YAML.

Add the provider

Use base_url https://llm.chutes.ai/v1 with chat_completions transport.

Pick a model path

Use a live model ID first. Add default:latency only after a Model Routing pool exists.

Toolkit configs and routing caveat

The toolkit config shows saved aliases. Treat them as post-setup examples: the zero setup path is a concrete model ID from the live catalog.

Inline pool from current catalog

yaml

# Live-picked inline pool, no saved alias required.
default_model: "unsloth/Mistral-Nemo-Instruct-2407-TEE,google/gemma-4-31B-turbo-TEE,Qwen/Qwen3-32B-TEE:latency"

Source-copied Hermes config

yaml

# Source: chutes-agent-toolkit/other-agents/hermes/config-examples/chutes-basic.yaml
# ~/.hermes/config.yaml
# Chutes as a named Hermes provider entry.
# Put CHUTES_API_KEY=*** in ~/.hermes/.env, not in this file.

providers:
  chutes:
    name: Chutes
    base_url: https://llm.chutes.ai/v1
    key_env: CHUTES_API_KEY
    transport: chat_completions
    default_model: default:latency
    discover_models: true
    models:
      default: {}
      "default:latency": {}
      "default:throughput": {}

model:
  provider: custom:chutes
  default: default:latency

Hermes prompt lanes

Prompt

Use Chutes for this chat and show the chosen model.

Hermes uses

The chutes provider entry plus live model discovery.

Prompt

Pick a cheap TEE-backed pool for agentic work.

Hermes uses

The same picker rules as the toolkit, then an inline routing string.

Cheap routing example

This source example includes routing aliases. Keep the dashboard caveat visible anywhere it is rendered.

Open source

Toolkit cheap routing config

yaml

# Source: chutes-agent-toolkit/other-agents/hermes/config-examples/chutes-cheap-routing.yaml
# ~/.hermes/config.yaml
# Keep your primary model/provider elsewhere, and let Hermes route cheap/simple work to Chutes.

providers:
  chutes:
    name: Chutes
    base_url: https://llm.chutes.ai/v1
    key_env: CHUTES_API_KEY
    transport: chat_completions
    default_model: default:latency
    discover_models: true
    models:
      default: {}
      "default:latency": {}
      "default:throughput": {}

smart_model_routing:
  enabled: true
  cheap_model:
    provider: custom:chutes
    model: default:latency

Troubleshooting

Symptom

Hermes cannot resolve default:latency

Likely cause

The account has no saved default routing pool.

Fix

Use a concrete model ID or configure Model Routing once in the dashboard.

Symptom

Completions hit anonymous rate limits

Likely cause

The key is missing or sent as X-API-Key.

Fix

Set CHUTES_API_KEY and send Bearer auth through the provider.

Symptom

Feature-specific prompt fails

Likely cause

The selected model may not expose that feature.

Fix

Read supported_features from /v1/models before selecting the model.