Hermes

Run Hermes through Chutes.

Hermes can call Chutes as a named OpenAI-compatible provider. Keep CHUTES_API_KEY in the environment, discover models live, and use routing aliases only after the dashboard pool exists.
Hermes draftBasic config sourceCheap routing sourceHermes recipes

Quick config

The minimum to point this client at Chutes. Every value below is rendered from the live catalog or is a stable endpoint fact — copy it and go.

Zero-setup Hermes config
yaml
providers:
  chutes:
    name: Chutes
    base_url: https://llm.chutes.ai/v1
    key_env: CHUTES_API_KEY
    transport: chat_completions
    discover_models: true
    models:
      "Qwen/Qwen3-32B-TEE": {}

model:
  provider: custom:chutes
  default: "Qwen/Qwen3-32B-TEE"
Provider URL
https://llm.chutes.ai/v1
Auth env
CHUTES_API_KEY=cpk_...
Live model example
Qwen/Qwen3-32B-TEE
Catalog checked
Jun 25, 2026

60-second setup

01

Store the key

Put CHUTES_API_KEY in the Hermes environment or secret store, not in checked-in YAML.
02

Add the provider

Use base_url https://llm.chutes.ai/v1 with chat_completions transport.
03

Pick a model path

Use a live model ID first. Add default:latency only after a Model Routing pool exists.

Toolkit configs and routing caveat

The toolkit config shows saved aliases. Treat them as post-setup examples: the zero setup path is a concrete model ID from the live catalog.

Inline pool from current catalog
yaml
# Live-picked inline pool, no saved alias required.
default_model: "unsloth/Mistral-Nemo-Instruct-2407-TEE,google/gemma-4-31B-turbo-TEE,Qwen/Qwen3-32B-TEE:latency"
Source-copied Hermes config
yaml
# Source: chutes-agent-toolkit/other-agents/hermes/config-examples/chutes-basic.yaml
# ~/.hermes/config.yaml
# Chutes as a named Hermes provider entry.
# Put CHUTES_API_KEY=*** in ~/.hermes/.env, not in this file.

providers:
  chutes:
    name: Chutes
    base_url: https://llm.chutes.ai/v1
    key_env: CHUTES_API_KEY
    transport: chat_completions
    default_model: default:latency
    discover_models: true
    models:
      default: {}
      "default:latency": {}
      "default:throughput": {}

model:
  provider: custom:chutes
  default: default:latency

Hermes prompt lanes

Prompt
Use Chutes for this chat and show the chosen model.
Hermes uses
The chutes provider entry plus live model discovery.
Prompt
Pick a cheap TEE-backed pool for agentic work.
Hermes uses
The same picker rules as the toolkit, then an inline routing string.

Cheap routing example

This source example includes routing aliases. Keep the dashboard caveat visible anywhere it is rendered.

Open source
Toolkit cheap routing config
yaml
# Source: chutes-agent-toolkit/other-agents/hermes/config-examples/chutes-cheap-routing.yaml
# ~/.hermes/config.yaml
# Keep your primary model/provider elsewhere, and let Hermes route cheap/simple work to Chutes.

providers:
  chutes:
    name: Chutes
    base_url: https://llm.chutes.ai/v1
    key_env: CHUTES_API_KEY
    transport: chat_completions
    default_model: default:latency
    discover_models: true
    models:
      default: {}
      "default:latency": {}
      "default:throughput": {}

smart_model_routing:
  enabled: true
  cheap_model:
    provider: custom:chutes
    model: default:latency

Troubleshooting

Symptom
Hermes cannot resolve default:latency
Likely cause
The account has no saved default routing pool.
Fix
Use a concrete model ID or configure Model Routing once in the dashboard.
Symptom
Completions hit anonymous rate limits
Likely cause
The key is missing or sent as X-API-Key.
Fix
Set CHUTES_API_KEY and send Bearer auth through the provider.
Symptom
Feature-specific prompt fails
Likely cause
The selected model may not expose that feature.
Fix
Read supported_features from /v1/models before selecting the model.