Hermes
Run Hermes through Chutes.
Hermes can call Chutes as a named OpenAI-compatible provider. Keep
CHUTES_API_KEY in the environment, discover models live, and use routing aliases only after the dashboard pool exists.Quick config
The minimum to point this client at Chutes. Every value below is rendered from the live catalog or is a stable endpoint fact — copy it and go.
Zero-setup Hermes config
yaml
providers:
chutes:
name: Chutes
base_url: https://llm.chutes.ai/v1
key_env: CHUTES_API_KEY
transport: chat_completions
discover_models: true
models:
"Qwen/Qwen3-32B-TEE": {}
model:
provider: custom:chutes
default: "Qwen/Qwen3-32B-TEE"Provider URL
https://llm.chutes.ai/v1
Auth env
CHUTES_API_KEY=cpk_...
Live model example
Qwen/Qwen3-32B-TEE
Catalog checked
Jun 25, 2026
60-second setup
01
Store the key
Put CHUTES_API_KEY in the Hermes environment or secret store, not in checked-in YAML.
02
Add the provider
Use base_url https://llm.chutes.ai/v1 with chat_completions transport.
03
Pick a model path
Use a live model ID first. Add default:latency only after a Model Routing pool exists.
Toolkit configs and routing caveat
The toolkit config shows saved aliases. Treat them as post-setup examples: the zero setup path is a concrete model ID from the live catalog.
Inline pool from current catalog
yaml
# Live-picked inline pool, no saved alias required.
default_model: "unsloth/Mistral-Nemo-Instruct-2407-TEE,google/gemma-4-31B-turbo-TEE,Qwen/Qwen3-32B-TEE:latency"Source-copied Hermes config
yaml
# Source: chutes-agent-toolkit/other-agents/hermes/config-examples/chutes-basic.yaml
# ~/.hermes/config.yaml
# Chutes as a named Hermes provider entry.
# Put CHUTES_API_KEY=*** in ~/.hermes/.env, not in this file.
providers:
chutes:
name: Chutes
base_url: https://llm.chutes.ai/v1
key_env: CHUTES_API_KEY
transport: chat_completions
default_model: default:latency
discover_models: true
models:
default: {}
"default:latency": {}
"default:throughput": {}
model:
provider: custom:chutes
default: default:latency
Hermes prompt lanes
Prompt
Use Chutes for this chat and show the chosen model.
Hermes uses
The chutes provider entry plus live model discovery.
Prompt
Pick a cheap TEE-backed pool for agentic work.
Hermes uses
The same picker rules as the toolkit, then an inline routing string.
Cheap routing example
This source example includes routing aliases. Keep the dashboard caveat visible anywhere it is rendered.
Open sourceToolkit cheap routing config
yaml
# Source: chutes-agent-toolkit/other-agents/hermes/config-examples/chutes-cheap-routing.yaml
# ~/.hermes/config.yaml
# Keep your primary model/provider elsewhere, and let Hermes route cheap/simple work to Chutes.
providers:
chutes:
name: Chutes
base_url: https://llm.chutes.ai/v1
key_env: CHUTES_API_KEY
transport: chat_completions
default_model: default:latency
discover_models: true
models:
default: {}
"default:latency": {}
"default:throughput": {}
smart_model_routing:
enabled: true
cheap_model:
provider: custom:chutes
model: default:latency
Troubleshooting
Symptom
Hermes cannot resolve default:latency
Likely cause
The account has no saved default routing pool.
Fix
Use a concrete model ID or configure Model Routing once in the dashboard.
Symptom
Completions hit anonymous rate limits
Likely cause
The key is missing or sent as X-API-Key.
Fix
Set CHUTES_API_KEY and send Bearer auth through the provider.
Symptom
Feature-specific prompt fails
Likely cause
The selected model may not expose that feature.
Fix
Read supported_features from /v1/models before selecting the model.