Where do model IDs and prices come from?

Render them from GET https://llm.chutes.ai/v1/models at request time; hard-coded examples can go stale.
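A minimal sketch of reading IDs from the live catalog follows. It assumes the response is an OpenAI-style list object with a top-level `data` array whose entries carry an `id` field; this page does not confirm that shape. `fetch_models` and `model_ids` are hypothetical helper names. The page also does not say whether the endpoint requires authentication, so the sketch sends an Authorization header only when a key is supplied.

```python
import json
import urllib.request

MODELS_URL = "https://llm.chutes.ai/v1/models"


def model_ids(payload):
    """Return model IDs from an already-decoded catalog payload.

    Assumes an OpenAI-style shape: {"data": [{"id": "..."}, ...]}.
    Entries without an "id" key are skipped.
    """
    return [m["id"] for m in payload.get("data", []) if "id" in m]


def fetch_models(api_key=None, url=MODELS_URL, timeout=30):
    """GET the live catalog and decode the body as JSON."""
    request = urllib.request.Request(url)
    if api_key:
        # Assumption: bearer-token auth, as with other OpenAI-compatible APIs.
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `model_ids(fetch_models())` at render time keeps the displayed IDs in step with the catalog instead of relying on copied examples.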

Can default:latency be used without setup?

No. Saved default aliases such as default:latency require a Model Routing pool configured in the dashboard. Inline pools and concrete live model IDs need no setup.
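The distinction above can be sketched as a small check that flags model strings needing dashboard setup. The rules are inferred only from the examples on this page (aliases start with `default`, inline pools are comma-separated); `classify_model_ref` and `needs_dashboard_setup` are hypothetical names, not part of Hermes or Chutes.

```python
def classify_model_ref(ref):
    """Classify a model string as 'saved-alias', 'inline-pool', or 'concrete'.

    Heuristic inferred from this page's examples only:
    - "default" or "default:<strategy>" is a saved alias
    - a comma-separated list of IDs is an inline pool
    - anything else is treated as a concrete model ID
    """
    ref = ref.strip()
    if ref == "default" or ref.startswith("default:"):
        return "saved-alias"
    if "," in ref:
        return "inline-pool"
    return "concrete"


def needs_dashboard_setup(ref):
    """True when the string relies on a saved Model Routing pool."""
    return classify_model_ref(ref) == "saved-alias"
```

A single-model string with a strategy suffix would be classified as concrete by this heuristic, so treat it as a pre-flight lint for config files rather than a full parser.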

Source draft: Connect quickstarts

Catalog checked: Jun 25, 2026

Hermes + Chutes recipes

Provider setup, delegation, MCP tools, low-risk routing, privacy checks, and live model selection for Hermes.

Make Chutes the active Hermes backend

Source-copied config. Replace default:* with a concrete live model ID until a dashboard routing pool exists.

yaml

# Source: chutes-agent-toolkit/other-agents/hermes/config-examples/chutes-basic.yaml
# ~/.hermes/config.yaml
# Chutes as a named Hermes provider entry.
# Put CHUTES_API_KEY=*** in ~/.hermes/.env, not in this file.

providers:
  chutes:
    name: Chutes
    base_url: https://llm.chutes.ai/v1
    key_env: CHUTES_API_KEY
    transport: chat_completions
    default_model: default:latency
    discover_models: true
    models:
      default: {}
      "default:latency": {}
      "default:throughput": {}

model:
  provider: custom:chutes
  default: default:latency

Use Chutes as a delegation lane

Send summarization, build-log analysis, and lower-risk subtasks to Chutes.

yaml

# Source: chutes-agent-toolkit/other-agents/hermes/config-examples/chutes-delegation.yaml
# ~/.hermes/config.yaml
# Use your normal primary model/provider for orchestration, and Chutes for delegated/background subtasks.

providers:
  chutes:
    name: Chutes
    base_url: https://llm.chutes.ai/v1
    key_env: CHUTES_API_KEY
    transport: chat_completions
    default_model: default:throughput
    discover_models: true
    models:
      default: {}
      "default:latency": {}
      "default:throughput": {}

delegation:
  provider: custom:chutes
  model: default:throughput
  reasoning_effort: medium

Add Chutes tools through MCP

Expose model, quota, usage, alias, chute, and API-key read tools to Hermes.

bash

uv tool install chutes-mcp-server \
  --from plugins/chutes-ai/skills/chutes-mcp-portability/mcp-server

hermes mcp add chutes --command chutes-mcp-server --env "CHUTES_API_KEY=${CHUTES_API_KEY}"
hermes mcp test chutes
hermes mcp list

Pick a model from live metadata

Use /v1/models to read each model's features, modalities, context window, pricing, and confidential_compute field.

bash

curl https://llm.chutes.ai/v1/models

# Current page example from the live server render:
CHUTES_MODEL="Qwen/Qwen3-32B-TEE"
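Selecting from live metadata rather than a copied ID might look like the sketch below. It filters decoded catalog entries on `confidential_compute`, which this page names, and on a context-window size. The key `context_length` and the helper `pick_models` are hypothetical; check the actual response for the real field names before relying on them.

```python
def pick_models(models, require_confidential=True, min_context=0):
    """Return IDs of catalog entries that satisfy the given constraints.

    `models` is a list of dicts decoded from the catalog response.
    Assumptions: "confidential_compute" is truthy for confidential models,
    and "context_length" (hypothetical key) holds the context window size.
    """
    picked = []
    for model in models:
        if require_confidential and not model.get("confidential_compute"):
            continue
        # Treat a missing or null context size as 0 so it fails a positive minimum.
        if (model.get("context_length") or 0) < min_context:
            continue
        picked.append(model["id"])
    return picked
```

Filtering at selection time means a model that loses a feature or leaves the catalog drops out of the candidate list instead of failing at request time.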

Which lane?

The routing pool below is computed from the live catalog for agentic work. Use it inline for zero setup, or configure a saved pool before using default:* aliases.

Concrete model

Qwen/Qwen3-32B-TEE

Use when

You need a specific context window, feature set, modality, or price.

Inline pool

Qwen/Qwen3-32B-TEE,google/gemma-4-31B-turbo-TEE,MiniMaxAI/MiniMax-M2.5-TEE:latency

Use when

You want failover or latency selection without saved alias setup.
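An inline pool string can be assembled from a list of live model IDs, as in this minimal sketch. The format (comma-separated IDs followed by `:latency`) is inferred from the single example on this page; `build_inline_pool` is a hypothetical helper, and strategies other than `latency` are an assumption for inline pools.

```python
def build_inline_pool(model_ids, strategy="latency"):
    """Join model IDs into an inline pool string: "id1,id2,id3:strategy".

    Blank entries are dropped; an empty pool is rejected.
    """
    ids = [model_id.strip() for model_id in model_ids if model_id.strip()]
    if not ids:
        raise ValueError("an inline pool needs at least one model ID")
    return ",".join(ids) + ":" + strategy


pool = build_inline_pool(
    [
        "Qwen/Qwen3-32B-TEE",
        "google/gemma-4-31B-turbo-TEE",
        "MiniMaxAI/MiniMax-M2.5-TEE",
    ]
)
```

Here `pool` reproduces the inline pool shown above, so the same helper can rebuild the string whenever the live catalog changes.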