Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience

Announcing TEE: Trusted Execution Environments - Now Publicly Available

NEW Chutes Search - AI-Powered Search is Live! NEW fictio - Create your own experience
Pricing

Per-token rates and TEE GPU deployments, priced in the open.

No middlemen. No markup tiers. Pay for the tokens you use, or deploy your own chute on a confidential GPU and pay by the second.

chutes ~ pricing live
Public inference · pay per 1M tokens
$ curl https://llm.chutes.ai/v1/chat/completions \
-H "Authorization: Bearer $CHUTES_KEY" \
-d '{"model":"google/gemma-4-31B-turbo-TEE", ...}'
12,847 tokens · $0.0023 charged · 79 live instances
Private deployment · pay per second
$ chutes deploy my_chute:chute --accept-fee
GPU rates from $1.80/hr · deploy fee starts at $5.40

Model pricing

Pay per token. No subscription, no minimum, no markup. All featured models run on confidential TEE compute.

Estimate your cost

Pick a workload. Prices update below.

Pay from a Bittensor wallet in TAO

Mistral-Nemo-Instruct-2407
unsloth/Mistral-Nemo-Instruct-2407-TEE
Input
$0.0245
per 1M tokens
Output
$0.0978
per 1M tokens
Context
-
Est. cost
$0.0031
for your workload
Qwen2.5-Coder-32B-Instruct
Qwen/Qwen2.5-Coder-32B-Instruct-TEE
Input
$0.0245
per 1M tokens
Output
$0.0978
per 1M tokens
Context
-
Est. cost
$0.0031
for your workload
Qwen3-32B
Qwen/Qwen3-32B-TEE
Input
$0.08
per 1M tokens
Output
$0.24
per 1M tokens
Context
41K
Est. cost
$0.0093
for your workload
gemma-4-31B-turbo
google/gemma-4-31B-turbo-TEE
Input
$0.13
per 1M tokens
Output
$0.38
per 1M tokens
Context
131K
Est. cost
$0.015
for your workload
MiniMax M2.5
MiniMaxAI/MiniMax-M2.5-TEE
Input
$0.15
per 1M tokens
Output
$1.20
per 1M tokens
Context
197K
Est. cost
$0.0264
for your workload
DeepSeek-V3.2
deepseek-ai/DeepSeek-V3.2-TEE
Input
$0.28
per 1M tokens
Output
$0.42
per 1M tokens
Context
131K
Est. cost
$0.0274
for your workload
Kimi K2.5
moonshotai/Kimi-K2.5-TEE
Input
$0.44
per 1M tokens
Output
$2.00
per 1M tokens
Context
262K
Est. cost
$0.0592
for your workload
Qwen 3.5
Qwen/Qwen3.5-397B-A17B-TEE
Input
$0.39
per 1M tokens
Output
$2.34
per 1M tokens
Context
262K
Est. cost
$0.0593
for your workload
Qwen3.6-27B
Qwen/Qwen3.6-27B-TEE
Input
$0.50
per 1M tokens
Output
$2.00
per 1M tokens
Context
262K
Est. cost
$0.064
for your workload
Kimi K2.6
moonshotai/Kimi-K2.6-TEE
Input
$0.74
per 1M tokens
Output
$3.50
per 1M tokens
Context
262K
Est. cost
$0.1012
for your workload
GLM-5
zai-org/GLM-5-TEE
Input
$0.95
per 1M tokens
Output
$2.55
per 1M tokens
Context
203K
Est. cost
$0.1066
for your workload
GLM-5.1
zai-org/GLM-5.1-TEE
Input
$1.05
per 1M tokens
Output
$3.50
per 1M tokens
Context
203K
Est. cost
$0.126
for your workload

Looking for something else? Browse the full model catalog in the app.

Browse all models
Private Chutes

Deploy your own private chute

Run your own dedicated AI workload on verified self-serve confidential GPU capacity. Your code, your weights, your data, isolated end-to-end. Deployed in minutes from the chutes CLI. Use this when you need a private model, a custom fine-tune, or a dedicated instance for production traffic.

How it works

  • Build your image, define your chute

    Use the CLI to build a container, declare your NodeSelector with the GPU class, VRAM, and count your workload needs, then expose your endpoints via cords.

  • Deploy with one command

    chutes deploy registers your chute, pays the one-time fee, and brings it online as private by default.

  • Pay only while it runs

    Billed by the second at the GPU's hourly rate. Idle instances shut down automatically, so no GPU-seconds are wasted.

terminal
deploy docs
# In your chute definition:
# choose the self-serve private GPU class
# tee=True, node_selector=NodeSelector(gpu_count=1, include=["pro_6000"])

chutes build my_chute:chute --wait
chutes deploy my_chute:chute --accept-fee
Pricing

Private Chutes are billed at the GPU's hourly rate for however long the instance runs, plus a one-time deployment fee equal to 3× the hourly rate at the time of deployment. No subscription required.

Self-serve TEE GPU
1 verified class available
rate
RTX Pro 6000
96 GB · Blackwell · efficient private deployments
$1.80 / hour
One-time deployment fee
$5.40 = 3× hourly rate

GPUs shown here must have live pricing and TEE measurement support. The deployment fee is paid once when you run chutes deploy; after that, you pay only the per-second hourly rate while the instance is up. Actual placement follows your NodeSelector and available capacity.

Optional monthly plans

Prefer a monthly bill?

Plus and Pro bundle a daily request quota with a discount on per-token rates, in one predictable monthly payment. Pick a plan if you'd rather budget a fixed amount each month than top up your balance as you go.

$10 / mo
Plus
  • Bundled daily quota
  • 6% off PAYG rates beyond the quota
Get Plus
$20 / mo
Pro
  • Larger daily quota
  • 10% off PAYG rates beyond the quota
Get Pro
Enterprise Volume discounts, custom rate limits, dedicated support.
Talk to sales

Frequently Asked Questions

Have questions about our pricing? We've got answers.

Still have questions?