Breakthrough Serverless Compute for AI, at Scale.

Powering Trillions of Tokens per Month, Chutes is the leading open-source, decentralized compute provider for deploying, scaling and running open-source models in production.

SOTA Open-Source LLMs, Available here first.

Our team works around the clock to provide the latest SOTA open-source models minutes after release. When a new model is released, Chutes is where you'll always get it. Get access to what's next here first.

View Top LLMs

How to use

Python

import os
import requests

payload = {
    "model": "moonshotai/Kimi-K2.6-TEE",
    "messages": [{"role": "user", "content": "Ship a concise launch checklist."}],
    "stream": True,
}

response = requests.post(
    "https://llm.chutes.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['CHUTES_API_KEY']}",
        "Content-Type": "application/json",
    },
    json=payload,
    stream=True,
)

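# Fail fast on HTTP errors (for example, a missing or invalid API key).
response.raise_for_status()

# The response arrives as a stream of "data: " lines; print the payload of each one.
for line in response.iter_lines():
    if line and line.startswith(b"data: "):
        print(line.decode()[6:])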
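
The loop above prints each raw streamed payload. To print only the generated text, a minimal sketch follows. It assumes the endpoint emits OpenAI-style chunks (a `choices[0].delta.content` field) and ends the stream with a `[DONE]` sentinel; neither detail is confirmed on this page, and `iter_text_deltas` is a hypothetical helper name, not part of any Chutes SDK.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from streamed "data: " lines.

    Assumption: each payload is an OpenAI-style JSON chunk such as
    {"choices": [{"delta": {"content": "..."}}]}, and the stream ends
    with a "data: [DONE]" sentinel.
    """
    for raw in lines:
        # Skip blank keep-alive lines and anything that is not a data line.
        if not raw or not raw.startswith(b"data: "):
            continue
        data = raw[len(b"data: "):].decode("utf-8").strip()
        if data == "[DONE]":
            break
        try:
            chunk = json.loads(data)
        except json.JSONDecodeError:
            continue  # ignore payloads that are not valid JSON
        if not isinstance(chunk, dict):
            continue
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Offline demonstration with hand-written sample lines (not real API output).
sample = [
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    b"",
    b'data: {"choices": [{"delta": {"content": "lo"}}]}',
    b"data: [DONE]",
]
print("".join(iter_text_deltas(sample)))  # prints "Hello"
```

With a live request, the same helper can wrap the stream directly: for text in iter_text_deltas(response.iter_lines()): print(text, end="", flush=True).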

There's a Chute for Everything

Not just the LLMs you'd expect — Chutes runs image, video, speech, music and more. Every open-source modality, always on and ready to scale.

Chutes Serverless Compute

Made for hyperscaling AI-powered products

High-performance AI Inference of top SOTA OSS Models, ephemeral jobs, batch processing jobs, and much more. With Chutes, just bring the code and let us do the rest.

Get Started in seconds
Purpose-built for AI Developers

Designed to be flexible, but fast

Decentralized, Open-source Infrastructure

AI Model Inference

Permanently Hot Models, Stable, Ready for Scale.

TEE/Secure Compute

Secure, Private, and Isolated Compute.

Consumer Apps

Chutes Chat and Chutes Search for Consumers.

Pricing

Choose a plan that fits your needs.

Plus

$10 per month
  • 5X the value of pay-as-you-go
  • 6% off PAYG pricing
  • PAYG requests beyond limit
Best Value

Pro

$20 per month
  • 5X the value of pay-as-you-go
  • 10% off PAYG pricing
  • PAYG requests beyond limit

Enterprise

Contact us

Custom billing only

  • Volume discounts
  • Custom rate limits
  • Dedicated support

Don't want a subscription? With pay-as-you-go you only pay for the LLM tokens you actually consume — no monthly commitment. See per-token model rates, a live cost estimator, and private GPU pricing on the pricing page.

Explore pay-as-you-go pricing
Chutes Explore

Top Models, Always On and Ready to Scale.

Bring your own code. On Chutes, you can run any model without worrying about cold starts or capacity constraints.

AI Compute for Everyone.