Deploying Chutes
The chutes deploy command takes your built images and deploys them as live, scalable AI applications on the Chutes platform.
Basic Deploy Command
Deploy a chute to the platform.
chutes deploy <chute_ref> [OPTIONS]
Arguments:
- <chute_ref>: Chute reference in the format module_name:chute_variable (for example, my_chute:chute)
Options:
- --config-path: Path to a custom config file
- --logo: Path to a logo image for the chute
- --debug: Enable debug logging
- --public: Mark the chute as public/available to anyone
- --accept-fee: Acknowledge and accept the deployment fee
Deployment Examples
Basic Deployment
# Deploy with fee acknowledgment
chutes deploy my_chute:chute --accept-fee
What happens:
- ✅ Validates image exists and is built
- ✅ Creates deployment configuration
- ✅ Registers chute with the platform
- ✅ Returns chute ID and version
Production Deployment
# Deploy with logo
chutes deploy my_chute:chute \
  --logo ./assets/logo.png \
  --accept-fee
Private vs Public Deployments
# Private deployment (default) - only you can access
chutes deploy my_chute:chute --accept-fee
# Public deployment (requires special permissions)
chutes deploy my_chute:chute --public --accept-fee
Note: Public chutes require special permissions. If you need to share your chute with specific users, use the chutes share command instead.
Deployment Process
Deployment Stages
# Example deployment output
Deploying chute: my_chute:chute
You are about to upload my_chute.py and deploy my-chute, confirm? (y/n) y
Successfully deployed chute my-chute chute_id=abc123 version=1
What Gets Deployed
When you deploy, the following is sent to the platform:
- Chute Configuration: Name, readme, tagline
- Node Selector: GPU requirements
- Cords: API endpoints your chute exposes
- Code Reference: Your chute's Python code
- Image Reference: The built image to use
Deployment Fees
Deployment incurs a one-time fee based on your NodeSelector configuration:
# Deploy and acknowledge the fee
chutes deploy my_chute:chute --accept-fee
If you don't include --accept-fee, you may receive a 402 error indicating that the deployment fee needs to be acknowledged.
Fee Structure
Deployment fees are calculated based on:
- GPU Type: Higher-end GPUs cost more
- GPU Count: More GPUs = higher fee
- VRAM Requirements: Higher VRAM requirements cost more
Example fee calculation:
- Single RTX 3090 at $0.12/hr = $0.36 deployment fee
- Multiple GPUs or premium GPUs will have higher fees
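The exact price is set by the platform, but the example above implies the fee is roughly three times the hourly rate for the selected hardware. A rough back-of-the-envelope sketch in Python (the three-hour multiplier and the rate value are assumptions inferred from the example above, not the platform's actual pricing table):
# Rough deployment fee estimate (illustrative only).
# Assumption: fee ~= 3x the hourly GPU rate, inferred from the RTX 3090
# example above ($0.12/hr -> $0.36); actual pricing is set by the platform.
HOURLY_RATES = {"rtx3090": 0.12}  # $/hr, illustrative value from the example

def estimate_deployment_fee(gpu_type: str, gpu_count: int = 1, multiplier: float = 3.0) -> float:
    # Back-of-the-envelope estimate: hourly rate x GPU count x assumed multiplier.
    return HOURLY_RATES[gpu_type] * gpu_count * multiplier

print(estimate_deployment_fee("rtx3090"))  # 0.36, matching the example above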
Pre-Deployment Checklist
Before deploying, ensure:
1. Image is Built and Ready
# Check image status
chutes images list --name my-image
chutes images get my-image
# Should show status: "built and pushed"
2. Chute Configuration is Correct
# Verify your chute definition
from chutes.chute import Chute, NodeSelector

chute = Chute(
    username="myuser",
    name="my-chute",
    tagline="My awesome AI chute",
    readme="## My Chute\n\nDescription here...",
    image=my_image,
    concurrency=4,
    node_selector=NodeSelector(
        gpu_count=1,
        min_vram_gb_per_gpu=16,
    ),
)
3. Cords are Defined
@chute.cord()
async def my_function(self, input_data: str) -> str:
    return f"Processed: {input_data}"

@chute.cord(
    public_api_path="/generate",
    public_api_method="POST",
)
async def generate(self, prompt: str) -> str:
    # Your logic here
    return result
Chute Configuration Options
NodeSelector
Control which GPUs your chute runs on:
from chutes.chute import NodeSelector
node_selector = NodeSelector(
    gpu_count=1,             # Number of GPUs (1-8)
    min_vram_gb_per_gpu=16,  # Minimum VRAM per GPU (16-80)
    include=["rtx4090"],     # Only use these GPU types
    exclude=["rtx3090"],     # Don't use these GPU types
)
Concurrency
Set how many concurrent requests your chute can handle:
chute = Chute(
    ...
    concurrency=4,  # Handle 4 concurrent requests per instance
)
Auto-Scaling
Configure automatic scaling behavior:
chute = Chute(
    ...
    max_instances=10,            # Maximum number of instances
    scaling_threshold=0.8,       # Scale-up threshold
    shutdown_after_seconds=300,  # Shut down idle instances after 5 minutes
)
Network Egress
Control external network access:
chute = Chute(
    ...
    allow_external_egress=True,  # Allow external network access
)
Note: By default, allow_external_egress is True for custom chutes but False for vllm/sglang templates. Set it to True if your chute needs to fetch external resources (e.g., image URLs for vision models).
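For example, a vision chute that downloads an image from a URL inside a cord only works with egress enabled. A minimal sketch, assuming my_image is an already-built image as in the earlier examples and that aiohttp is installed in it; the chute name, endpoint path, and fetch logic are illustrative, not part of any template:
import aiohttp
from chutes.chute import Chute, NodeSelector

chute = Chute(
    username="myuser",
    name="image-captioner",                  # illustrative name
    image=my_image,                           # assumes an image built as in the earlier examples
    node_selector=NodeSelector(gpu_count=1, min_vram_gb_per_gpu=24),
    allow_external_egress=True,               # required for the external fetch below
)

@chute.cord(public_api_path="/caption", public_api_method="POST")
async def caption(self, image_url: str) -> str:
    # Fetch the image from an external URL; this fails if egress is disabled.
    async with aiohttp.ClientSession() as session:
        async with session.get(image_url) as resp:
            image_bytes = await resp.read()
    # ... run your model on image_bytes (model call omitted in this sketch)
    return f"received {len(image_bytes)} bytes"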
Sharing Chutes
After deployment, you can share your chute with other users:
# Share with another user
chutes share --chute-id my-chute --user-id colleague
# Remove sharing
chutes share --chute-id my-chute --user-id colleague --remove
Billing When Sharing
When you share a chute:
- You (chute owner) pay the hourly rate while instances are running
- The user you shared with pays the standard usage rate (per token, per step, etc.)
Troubleshooting Deployments
Common Deployment Issues
"Image is not available to be used (yet)!"
# Image hasn't finished building - check status
chutes images get my-image
# Wait for status: "built and pushed"
"Unable to create public chutes from non-public images"
# If deploying public chute, image must also be public
# Rebuild image with --public flag
chutes build my_chute:chute --public --wait
402 Payment Required
# Include --accept-fee flag
chutes deploy my_chute:chute --accept-fee
409 Conflict
# Chute with this name already exists
# Delete existing chute first
chutes chutes delete my-chute
# Or use a different name in your chute definition
Debug Commands
# Enable debug logging
chutes deploy my_chute:chute --debug --accept-fee
# Check existing chutes
chutes chutes list
chutes chutes get my-chute
# Check image status
chutes images get my-image
CI/CD Integration
GitHub Actions
name: Deploy to Chutes

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install Chutes
        run: pip install chutes

      - name: Configure Chutes
        env:
          CHUTES_CONFIG: ${{ secrets.CHUTES_CONFIG }}
        run: |
          mkdir -p ~/.chutes
          echo "$CHUTES_CONFIG" > ~/.chutes/config.ini

      - name: Build and Deploy
        run: |
          chutes build my_app:chute --wait
          chutes deploy my_app:chute --accept-fee
GitLab CI
deploy:
  stage: deploy
  script:
    - pip install chutes
    - mkdir -p ~/.chutes
    - echo "$CHUTES_CONFIG" > ~/.chutes/config.ini
    - chutes build my_app:chute --wait
    - chutes deploy my_app:chute --accept-fee
  only:
    - main
Production Deployment Checklist
Pre-Deployment
# ✅ Run tests locally
python -m pytest tests/
# ✅ Build image and verify
chutes build my_chute:chute --wait
chutes images get my-chute
# ✅ Test locally if possible
docker run --rm -it -p 8000:8000 my_chute:tag chutes run my_chute:chute --dev
Deployment
# ✅ Deploy with fee acknowledgment
chutes deploy my_chute:chute --accept-fee
# ✅ Note the chute_id and version from output
Post-Deployment
# ✅ Verify deployment
chutes chutes get my-chute
# ✅ Warm up the chute
chutes warmup my-chute
# ✅ Test the endpoint
curl -X POST https://your-chute-url/your-endpoint \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{"input": "test"}'Best Practices
1. Use Meaningful Names
chute = Chute(
    name="sentiment-analyzer-v2",  # Clear, versioned name
    tagline="Analyze sentiment in text using BERT",
    readme="## Sentiment Analyzer\n\n...",
)
2. Set Appropriate Concurrency
# For LLMs with continuous batching (vllm/sglang)
concurrency=64
# For single-request models (diffusion, custom)
concurrency=1
# For models with some parallelism
concurrency=4
3. Configure Shutdown Timer
# For development/testing - short timeout
shutdown_after_seconds=60
# For production - longer timeout to avoid cold starts
shutdown_after_seconds=300
4. Right-Size GPU Requirements
# Match your model's actual requirements
NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,  # For ~13B parameter models
)

# Don't over-provision
NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=80,  # Only if you actually need A100
)
Next Steps
- Managing Resources - Monitor and manage deployments
- Building Images - Optimize your build process
- Account Management - API keys and configuration
- CLI Overview - Return to command overview