NodeSelector API Reference

The NodeSelector class specifies hardware requirements for Chutes deployments. This reference covers all configuration options, the available GPU types, and best practices for right-sizing resource allocation.

Class Definition

from chutes.chute import NodeSelector

NodeSelector(
    gpu_count: int = 1,
    min_vram_gb_per_gpu: int = 16,
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None
)

Parameters

gpu_count

Number of GPUs required for the deployment.

Constraints: 1-8 GPUs

Examples:

# Single GPU (default)
node_selector = NodeSelector(gpu_count=1)

# Multiple GPUs for large models
node_selector = NodeSelector(gpu_count=4)

# Maximum supported GPUs
node_selector = NodeSelector(gpu_count=8)

Use Cases:

| GPU Count | Use Case |
|---|---|
| 1 | Standard AI models (BERT, GPT-2, 7B LLMs) |
| 2-4 | Larger language models (13B-30B parameters) |
| 4-8 | Very large models (70B+ parameters) |

min_vram_gb_per_gpu

Minimum VRAM (video RAM) required per GPU, in gigabytes.

Constraints: 16-140 GB

Examples:

# Default minimum (suitable for most models)
node_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=16
)

# Medium models requiring more VRAM
node_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24
)

# Large models
node_selector = NodeSelector(
    gpu_count=2,
    min_vram_gb_per_gpu=48
)

# Ultra-large models (H100 80GB required)
node_selector = NodeSelector(
    gpu_count=4,
    min_vram_gb_per_gpu=80
)

VRAM Requirements by Model Size:

| Model Size | Min VRAM | Example Models |
|---|---|---|
| 1-3B params | 16GB | DistilBERT, GPT-2 |
| 7B params | 24GB | Llama-2-7B, Mistral-7B |
| 13B params | 32-40GB | Llama-2-13B |
| 30B params | 48GB | CodeLlama-34B |
| 70B+ params | 80GB+ | Llama-2-70B, DeepSeek-R1 |
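
As a rough rule of thumb, fp16/bf16 inference needs about 2 bytes of VRAM per parameter, plus headroom for activations and the KV cache. The helper below is a minimal sketch of that estimate; the function name and the 1.25 overhead factor are illustrative assumptions, not part of the Chutes SDK.

def estimate_min_vram_gb(params_billions: float,
                         bytes_per_param: float = 2.0,
                         overhead: float = 1.25) -> float:
    """Rough fp16 inference VRAM estimate (illustrative, not an SDK API)."""
    weights_gb = params_billions * bytes_per_param  # 1B params at fp16 ~ 2 GB
    return weights_gb * overhead  # headroom for activations and KV cache

# A 7B model: 7 * 2 * 1.25 = 17.5 GB, so a 24 GB GPU fits comfortably,
# matching the table above.
print(estimate_min_vram_gb(7))  # 17.5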

include

List of GPU types to include in selection. Only these GPU types will be considered.

Examples:

# Only high-end GPUs
node_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,
    include=["a100", "h100"]
)

# Cost-effective options
node_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=48,
    include=["l40", "a6000"]
)

# H100 only for maximum performance
node_selector = NodeSelector(
    gpu_count=2,
    min_vram_gb_per_gpu=80,
    include=["h100"]
)

exclude

List of GPU types to exclude from selection.

Examples:

# Avoid older GPUs
node_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,
    exclude=["t4"]
)

# Cost optimization - exclude expensive GPUs
node_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,
    exclude=["h100", "a100-80gb"]
)

Available GPU Types

High-Performance GPUs

| GPU | VRAM | Notes |
|---|---|---|
| h100 | 80GB | Latest Hopper architecture, best performance |
| h200 | 141GB | Hopper with HBM3e, maximum memory |
| a100-80gb | 80GB | Ampere, excellent for training/inference |
| a100 | 40GB | Ampere, high performance tier |

Professional GPUs

| GPU | VRAM | Notes |
|---|---|---|
| l40 | 48GB | Ada Lovelace, good balance of cost/performance |
| a6000 | 48GB | Professional-grade, good for development |
| a5000 | 24GB | Professional-grade, medium workloads |
| a4000 | 16GB | Entry professional GPU |

Consumer/Entry GPUs

| GPU | VRAM | Notes |
|---|---|---|
| rtx4090 | 24GB | Consumer, cost-effective |
| rtx3090 | 24GB | Previous-generation consumer |
| a10 | 24GB | Good for smaller models |
| t4 | 16GB | Entry-level, inference-focused |

AMD GPUs

| GPU | VRAM | Notes |
|---|---|---|
| mi300x | 192GB | AMD Instinct, very high memory |
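
If you build include lists programmatically, a small lookup of GPU names and per-GPU VRAM (mirroring the tables above) can help. GPU_VRAM_GB and gpus_with_at_least below are hypothetical helpers, not part of the SDK.

from chutes.chute import NodeSelector

# Hypothetical lookup mirroring the tables above (not an SDK API)
GPU_VRAM_GB = {
    "h200": 141, "h100": 80, "a100-80gb": 80, "a100": 40,
    "l40": 48, "a6000": 48, "a5000": 24, "a4000": 16,
    "rtx4090": 24, "a10": 24, "t4": 16, "mi300x": 192,
}

def gpus_with_at_least(vram_gb: int) -> list:
    """Names of GPU types meeting a minimum per-GPU VRAM requirement."""
    return [gpu for gpu, vram in GPU_VRAM_GB.items() if vram >= vram_gb]

node_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=48,
    # ["h200", "h100", "a100-80gb", "l40", "a6000", "mi300x"]
    include=gpus_with_at_least(48),
)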

Common Selection Patterns

Cost-Optimized

# Small models - minimize cost
budget_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=16,
    include=["t4", "a4000", "a10"]
)

# Medium models - balance cost/performance
balanced_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,
    include=["l40", "a5000", "rtx4090"],
    exclude=["h100", "a100-80gb"]
)

Performance-Optimized

# Maximum performance
performance_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=80,
    include=["h100", "a100-80gb"]
)

# High throughput serving
throughput_selector = NodeSelector(
    gpu_count=4,
    min_vram_gb_per_gpu=48,
    include=["l40", "a100"]
)

Model-Specific

# 7B parameter models (e.g., Mistral-7B, Llama-2-7B)
llm_7b_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,
    include=["l40", "a5000", "rtx4090"]
)

# 13B parameter models
llm_13b_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=40,
    include=["l40", "a100", "a6000"]
)

# 70B parameter models
llm_70b_selector = NodeSelector(
    gpu_count=2,
    min_vram_gb_per_gpu=80,
    include=["h100", "a100-80gb"]
)

# DeepSeek-R1 (671B parameters)
deepseek_selector = NodeSelector(
    gpu_count=8,
    min_vram_gb_per_gpu=141,
    include=["h200"]
)

Image Generation

# Stable Diffusion / SDXL
diffusion_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,
    include=["l40", "a5000", "rtx4090"]
)

# FLUX models
flux_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=48,
    include=["l40", "a6000", "a100"]
)

Integration Examples

With Chute Definition

from chutes.chute import Chute, NodeSelector
from chutes.image import Image

node_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,
    include=["l40", "a100"]
)

chute = Chute(
    username="myuser",
    name="my-model-server",
    image=Image(username="myuser", name="my-image", tag="1.0"),
    node_selector=node_selector
)

With Templates

from chutes.chute.template import build_vllm_chute
from chutes.chute import NodeSelector

chute = build_vllm_chute(
    username="myuser",
    model_name="meta-llama/Llama-2-7b-chat-hf",
    node_selector=NodeSelector(
        gpu_count=1,
        min_vram_gb_per_gpu=24,
        include=["l40", "a5000"]
    )
)

Dynamic Selection Based on Model

def get_node_selector(model_size: str) -> NodeSelector:
    """Get appropriate NodeSelector based on model size."""
    
    configs = {
        "small": {  # < 3B parameters
            "gpu_count": 1,
            "min_vram_gb_per_gpu": 16
        },
        "medium": {  # 7-13B parameters
            "gpu_count": 1,
            "min_vram_gb_per_gpu": 32,
            "exclude": ["t4"]
        },
        "large": {  # 30-70B parameters
            "gpu_count": 2,
            "min_vram_gb_per_gpu": 48,
            "include": ["a100", "l40", "h100"]
        },
        "xlarge": {  # 70B+ parameters
            "gpu_count": 4,
            "min_vram_gb_per_gpu": 80,
            "include": ["h100", "a100-80gb"]
        }
    }
    
    return NodeSelector(**configs.get(model_size, configs["medium"]))
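
Usage, including the fallback for unrecognized sizes:

# 30-70B model: 2 GPUs with 48GB+ each, restricted to A100/L40/H100
selector = get_node_selector("large")

# An unknown key falls back to the "medium" configuration
default_selector = get_node_selector("huge")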

Common Issues and Solutions

"No available nodes match your requirements"

Solution 1: Broaden your requirements

# Too restrictive
strict_selector = NodeSelector(
    gpu_count=8,
    min_vram_gb_per_gpu=80,
    include=["h100"]
)

# More flexible
flexible_selector = NodeSelector(
    gpu_count=4,
    min_vram_gb_per_gpu=48,
    include=["h100", "a100", "l40"]
)

Solution 2: Reduce GPU count

# Try multiple smaller GPUs
multi_gpu = NodeSelector(
    gpu_count=2,
    min_vram_gb_per_gpu=40
)
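
Solution 3: Try selectors in order of preference

If a strict selector still fails to match, one pattern is a ladder of progressively looser selectors, tried in order until one schedules. This is a sketch of the pattern only; how you retry a deployment depends on your workflow.

# Ordered from most to least restrictive; try each until one schedules
fallback_selectors = [
    NodeSelector(gpu_count=4, min_vram_gb_per_gpu=80, include=["h100"]),
    NodeSelector(gpu_count=4, min_vram_gb_per_gpu=48,
                 include=["h100", "a100", "l40"]),
    NodeSelector(gpu_count=2, min_vram_gb_per_gpu=40),  # any qualifying GPU
]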

"Out of memory" errors

Increase VRAM requirements:

# Increase min_vram_gb_per_gpu
higher_vram = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=48  # Increased from 24
)

Best Practices

1. Right-Size Your Requirements

Don't over-provision - it wastes resources and costs more:

# Bad - wastes resources for a 7B model
oversized = NodeSelector(
    gpu_count=8,
    min_vram_gb_per_gpu=80
)

# Good - matches actual needs
rightsized = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24
)

2. Use Include/Exclude Wisely

# Be specific when you have known requirements
specific_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=48,
    include=["l40", "a6000"]  # Known compatible GPUs
)

# Exclude known incompatible GPUs
compatible_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=24,
    exclude=["t4"]  # Known to be too slow for your use case
)

3. Development vs Production

# Development - prioritize cost
dev_selector = NodeSelector(
    gpu_count=1,
    min_vram_gb_per_gpu=16,
    include=["t4", "a4000"]
)

# Production - prioritize performance
prod_selector = NodeSelector(
    gpu_count=2,
    min_vram_gb_per_gpu=48,
    include=["l40", "a100"],
    exclude=["t4", "a4000"]
)

Summary

The NodeSelector provides control over GPU hardware selection with four parameters:

| Parameter | Default | Range | Description |
|---|---|---|---|
| gpu_count | 1 | 1-8 | Number of GPUs |
| min_vram_gb_per_gpu | 16 | 16-140 | Minimum VRAM per GPU (GB) |
| include | None | List[str] | Whitelist of GPU types |
| exclude | None | List[str] | Blacklist of GPU types |

Start with minimum requirements and adjust based on performance needs and availability.

See Also