Cord Decorator API Reference
The @chute.cord() decorator creates HTTP API endpoints in Chutes applications. Cords are the primary way to expose functionality from your chute. This reference covers all parameters, patterns, and best practices.
Decorator Signature
@chute.cord(
    path: str = None,
    passthrough_path: str = None,
    passthrough: bool = False,
    passthrough_port: int = None,
    public_api_path: str = None,
    public_api_method: str = "POST",
    stream: bool = False,
    provision_timeout: int = 180,
    input_schema: Optional[Any] = None,
    minimal_input_schema: Optional[Any] = None,
    output_content_type: Optional[str] = None,
    output_schema: Optional[Dict] = None,
    **session_kwargs
)
Parameters
public_api_path
The URL path where the endpoint will be accessible via the public API.
Format Rules:
- Must start with /
- Must match a valid URL path pattern
- Can include path parameters with {param} syntax
- Case-sensitive
Examples:
# Simple path
@chute.cord(public_api_path="/generate")
# Path with parameter
@chute.cord(public_api_path="/users/{user_id}")
# Nested resource
@chute.cord(public_api_path="/models/{model_id}/generate")
public_api_method
The HTTP method for the public API endpoint.
Supported Methods:
- GET - Retrieve data
- POST - Create or process data (default)
- PUT - Update existing data
- DELETE - Remove data
- PATCH - Partial updates
Examples:
# GET for data retrieval
@chute.cord(public_api_path="/models", public_api_method="GET")
async def list_models(self):
    return {"models": ["gpt-3.5", "gpt-4"]}

# POST for data processing (default)
@chute.cord(public_api_path="/generate", public_api_method="POST")
async def generate_text(self, prompt: str):
    return await self.model.generate(prompt)

# DELETE for removal
@chute.cord(public_api_path="/cache", public_api_method="DELETE")
async def clear_cache(self):
    self.cache.clear()
    return {"status": "cache cleared"}
path
Internal path for the endpoint. Defaults to the function name if not specified.
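For instance, the internal path can be set explicitly instead of inheriting the function name (a sketch assuming a chute object like those in the surrounding examples; the names are illustrative):

```python
# Internal path is "/internal_echo" rather than the default "/echo"
# that would be derived from the function name.
@chute.cord(path="/internal_echo", public_api_path="/echo")
async def echo(self, text: str):
    return {"echo": text}
```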
stream
Enable streaming responses for real-time data transmission.
When to Use Streaming:
- Long-running text generation
- Real-time progress updates
- Token-by-token LLM output
- Large data processing
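On the wire, a streaming cord typically emits server-sent events: each frame is a data: line carrying a JSON payload, terminated by a blank line. A minimal stand-alone sketch of that framing (plain Python, no Chutes dependency):

```python
import json

def format_sse(payload: dict) -> str:
    # One server-sent-event frame: a "data:" line plus a blank separator line.
    return f"data: {json.dumps(payload)}\n\n"

def parse_sse(raw: str) -> list[dict]:
    # Recover the JSON payloads from a decoded SSE stream.
    events = []
    for frame in raw.split("\n\n"):
        if frame.startswith("data: "):
            events.append(json.loads(frame[len("data: "):]))
    return events

raw = format_sse({"token": "hello", "finished": False}) \
    + format_sse({"token": "", "finished": True})
events = parse_sse(raw)
```

This is the same frame layout the examples below produce with their yield statements.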
Streaming Example:
from fastapi.responses import StreamingResponse
import json

@chute.cord(
    public_api_path="/stream_generate",
    public_api_method="POST",
    stream=True
)
async def stream_text_generation(self, prompt: str):
    async def generate_stream():
        async for token in self.model.stream_generate(prompt):
            data = {"token": token, "finished": False}
            yield f"data: {json.dumps(data)}\n\n"
        # Send completion signal
        yield f"data: {json.dumps({'token': '', 'finished': True})}\n\n"
    return StreamingResponse(
        generate_stream(),
        media_type="text/event-stream"
    )
input_schema
Pydantic model for input validation and documentation.
Benefits:
- Automatic input validation
- Auto-generated API documentation
- Type safety
- Error handling
Example:
from pydantic import BaseModel, Field

class TextGenerationInput(BaseModel):
    prompt: str = Field(..., description="Text prompt for generation")
    max_tokens: int = Field(100, ge=1, le=2000, description="Maximum tokens")
    temperature: float = Field(0.7, ge=0.0, le=2.0, description="Sampling temperature")

@chute.cord(
    public_api_path="/generate",
    public_api_method="POST",
    input_schema=TextGenerationInput
)
async def generate_text(self, input_data: TextGenerationInput):
    return await self.model.generate(
        input_data.prompt,
        max_tokens=input_data.max_tokens,
        temperature=input_data.temperature
    )
minimal_input_schema
Simplified schema for basic API documentation and testing. Useful when you have a complex input model but want to show simpler examples.
Example:
class FullInput(BaseModel):
    prompt: str
    max_tokens: int = 100
    temperature: float = 0.7
    top_p: float = 0.9
    frequency_penalty: float = 0.0

class SimpleInput(BaseModel):
    prompt: str = Field(..., description="Just the prompt for quick testing")

@chute.cord(
    public_api_path="/generate",
    input_schema=FullInput,
    minimal_input_schema=SimpleInput
)
async def generate_flexible(self, input_data: FullInput):
    return await self.model.generate(**input_data.dict())
output_content_type
The MIME type of the response content. Auto-detected for JSON/text, but should be specified for binary responses.
Common Content Types:
- application/json - JSON responses (auto-detected)
- text/plain - Plain text (auto-detected)
- image/png, image/jpeg - Images
- audio/wav, audio/mpeg - Audio files
- text/event-stream - Server-sent events
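If you are unsure which MIME type fits a given file, Python's standard mimetypes module can suggest one (a convenience sketch, not part of the cord API):

```python
import mimetypes

# Guess content types for a few extensions, matching the list above.
guessed = {name: mimetypes.guess_type(name)[0]
           for name in ["result.json", "picture.png", "notes.txt"]}
```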
Image Response Example:
from fastapi import Response

@chute.cord(
    public_api_path="/generate_image",
    public_api_method="POST",
    output_content_type="image/png"
)
async def generate_image(self, prompt: str) -> Response:
    image_data = await self.image_model.generate(prompt)
    return Response(
        content=image_data,
        media_type="image/png",
        headers={"Content-Disposition": "inline; filename=generated.png"}
    )
Audio Response Example:
@chute.cord(
    public_api_path="/text_to_speech",
    public_api_method="POST",
    output_content_type="audio/wav"
)
async def text_to_speech(self, text: str) -> Response:
    audio_data = await self.tts_model.synthesize(text)
    return Response(
        content=audio_data,
        media_type="audio/wav"
    )
output_schema
Schema for output validation and documentation. Auto-extracted from the function's return type hints when not provided.
passthrough
Enable passthrough mode to forward requests to an underlying service.
Use Case: When you're running a service like vLLM or SGLang that has its own HTTP server, you can use passthrough to forward requests.
Example:
@chute.cord(
    public_api_path="/v1/completions",
    public_api_method="POST",
    passthrough=True,
    passthrough_path="/v1/completions",
    passthrough_port=8000
)
async def completions(self, **kwargs):
    # Request is forwarded to localhost:8000/v1/completions
    pass
passthrough_path
The path to forward requests to when using passthrough mode.
passthrough_port
The port to forward requests to when using passthrough mode. Defaults to 8000.
provision_timeout
Timeout in seconds to wait for the chute to provision. The default is 180 seconds (3 minutes).
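Chutes that load large models may need longer than the default to come up; the timeout can be raised per cord (a sketch; the endpoint name and value are illustrative):

```python
# Allow up to 10 minutes for provisioning before requests fail.
@chute.cord(public_api_path="/generate", provision_timeout=600)
async def generate(self, prompt: str):
    return await self.model.generate(prompt)
```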
Function Patterns
Simple Functions
# Basic function with primitive parameters
@chute.cord(public_api_path="/simple")
async def simple_endpoint(self, text: str, number: int = 10):
    return {"text": text, "number": number}

# Function with optional parameters
@chute.cord(public_api_path="/optional")
async def optional_params(
    self,
    required_param: str,
    optional_param: str = None,
    default_param: int = 100
):
    return {
        "required": required_param,
        "optional": optional_param,
        "default": default_param
    }
Schema-Based Functions
from pydantic import BaseModel

class MyInput(BaseModel):
    text: str
    count: int = 1

class MyOutput(BaseModel):
    results: list[str]

@chute.cord(
    public_api_path="/process",
    input_schema=MyInput,
    output_schema=MyOutput
)
async def process_with_schemas(self, data: MyInput) -> MyOutput:
    results = [data.text] * data.count
    return MyOutput(results=results)
File Responses
from fastapi.responses import FileResponse

@chute.cord(
    public_api_path="/download",
    public_api_method="GET",
    output_content_type="application/pdf"
)
async def download_file(self) -> FileResponse:
    return FileResponse(
        "report.pdf",
        media_type="application/pdf",
        filename="report.pdf"
    )
Error Handling
from fastapi import HTTPException

@chute.cord(public_api_path="/generate")
async def generate_with_errors(self, prompt: str):
    # Validate input
    if not prompt.strip():
        raise HTTPException(
            status_code=400,
            detail="Prompt cannot be empty"
        )
    if len(prompt) > 10000:
        raise HTTPException(
            status_code=400,
            detail="Prompt too long (max 10,000 characters)"
        )
    try:
        result = await self.model.generate(prompt)
        return {"generated_text": result}
    except Exception as e:
        raise HTTPException(
            status_code=500,
            detail=f"Generation failed: {str(e)}"
        )
Complete Example
from chutes.chute import Chute, NodeSelector
from chutes.image import Image
from pydantic import BaseModel, Field
from fastapi import HTTPException
from fastapi.responses import StreamingResponse
import json

image = (
    Image(username="myuser", name="text-gen", tag="1.0")
    .from_base("parachutes/python:3.12")
    .run_command("pip install transformers torch")
)

chute = Chute(
    username="myuser",
    name="text-generator",
    image=image,
    node_selector=NodeSelector(gpu_count=1, min_vram_gb_per_gpu=16),
    concurrency=4
)

class GenerationInput(BaseModel):
    prompt: str = Field(..., min_length=1, max_length=10000)
    max_tokens: int = Field(100, ge=1, le=2000)
    temperature: float = Field(0.7, ge=0.0, le=2.0)

class SimpleInput(BaseModel):
    prompt: str

@chute.on_startup()
async def load_model(self):
    from transformers import pipeline
    self.generator = pipeline("text-generation", model="gpt2", device=0)

@chute.cord(
    public_api_path="/generate",
    public_api_method="POST",
    input_schema=GenerationInput,
    minimal_input_schema=SimpleInput
)
async def generate(self, params: GenerationInput) -> dict:
    """Generate text from a prompt."""
    result = self.generator(
        params.prompt,
        max_length=params.max_tokens,
        temperature=params.temperature
    )[0]["generated_text"]
    return {
        "generated_text": result,
        "tokens_used": len(result.split())
    }

@chute.cord(
    public_api_path="/stream",
    public_api_method="POST",
    stream=True
)
async def stream_generate(self, prompt: str):
    """Stream text generation token by token."""
    async def generate():
        # Simulated streaming
        words = prompt.split()
        for word in words:
            yield f"data: {json.dumps({'token': word + ' '})}\n\n"
        yield f"data: {json.dumps({'finished': True})}\n\n"
    return StreamingResponse(generate(), media_type="text/event-stream")

@chute.cord(public_api_path="/health", public_api_method="GET")
async def health(self) -> dict:
    """Health check endpoint."""
    return {
        "status": "healthy",
        "model_loaded": hasattr(self, "generator")
    }
Best Practices
1. Use Descriptive Paths
# Good
@chute.cord(public_api_path="/generate_text")
@chute.cord(public_api_path="/analyze_sentiment")
# Avoid
@chute.cord(public_api_path="/api")
@chute.cord(public_api_path="/do")
2. Choose Appropriate Methods
# GET for read-only operations
@chute.cord(public_api_path="/models", public_api_method="GET")
# POST for AI generation/processing
@chute.cord(public_api_path="/generate", public_api_method="POST")
3. Use Input Schemas for Validation
from pydantic import BaseModel, Field

class ValidatedInput(BaseModel):
    prompt: str = Field(..., min_length=1, max_length=10000)
    temperature: float = Field(0.7, ge=0.0, le=2.0)

@chute.cord(public_api_path="/generate", input_schema=ValidatedInput)
async def generate(self, params: ValidatedInput):
    # Input is automatically validated
    pass
4. Handle Errors Gracefully
@chute.cord(public_api_path="/generate")
async def generate(self, prompt: str):
    if not prompt.strip():
        raise HTTPException(400, "Prompt cannot be empty")
    try:
        return await self.model.generate(prompt)
    except Exception as e:
        raise HTTPException(500, f"Generation failed: {e}")
5. Use Streaming for Long Operations
@chute.cord(public_api_path="/generate", stream=True)
async def stream_generate(self, prompt: str):
    async def stream():
        async for token in self.model.stream(prompt):
            yield f"data: {json.dumps({'token': token})}\n\n"
    return StreamingResponse(stream(), media_type="text/event-stream")
See Also
- Chute Class - Main chute documentation
- Job Decorator - Background job documentation
- Streaming Guide - Detailed streaming patterns