AudioDojo

Hot, Public, TEE, Speech

One chute, 12 models, 13 endpoints, covering text-to-speech, voice cloning, voice design, transcription, denoising, source separation, speaker verification, voice activity detection (VAD), and language detection. The models include Kokoro-82M, Qwen3-TTS 1.7B, Whisper large-v3-turbo, NVIDIA Canary-Qwen 2.5B, and NVIDIA Parakeet TDT.

22.12K invocations
2 active instances
$1.80/hr
pro_6000
by vonkaiser
AudioDojo playground

AudioDojo bundles speech generation, voice cloning, transcription, speaker analysis, denoising, and source separation. Pick a task and the input form adapts to it.

Ultra-fast text-to-speech with built-in multilingual Kokoro voices.
