No marketing fog. This is the literal feature list of the RHYTHMIX Studio CLI engine that ships today.
Reads duration, sample rate, channel count and codec via ffprobe. Supports MP3, WAV, M4A, FLAC and anything ffmpeg knows.
Samples per-second RMS loudness across the whole track and groups loud/quiet runs into intro / verse / chorus / bridge / outro. Loud passages become chorus, quiet passages become verse.
--flat-plan to disableSection and scene boundaries snap to the nearest beat. Pass --bpm or install aubio for auto-detect.
5+ prompts per role (intro/verse/chorus/bridge/outro), deterministically shuffled so consecutive scenes in the same section never repeat a prompt.
| Mode | Cost | Visual quality | Requires |
|---|---|---|---|
| Replicate (paid) | ~$8–25 / render | AI-generated, theme-matched | REPLICATE_API_TOKEN |
| Pexels (free) | $0 | Real stock footage | PEXELS_API_KEY (free signup) |
| Local (free) | $0 | Whatever clips you supply | Folder of .mp4 files |
| Model | Provider | Best for | Max clip |
|---|---|---|---|
| Kling v2 | Kuaishou | Cinematic chorus shots | 10s |
| Hunyuan Video | Tencent | Motion-heavy verses | 5s |
| Luma Ray | Luma Labs | Dreamy bridges, smooth camera | 5s |
| MiniMax Hailuo | MiniMax | Character / expressive | 6s |
The planner picks a model per section automatically. Override with --model <name>.
16:9 landscape (1280×720), 9:16 portrait (720×1280) for TikTok/Reels, or 1:1 square (1024×1024) for Insta feed. Auto-crop fills the frame regardless of source clip dimensions.
0.25s xfade between every scene with mathematical duration compensation so the final video matches your audio length exactly. --no-transitions for hard cuts.
Each scene fetch retries up to 3× with exponential backoff. Finished scenes are kept on disk; a failed render resumes from the last successful scene instead of re-spending.
See the per-scene model breakdown and estimated cost before any API call fires. Edit plan.json to tweak prompts or models, then render-from-plan.
27 tests covering loudness curve, structure detection, all 3 aspect ratios, crossfade + cut modes. npm test runs the full pipeline on a synthetic track.
Pure Node 20 + ffmpeg. The whole zip is 20 KB. No node_modules, no supply-chain risk, no version drift.
Honors retry_after from rate-limit responses with jitter. Long renders won't get throttled out.
Browser upload + render queue. No Node, no terminal, no env vars. Same engine underneath.
Per-scene prompts generated from theme + section + song mood instead of template recipes.
Whip-pan / dip-to-color transitions that fire on downbeats in chorus.
Section boundaries from spectral novelty, not just loudness — picks up texture changes loudness misses.
30-day money-back guarantee. No subscription.
Buy RHYTHMIX — $149 →