2027-02-03 scheduled
Per-customer LoRA fine-tunes
- Brain-B per-customer LoRA pipeline live
- Adapter swapping at session-start without cold-start penalty
- Self-serve via /voice/dashboard/finetune
- Custom voice cloning available for Enterprise tier (DP-SGD ε=4)
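The "no cold-start penalty" property above typically comes from keeping resolved adapters hot in a cache keyed by customer. A minimal sketch, assuming adapters are already on local disk; the directory layout, file name, and function names are all hypothetical, and a real swap would hand the weights to the serving engine rather than just resolve a path:

```python
from functools import lru_cache

ADAPTER_DIR = "/adapters"  # hypothetical local cache path

@lru_cache(maxsize=64)
def resolve_adapter(customer_id: str) -> str:
    # A real swap would load the LoRA weights into the engine; here we
    # only resolve the path, so repeat sessions hit the warm cache.
    return f"{ADAPTER_DIR}/{customer_id}/lora.safetensors"

def start_session(customer_id: str) -> dict:
    # Session start: attach the customer's adapter without reloading
    # the base model (the "no cold-start penalty" property).
    return {"customer": customer_id, "adapter": resolve_adapter(customer_id)}
```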
2026-09-07 v0.4.0 scheduled
Voice cloning + AutoVoice + Show HN #2
- Zero-shot voice cloning (3s reference audio)
- DP-SGD ε=4 + canary memorization tests
- AutoVoice: agent picks register based on caller signal
- Image input mid-call (vision)
- First FTE hire (CS Lead at $20K MRR threshold)
2026-08-08 v0.3.0 scheduled
Day 90: Image input + AutoVoice
- image_input.py event support (vision)
- AutoVoice register-switching
- Cartesia Sonic-3 TTS as default
- MCP tool bridging GA
2026-07-09 v0.2.0 scheduled
Day 60: Flagship reveal + outbound
- Flagship $0.09/min tier publicly revealed (auto-route on intent)
- Outbound dialer enabled (TCPA-gated, 7-day inbound-only minimum lifted)
- Pathway YAML primitive (declarative call flows)
- Brain-B L0 routing bot (intent classifier)
- Custom adapter $99/mo tier
- L4 backbone (soft barge-in canonicalized)
2026-06-09 v0.1.0 scheduled
Day 30 launch — founding-customer beta opens
- $0.04/min standalone voice API live (mini tier)
- OpenAI Realtime drop-in: 74/77 events at launch
- Phone numbers bundled ($1/mo local, $5/mo vanity)
- 0-byte audio retention
- 5 production personas live (Tony, Maria, Mike, Jenny, Sam)
- @toolkit-llm/voice + toolkit-voice SDKs published
- voicebench-10 OSS leaderboard live (Apache 2.0)
- Founding-customer beta opens (first 50, 30% off forever)
- Show HN: 'Voice AI at $0.04/min, phone numbers included, OpenAI Realtime drop-in'
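The drop-in claim means clients keep sending the same OpenAI Realtime event shapes to a swapped WebSocket URL. A sketch of the standard `session.update` client event; the endpoint below is an assumption, not a documented toolkit-llm URL:

```python
import json

REALTIME_URL = "wss://api.toolkit-llm.example/v1/realtime"  # hypothetical endpoint

def session_update(voice: str, instructions: str) -> str:
    # `session.update` is a standard OpenAI Realtime client event; a
    # drop-in server accepts the same JSON payload unchanged.
    return json.dumps({
        "type": "session.update",
        "session": {"voice": voice, "instructions": instructions},
    })
```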
2026-05-15 in-flight
First production pod stable + 4-pod bench
- PersonaPlex Blackwell image baked + pushed to GHCR
- TTFT p95 <250ms across A6000 / 5090 / PRO 6000 / 4090-failover
- voicebench-10 dry-run with 9 providers
- First 5 vanity DIDs acquired via Bandwidth
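A TTFT figure like "p95 <250ms" is the 95th percentile of per-request time-to-first-token samples. A minimal sketch using the nearest-rank method; the sample numbers in the test are made up for illustration:

```python
import math

def p95(samples_ms: list[float]) -> float:
    # Nearest-rank percentile: sort ascending, take the element at
    # rank ceil(0.95 * n), 1-indexed.
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]
```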
2026-05-12 in-flight
Bandwidth KYC + 3-carrier provisioning
- Bandwidth account active
- Telnyx + SignalWire backup carriers in queue
- +1-628-KLAW-NOW shared demo line provisioned
- 5 vanity DIDs targeted: Tony 415-CAR-TONY, Maria 305-CASA-NOW, Mike 510-BUILD-IT, Jenny 805-PLAY-JOY, Sam 702-NEW-TIRE
2026-05-08 shipped
Voice marketing surface — toolkit-llm.com/voice live
- /voice lander with embedded dialer, live calculator, and 5-line code diff
- /voice/start founding-beta waitlist (D1-backed)
- /voice/quickstart: 5-step tutorial with real cURL/JS/Python
- /voice/compat: full OpenAI Realtime event matrix (74/77 events)
- /voice/migrate hub with 4 codemod paths (OpenAI, Vapi, Bland, Twilio)
- /voice/voicebench leaderboard (Apache 2.0)
- /voice/personas detail pages (5 SSG'd profiles)
- /voice/trust: security & compliance posture (GDPR Article 9, anti-clone clause)
- /voice/faq: 21 honest dev objections answered
- /voice/sdks: npm + PyPI install + working samples
- 4 CF Pages Functions: /api/voice/{try-call, waitlist, health, waitlist-export}
- Hardened: SHA-256 phone hashing, geo-block, premium-prefix blocklist, KV rate limits
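"SHA-256 phone hashing" means storing only a digest of the caller's number, never the raw E.164 string. The exact scheme isn't stated above; a salted (HMAC) variant is sketched here, since the phone-number space is small enough that an unsalted digest can be brute-forced:

```python
import hashlib
import hmac

def hash_phone(e164: str, server_salt: bytes) -> str:
    # Keyed digest (HMAC-SHA-256) rather than bare SHA-256, so the
    # small phone-number space can't be reversed without the salt.
    return hmac.new(server_salt, e164.encode("utf-8"), hashlib.sha256).hexdigest()
```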
2026-05-07 shipped
OpenAI gpt-realtime-2 drop — competitive analysis complete
- Honest pricing pivot: 4-8× cheaper (down from 72%) — wedge survives
- H2 updated: '$0.04/min vs OpenAI's $0.16-0.33. We don't make you bring Twilio.'
- Compat sprint planned: 67/70 → 74/77 events (1 week)
- Briefing context positioning: 'fresh hourly' (counter to OpenAI's 1hr cache)
- Audio retention positioning strengthened (we keep 0 bytes; OpenAI defaults to 1hr)
2026-05-07 v0.0.7 shipped
Brain-B selection locked: Qwen3-4B-Instruct-2507
- Vast 4090 bench: TTFT p95 29ms, 0 leaks, 3.8GB VRAM
- Qwen3.5-9B-NVFP4 deferred pending torch 2.7+ image
- Qwen3.5-9B-AWQ-INT4 viable on a 4090 (24GB)
- Production-locked for Day 30
2026-05-01 shipped
Architecture: master + dumb-worker pool
- 2× A6000 RunPod Secure masters (always-on, hold customer state)
- Ephemeral workers self-register across RunPod / Lambda / Crusoe / Latitude / 5090 / PRO 6000 / 4090-failover
- CF Cron `gpu-scout` opportunistically scales the pool
- 3 GHCR image variants: ampere / ada / blackwell
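Worker self-registration usually reduces to a heartbeat table on the master: workers announce themselves on boot, keep pinging, and fall out of the pool when they go quiet. A sketch under those assumptions; class, method, and field names are all hypothetical:

```python
import time

class MasterRegistry:
    """Tracks ephemeral workers by last heartbeat (names assumed)."""

    def __init__(self, timeout_s: float = 30.0):
        self.timeout_s = timeout_s
        self.pool: dict[str, dict] = {}  # worker_id -> {"gpu": ..., "seen": ...}

    def register(self, worker_id: str, gpu: str) -> None:
        # Workers call this on boot, then periodically as a heartbeat.
        self.pool[worker_id] = {"gpu": gpu, "seen": time.monotonic()}

    def live(self) -> list[str]:
        # A scout job (e.g. a CF Cron) can compare this list against
        # demand and spin pods up or down.
        now = time.monotonic()
        return [w for w, meta in self.pool.items()
                if now - meta["seen"] < self.timeout_s]
```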
2026-04-15 shipped
OpenAI Realtime compat: 67/70 events shipped
- Full lifecycle: session.update, conversation.item.*, input_audio_buffer.*, response.*
- Function calling (server tools + client tools)
- Error handling with rate_limits.updated
- MCP tool bridging (toolkit extension)
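Event families like `conversation.item.*` suggest dispatch on the event's `type` field with a wildcard fallback per family. A minimal sketch of that pattern; the handler table and return values are illustrative stubs, not the shipped implementation:

```python
def dispatch(event: dict, handlers: dict) -> str:
    # Exact event type wins; otherwise fall back to its wildcard
    # family (e.g. `conversation.item.created` -> `conversation.item.*`).
    etype = event["type"]
    if etype in handlers:
        return handlers[etype](event)
    family = etype.rsplit(".", 1)[0] + ".*"
    if family in handlers:
        return handlers[family](event)
    return "ignored"

HANDLERS = {
    "session.updated": lambda e: "session-ack",
    "conversation.item.*": lambda e: "item",
    "rate_limits.updated": lambda e: "throttle",
}
```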
2026-04-01 shipped
Two-brain architecture (Helium + toolkit-chat)
- Helium = mouth (audio in/out, low latency)
- toolkit-chat = brain (logic, tools, memory)
- 6 meta-tools defined; 3-lane response loop
- Pre-warmed KV cache: 147ms avg first token
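The mouth/brain split implies a router that keeps latency-critical audio events on the Helium path and defers anything needing logic, tools, or memory to toolkit-chat. A speculative sketch; the event names in the fast-lane set are assumptions:

```python
FAST_LANE = {"audio.delta", "vad.speech_stopped", "barge_in"}  # assumed event names

def route(event_type: str) -> str:
    # Latency-critical audio events stay on the mouth (Helium);
    # everything else is deferred to the brain (toolkit-chat).
    return "helium" if event_type in FAST_LANE else "toolkit-chat"
```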