SDKs

Two SDKs.
Same shape as OpenAI Realtime.

Native fetch. Zero runtime dependencies. We refused to ship 5MB of axios and bundler baggage. The TypeScript SDK is under 250KB minified. The Python SDK is under 150KB on disk.

npm: @toolkit-llm/voice
v0.1.0-beta.1 · TypeScript · JavaScript · Deno · Bun
npm install @toolkit-llm/voice

PyPI: toolkit-voice
v0.1.0b1 · Python 3.10+ · asyncio native
pip install toolkit-voice

TypeScript / JavaScript

Install:

bash
# npm
npm install @toolkit-llm/voice

# or pnpm / yarn / bun
pnpm add @toolkit-llm/voice
yarn add @toolkit-llm/voice
bun add @toolkit-llm/voice

Realtime session:

TypeScript
import { Toolkit } from "@toolkit-llm/voice";

const client = new Toolkit({
  apiKey: process.env.TOOLKIT_API_KEY,
});

const session = await client.realtime.sessions.create({
  model: "toolkit-voice",
  instructions: "You are a helpful auto-dealer agent named Tony.",
  voice: "VARM1",
  modalities: ["text", "audio"],
});

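session.on("response.audio.delta", (event) => {
  // event.delta is base64 PCM16 (or opus if you set audio_format); decode before playback.
  // `speakers` stands in for your own audio output stream (Buffer is available in Node and Bun).
  speakers.write(Buffer.from(event.delta, "base64"));
});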

session.on("conversation.item.input_audio_transcription.completed", (event) => {
  console.log("Caller said:", event.transcript);
});

// micChunk is an audio chunk from your own microphone capture pipeline
session.appendAudio(micChunk);

Outbound dial:

TypeScript
const call = await client.calls.create({
  to: "+14155551234",
  from: "+14152278669",  // Your provisioned number
  persona: "tony",
  max_seconds: 60,
});

// Stream events as the call progresses
for await (const event of call.events()) {
  if (event.type === "call.ended") break;
}

Webhook verification:

TypeScript
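import express from "express";
import { verifyWebhook } from "@toolkit-llm/voice/webhooks";

const app = express();

// Express: express.raw() keeps the body as a Buffer so the signature is checked against the exact bytes
app.post("/voice/event", express.raw({ type: "*/*" }), (req, res) => {
  try {
    const event = verifyWebhook({
      body: req.body,                                  // raw bytes
      signature: req.header("X-Toolkit-Signature"),
      secret: process.env.TOOLKIT_WEBHOOK_SECRET,
      tolerance_s: 300,                                // 5min replay window
    });
    // event is now a typed Event union
    res.sendStatus(200);
  } catch (err) {
    // Assumes verifyWebhook throws on a bad or stale signature, as the Python SDK does
    res.sendStatus(401);
  }
});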

Python

Install:

bash
pip install toolkit-voice

# Or with uv (faster):
uv pip install toolkit-voice

Realtime session (asyncio):

Python
import asyncio
import base64
from toolkit import Toolkit

async def main():
    client = Toolkit()
    session = await client.realtime.sessions.create(
        model="toolkit-voice",
        instructions="You are a helpful auto-dealer agent named Tony.",
        voice="VARM1",
    )

    @session.on("response.audio.delta")
    def on_audio(event):
        # event.delta is base64 PCM16; decode before playback.
        # `speakers` stands in for your own audio output stream.
        speakers.write(base64.b64decode(event.delta))

    @session.on("conversation.item.input_audio_transcription.completed")
    def on_transcript(event):
        print(f"Caller said: {event.transcript}")

    # mic_chunk is an audio chunk from your own microphone capture pipeline
    await session.append_audio(mic_chunk)

asyncio.run(main())
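
The handlers above receive `event.delta` as base64 text, so it has to be decoded before it reaches an audio device. A minimal sketch of that step follows. It assumes little-endian 16-bit samples (the docs above only say "PCM16", so the byte order is an assumption), and `decode_pcm16_delta` is a hypothetical helper, not part of the SDK.

```python
import base64
import struct


def decode_pcm16_delta(delta_b64: str) -> list[int]:
    """Decode a base64 audio delta into signed 16-bit samples.

    Assumes little-endian PCM16; the byte order is not stated in the docs.
    """
    raw = base64.b64decode(delta_b64)
    if len(raw) % 2:
        raise ValueError("PCM16 payload must contain an even number of bytes")
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}h", raw))


# Round-trip three samples to show the shape of the data.
samples = [0, 1000, -1000]
encoded = base64.b64encode(struct.pack("<3h", *samples)).decode("ascii")
print(decode_pcm16_delta(encoded))  # prints [0, 1000, -1000]
```

In practice you would usually pass the decoded bytes straight to the audio device; unpacking into integers is mainly useful for metering or resampling.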

Outbound dial:

Python
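import asyncio
from toolkit import Toolkit

async def main():
    client = Toolkit()
    # Assumes calls.create is awaitable, like realtime.sessions.create above
    call = await client.calls.create(
        to="+14155551234",
        from_="+14152278669",  # your provisioned number
        persona="tony",
        max_seconds=60,
    )

    # Stream events as the call progresses
    async for event in call.events():
        if event.type == "call.ended":
            break

asyncio.run(main())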

Webhook verification (FastAPI):

Python
import os

from fastapi import FastAPI, Request, HTTPException
from toolkit.webhooks import verify_webhook, InvalidSignature

app = FastAPI()

@app.post("/voice/event")
async def voice_event(request: Request):
    raw_body = await request.body()  # raw bytes, before any JSON parsing
    signature = request.headers.get("X-Toolkit-Signature")
    if signature is None:
        raise HTTPException(401, "Missing signature")
    try:
        event = verify_webhook(
            body=raw_body,
            signature=signature,
            secret=os.environ["TOOLKIT_WEBHOOK_SECRET"],
        )
    except InvalidSignature:
        raise HTTPException(401, "Invalid signature")
    return {"ok": True}

What's in the box

  • Realtime sessions: WebSocket-based session creation, lifecycle, and event handling, with the same shape as OpenAI Realtime.
  • Outbound calls: Dial out by E.164 number, bind to a persona, stream events as the call progresses.
  • Numbers (DIDs): Buy, port in, bind to a persona, hot-swap the webhook URL.
  • Webhooks: HMAC-SHA256 signature verification with raw-body support and replay-window tolerance.
  • Tool calls: Same JSON-schema function-calling spec as OpenAI. Tool result streaming.
  • Auto-retry + backoff: Exponential backoff on 429/503, idempotency keys for POST endpoints.
  • Typed errors: 54 typed error subclasses (RateLimitError, AuthError, etc.), so you catch what you mean.
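
The retry item in the list above combines two ideas: exponential backoff and a stable idempotency key. The sketch below shows how they fit together in general. It is not the SDK's code: `RetryableError`, `backoff_delay`, `post_with_retry`, the jitter strategy, and the default timings are all hypothetical choices for illustration.

```python
import random
import time
import uuid
from typing import Callable


class RetryableError(Exception):
    """Stand-in for a 429 or 503 response; carries the HTTP status."""

    def __init__(self, status: int):
        super().__init__(f"retryable status {status}")
        self.status = status


def backoff_delay(attempt: int, base_s: float = 0.5, cap_s: float = 8.0) -> float:
    # Full jitter: pick uniformly in [0, min(cap, base * 2**attempt)].
    return random.uniform(0.0, min(cap_s, base_s * (2 ** attempt)))


def post_with_retry(send: Callable[[str], dict], max_attempts: int = 4,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # One key for the whole logical request, reused on every retry, so the
    # server can deduplicate a POST that succeeded but whose response was lost.
    idempotency_key = str(uuid.uuid4())
    for attempt in range(max_attempts):
        try:
            return send(idempotency_key)
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(backoff_delay(attempt))
    raise RuntimeError("unreachable")
```

Here `send` is whatever function performs the actual POST and attaches the key to the request. Generating the key once outside the loop is the important detail: a fresh key per attempt would defeat deduplication and could, for example, place the same outbound call twice.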

What we left out (intentionally)

  • axios / node-fetch / cross-fetch: We use native `fetch`. If your runtime doesn't have it, polyfill at the entry point.
  • EventEmitter polyfills: We use native `EventTarget`. Works in browser, Node 18+, Deno, Bun.
  • OAuth flows: Bearer-token auth only. No refresh logic. No token storage. Stateless.
  • Multi-tenancy primitives: You bring the user-mapping. We're an API, not a CRM.
  • TypeScript runtime types (zod, io-ts): We ship .d.ts. Validate at the boundary if you want; your call.

Source + issue tracker

Need a different language? Email founders@toolkit-llm.com. Go and Ruby SDKs are scoped for Day 90 if there's demand.