Native fetch. Zero runtime dependencies. We refused to ship 5 MB of bundled HTTP-client code such as axios. The TypeScript SDK is under 250 KB minified; the Python SDK is under 150 KB on disk.
npm
@toolkit-llm/voice
v0.1.0-beta.1 · TypeScript · JavaScript · Deno · Bun
npm install @toolkit-llm/voice
PyPI
toolkit-voice
v0.1.0b1 · Python 3.10+ · asyncio native
pip install toolkit-voice
TypeScript / JavaScript
Install:
bash
# npm
npm install @toolkit-llm/voice
# or pnpm / yarn / bun
pnpm add @toolkit-llm/voice
yarn add @toolkit-llm/voice
bun add @toolkit-llm/voice
Realtime session:
TypeScript
import { Toolkit } from "@toolkit-llm/voice";

const client = new Toolkit({
  apiKey: process.env.TOOLKIT_API_KEY,
});

const session = await client.realtime.sessions.create({
  model: "toolkit-voice",
  instructions: "You are a helpful auto-dealer agent named Tony.",
  voice: "VARM1",
  modalities: ["text", "audio"],
});

session.on("response.audio.delta", (event) => {
  // event.delta is base64 PCM16 (or Opus if you set audio_format).
  // `speakers` is your own audio output sink.
  speakers.write(event.delta);
});

session.on("conversation.item.input_audio_transcription.completed", (event) => {
  console.log("Caller said:", event.transcript);
});

// `micChunk` is a chunk of audio captured from your microphone.
session.appendAudio(micChunk);
Outbound dial:
TypeScript
const call = await client.calls.create({
  to: "+14155551234",
  from: "+14152278669", // Your provisioned number
  persona: "tony",
  max_seconds: 60,
});

// Stream events as the call progresses
for await (const event of call.events()) {
  if (event.type === "call.ended") break;
}
Webhook verification:
TypeScript
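import express from "express";
import { verifyWebhook } from "@toolkit-llm/voice/webhooks";

const app = express();

// Express: express.raw() keeps the body as raw bytes for signature checking
app.post("/voice/event", express.raw({ type: "*/*" }), (req, res) => {
  const event = verifyWebhook({
    body: req.body, // raw bytes
    signature: req.header("X-Toolkit-Signature"),
    secret: process.env.TOOLKIT_WEBHOOK_SECRET,
    tolerance_s: 300, // 5-minute replay window
  });
  // event is now a typed Event union
  res.sendStatus(200); // acknowledge receipt so the request does not hang
});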
Python
Install:
bash
pip install toolkit-voice
# Or with uv (faster):
uv pip install toolkit-voice
Realtime session (asyncio):
Python
import asyncio

from toolkit import Toolkit


async def main():
    client = Toolkit()
    session = await client.realtime.sessions.create(
        model="toolkit-voice",
        instructions="You are a helpful auto-dealer agent named Tony.",
        voice="VARM1",
    )

    @session.on("response.audio.delta")
    def on_audio(event):
        # event.delta is base64 PCM16; `speakers` is your own audio sink
        speakers.write(event.delta)

    @session.on("conversation.item.input_audio_transcription.completed")
    def on_transcript(event):
        print(f"Caller said: {event.transcript}")

    # `mic_chunk` is a chunk of audio captured from your microphone
    await session.append_audio(mic_chunk)


asyncio.run(main())
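Since `event.delta` arrives as base64 PCM16, an audio sink will typically want it decoded first. The helper below is a minimal sketch of that step using only the standard library. `decode_pcm16` is a hypothetical name, not part of the SDK, and little-endian byte order is an assumption.

```python
import base64
import struct


def decode_pcm16(delta: str) -> list[int]:
    """Turn a base64 string of PCM16 audio into signed 16-bit samples."""
    raw = base64.b64decode(delta)
    # Each PCM16 sample is exactly two bytes.
    if len(raw) % 2:
        raise ValueError("PCM16 payload must have an even byte count")
    count = len(raw) // 2
    # "<h" = little-endian signed 16-bit integer (assumed byte order).
    return list(struct.unpack(f"<{count}h", raw))
```

If the sink accepts raw bytes, `base64.b64decode(event.delta)` alone may be enough; unpacking into integers is mainly useful for inspection, metering, or resampling.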
Outbound dial:
Python
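# Run inside an async function, with `client` created as above.
call = await client.calls.create(
    to="+14155551234",
    from_="+14152278669",  # Your provisioned number
    persona="tony",
    max_seconds=60,
)

# Stream events as the call progresses
async for event in call.events():
    if event.type == "call.ended":
        break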
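An event loop like the one above waits for as long as the stream stays open, so it can help to bound it with a client-side deadline. The sketch below shows one way to do that with `asyncio.wait_for`. `fake_events` is a stand-in for `call.events()`, and every event name other than `call.ended` is made up for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Event:
    type: str


async def fake_events() -> AsyncIterator[Event]:
    """Stand-in for call.events(); yields a fixed, hypothetical sequence."""
    for name in ("call.ringing", "call.answered", "call.ended", "never.seen"):
        await asyncio.sleep(0)
        yield Event(name)


async def collect_until_ended(events: AsyncIterator[Event],
                              timeout_s: float) -> list[str]:
    """Record event types until call.ended, or raise on timeout."""
    seen: list[str] = []

    async def drain() -> None:
        async for event in events:
            seen.append(event.type)
            if event.type == "call.ended":
                break

    # Raises asyncio.TimeoutError if the stream stalls past the deadline.
    await asyncio.wait_for(drain(), timeout=timeout_s)
    return seen
```

With `max_seconds=60` on the call, a deadline slightly above 60 seconds would be a natural choice, so the server-side limit normally fires first.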