[ GETTING_STARTED ]

Quickstart

SDK v0.1 — STABLE

HZRelay routes real-time streams between providers. Voice, tokens, events: one SDK, one session model, zero codec code.

[WHAT YOU GET]

Working AI phone call (Twilio → Deepgram → OpenAI → ElevenLabs → caller) in under 30 minutes. No sample rate config. No reconnect logic. No separate web/phone pipelines.

1.0 HOW IT WORKS

[ PIPELINE_DIAGRAM ]

INBOUND (Twilio) → NORMALIZE (codec) → STT (Deepgram) → LLM (OpenAI) → TTS (ElevenLabs) → OUTBOUND (Twilio)

HZRelay owns every arrow. You configure; we route.

HZRelay sits between your providers. Twilio sends mulaw 8kHz — we transcode to PCM 16kHz before Deepgram sees it. ElevenLabs returns audio — we transcode back to mulaw before Twilio plays it. You never touch codecs.
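To make the transcoding step concrete, here is a minimal sketch of what "mulaw 8kHz to PCM 16kHz" involves. It uses the standard G.711 mu-law expansion and a naive linear-interpolation 2x upsampler. The function names are hypothetical and this is not HZRelay's actual implementation; a production resampler would normally apply a proper low-pass filter.

```typescript
// Illustrative only: names are hypothetical, not part of the HZRelay SDK.

// Expand one G.711 mu-law byte to a signed 16-bit linear PCM sample.
function decodeMulawSample(byte: number): number {
  const u = ~byte & 0xff;            // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;             // top bit: sign
  const exponent = (u >> 4) & 0x07;  // next 3 bits: segment
  const mantissa = u & 0x0f;         // low 4 bits: step within segment
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Naive 2x upsample (8 kHz -> 16 kHz): keep each sample and insert the
// midpoint between it and the next one.
function upsample2x(input: Int16Array): Int16Array {
  const out = new Int16Array(input.length * 2);
  for (let i = 0; i < input.length; i++) {
    const cur = input[i];
    const next = i + 1 < input.length ? input[i + 1] : cur;
    out[2 * i] = cur;
    out[2 * i + 1] = Math.round((cur + next) / 2);
  }
  return out;
}

// Full path for one inbound frame: mu-law 8 kHz bytes -> PCM16 at 16 kHz.
function mulaw8kToPcm16k(frame: Uint8Array): Int16Array {
  const pcm8k = new Int16Array(frame.length);
  for (let i = 0; i < frame.length; i++) {
    pcm8k[i] = decodeMulawSample(frame[i]);
  }
  return upsample2x(pcm8k);
}
```

The return path (TTS audio back to mulaw for Twilio) would be the mirror image: downsample, then compress each sample.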

2.0 PREREQUISITES

Twilio       PSTN     Phone number + Media Streams enabled
Deepgram     STT      API key (nova-2 model recommended)
OpenAI       LLM      API key (gpt-4o-mini works well)
ElevenLabs   TTS      API key + voice ID
HZRelay      ROUTER   Free API key (get it below)

3.0 INSTALL SDK

terminal

$ npm install @hzrelay/sdk

# or

$ pip install hzrelay

4.0 CREATE SESSION

Describe what you want. HZRelay handles how.

agent.ts — DEPTH 1 (you drive the loop)

import { createSession } from '@hzrelay/sdk'

const session = createSession({
  apiKey: 'hz_...',

  // ── inbound ──────────────────────────────
  inbound: { type: 'twilio' },

  // ── providers (your keys, never stored) ──
  stt: { provider: 'deepgram', apiKey: process.env.DG_KEY },
  llm: { provider: 'openai', apiKey: process.env.OAI_KEY },
  tts: { provider: 'elevenlabs', apiKey: process.env.EL_KEY },

  outbound: { type: 'twilio' },
});

// pipe events into your own agent logic (myAgent is your function);
// event names follow 7.0 EVENTS REFERENCE
session.on('transcript.final', (e) => myAgent(e.text))
session.on('llm.response', (e) => session.speak(e.text))

[CODEC_NOTE]

Twilio sends mulaw 8kHz. Deepgram expects PCM 16kHz. HZRelay transcodes at every adapter boundary — you never specify encoding, sample rate, or chunk size.
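For a sense of the chunk sizes the adapters have to reconcile, the arithmetic can be sketched as below. The 20 ms frame length is an assumption chosen for illustration, and frameBytes is a hypothetical helper, not an SDK function.

```typescript
// Illustrative arithmetic only; frameBytes is not part of the HZRelay SDK.
// bytes per frame = samples per second * frame length in seconds * bytes per sample
function frameBytes(sampleRateHz: number, bytesPerSample: number, frameMs: number): number {
  return (sampleRateHz * frameMs / 1000) * bytesPerSample;
}

// mu-law is 1 byte per sample; 16-bit PCM is 2 bytes per sample.
const mulaw8k = frameBytes(8000, 1, 20);   // 160 bytes per 20 ms frame
const pcm16k = frameBytes(16000, 2, 20);   // 640 bytes per 20 ms frame
```

Under these assumptions the same 20 ms of audio is four times larger after transcoding, which is why chunk size cannot simply be passed through between providers.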

5.0 CONFIGURE TWILIO

Point your Twilio phone number's Voice webhook at the URL below. When a call arrives, Twilio opens a Media Stream WebSocket to HZRelay. The session_id query parameter links the call to your SDK session.

Twilio console → Phone Numbers → Voice webhook URL

// When a call comes in, Twilio requests this webhook URL:
https://your-server.com/twilio/stream?session_id={session.id}

// TwiML response (tells Twilio to open Media Stream)
<Response>
  <Connect>
    <Stream url="wss://..." />
  </Connect>
</Response>
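A minimal sketch of how the session_id could be carried in the webhook URL and read back on your server. Both helper names are hypothetical, and the /twilio/stream path is taken from the example URL above; adapt it to your own routing.

```typescript
// Hypothetical helpers, not part of the HZRelay SDK.

// Build the Voice webhook URL for a given session.
function buildWebhookUrl(baseUrl: string, sessionId: string): string {
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${base}/twilio/stream?session_id=${encodeURIComponent(sessionId)}`;
}

// Read session_id back out of an incoming request URL (path + query).
function extractSessionId(requestUrl: string): string | null {
  const query = requestUrl.split('?')[1] ?? '';
  for (const pair of query.split('&')) {
    const [key, value = ''] = pair.split('=');
    if (key === 'session_id' && value !== '') {
      return decodeURIComponent(value);
    }
  }
  return null; // no session_id present
}
```

Usage: pass buildWebhookUrl('https://your-server.com', session.id) to the Twilio console field, and call extractSessionId(req.url) in the handler that returns the TwiML.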

6.0 OPTIONAL: AGENT MODE

Add an agent: block — HZRelay runs transcript → LLM → TTS automatically (Depth 2). No event handlers needed.

agent.ts — DEPTH 2 (we drive the loop)

const session = createSession({
  // ... same provider config ...

  // add this block ↓
  agent: {
    systemPrompt: 'You are a helpful scheduling assistant...',
    memory: 'sliding-window',  // context trimming, automatic
    turnDetection: 'silence',  // 'semantic' in Phase 3
    bargeIn: true,             // caller interrupts → TTS flushes
  },
});

// that's it — caller hears AI respond in ~762ms
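As an illustration of what a 'sliding-window' memory setting generally means, the sketch below keeps the system prompt and only the most recent messages. This is a generic version of the technique under assumed types; how HZRelay actually trims context (by message count, tokens, or something else) is not specified here.

```typescript
// Generic sliding-window trimming; types and names are assumptions.
type Role = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: Role;
  content: string;
}

// Keep every system message plus the last `maxMessages` dialogue messages.
function trimToWindow(history: ChatMessage[], maxMessages: number): ChatMessage[] {
  const system = history.filter((m) => m.role === 'system');
  const dialogue = history.filter((m) => m.role !== 'system');
  // slice(-0) would return the whole array, so handle zero explicitly.
  const kept = maxMessages > 0 ? dialogue.slice(-maxMessages) : [];
  return [...system, ...kept];
}
```

Trimming by message count is the simplest variant; a token-budget window follows the same shape with a different stopping rule.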

7.0 EVENTS REFERENCE

EVENT                PAYLOAD                    WHEN
session.created      { session_id }             Session ready, pipeline active
speech.start         { ts }                     VAD detects speech began
speech.end           { ts, duration_ms }        VAD detects silence threshold hit
transcript.interim   { text, confidence }       Streaming partial STT result
transcript.final     { text, latency_ms }       Final STT result, triggers LLM
llm.token_start      { latency_ms }             First LLM token received
llm.response         { text }                   Full LLM response complete
tts.audio_start      { latency_ms }             First TTS audio frame ready
tts.interrupted      null                       Barge-in flushed TTS buffer
call.started         { stream_sid, call_sid }   Twilio call connected
call.ended           { stream_sid }             Call terminated
error                { code, provider, msg }    Adapter error (check retryable)
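The events above lend themselves to a discriminated union, which lets the compiler check payload access per event. The sketch below is an assumption about shape: the { type, payload } envelope and the field types (for example, code as a string) are not confirmed by the SDK, only the event names and payload keys come from the table.

```typescript
// Assumed envelope and field types; only names and keys come from the table.
type RelayEvent =
  | { type: 'session.created'; payload: { session_id: string } }
  | { type: 'speech.start'; payload: { ts: number } }
  | { type: 'speech.end'; payload: { ts: number; duration_ms: number } }
  | { type: 'transcript.interim'; payload: { text: string; confidence: number } }
  | { type: 'transcript.final'; payload: { text: string; latency_ms: number } }
  | { type: 'llm.token_start'; payload: { latency_ms: number } }
  | { type: 'llm.response'; payload: { text: string } }
  | { type: 'tts.audio_start'; payload: { latency_ms: number } }
  | { type: 'tts.interrupted'; payload: null }
  | { type: 'call.started'; payload: { stream_sid: string; call_sid: string } }
  | { type: 'call.ended'; payload: { stream_sid: string } }
  | { type: 'error'; payload: { code: string; provider: string; msg: string } };

// Produce a one-line log message; the switch narrows the payload type.
function describeEvent(e: RelayEvent): string {
  switch (e.type) {
    case 'transcript.final':
      return `caller said "${e.payload.text}" (STT ${e.payload.latency_ms} ms)`;
    case 'llm.response':
      return `agent replied "${e.payload.text}"`;
    case 'tts.interrupted':
      return 'caller barged in, TTS flushed';
    case 'error':
      return `${e.payload.provider} error ${e.payload.code}: ${e.payload.msg}`;
    default:
      return e.type; // remaining events: log the name only
  }
}
```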

8.0 LATENCY METRICS

Every session records millisecond timestamps per stage. Call session.getMetrics() anytime — or hit the REST endpoint.

metrics response — GET /voice/metrics?session_id=a3f9b2c1

{
  "audio_received_ms": 0,
  "stt_start_ms": 12,
  "stt_final_ms": 310,
  "llm_start_ms": 312,
  "llm_first_token_ms": 680,
  "tts_start_ms": 685,
  "tts_first_audio_ms": 760,
  "audio_sent_ms": 762
}
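Since the timestamps are cumulative offsets from audio_received_ms, per-stage durations come from subtracting adjacent pairs. A minimal sketch, assuming the response has exactly the fields shown above; stageDurations is a hypothetical helper, not an SDK function.

```typescript
// Field names match the sample response; the helper itself is hypothetical.
interface StageMetrics {
  audio_received_ms: number;
  stt_start_ms: number;
  stt_final_ms: number;
  llm_start_ms: number;
  llm_first_token_ms: number;
  tts_start_ms: number;
  tts_first_audio_ms: number;
  audio_sent_ms: number;
}

// Break the end-to-end latency into per-stage durations.
function stageDurations(m: StageMetrics) {
  const stt = m.stt_final_ms - m.stt_start_ms;
  const llm = m.llm_first_token_ms - m.llm_start_ms;
  const tts = m.tts_first_audio_ms - m.tts_start_ms;
  const total = m.audio_sent_ms - m.audio_received_ms;
  return {
    stt_ms: stt,
    llm_first_token_ms: llm,
    tts_first_audio_ms: tts,
    total_ms: total,
    overhead_ms: total - (stt + llm + tts), // routing and hand-off time
  };
}
```

For the sample response this gives 298 ms for STT, 368 ms to the first LLM token, 75 ms to the first TTS audio, and 21 ms of overhead, totalling 762 ms.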

→ NEXT STEPS

TWILIO_GUIDE

ELEVENLABS_GUIDE

TOKEN_FAN_OUT

WEBHOOK_ROUTING

AGENT_CONFIG

LATENCY_TUNING