Quickstart
HZRelay routes real-time streams between any providers. Voice, tokens, events: one SDK, one session model, zero codec code.
[WHAT YOU GET]
Working AI phone call (Twilio → Deepgram → OpenAI → ElevenLabs → caller) in under 30 minutes. No sample rate config. No reconnect logic. No separate web/phone pipelines.
1.0 HOW IT WORKS
[ PIPELINE_DIAGRAM: Twilio → Deepgram → OpenAI → ElevenLabs → caller ]
HZRelay owns every arrow. You configure; we route.
HZRelay sits between your providers. Twilio sends mulaw 8kHz — we transcode to PCM 16kHz before Deepgram sees it. ElevenLabs returns audio — we transcode back to mulaw before Twilio plays it. You never touch codecs.
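For readers curious what the inbound transcode involves, here is a minimal sketch of it: standard G.711 mu-law decoding followed by a naive 2x linear-interpolation upsample. It is not HZRelay's implementation. The function names are made up for illustration, and a production resampler would use a proper low-pass filter.

```python
import struct
from typing import List


def mulaw_decode_byte(b: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    b = ~b & 0xFF                      # mu-law bytes are stored bit-inverted
    sign = b & 0x80
    exponent = (b >> 4) & 0x07
    mantissa = b & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def upsample_2x(samples: List[int]) -> List[int]:
    """Double the sample rate by inserting the midpoint between neighbours.

    Naive on purpose: no anti-imaging filter, last sample is repeated.
    """
    out: List[int] = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)
        out.append((s + nxt) // 2)
    return out


def mulaw8k_to_pcm16k(payload: bytes) -> bytes:
    """Hypothetical helper: 8 kHz mu-law bytes -> 16 kHz little-endian PCM16."""
    decoded = [mulaw_decode_byte(b) for b in payload]
    upsampled = upsample_2x(decoded)
    return struct.pack("<%dh" % len(upsampled), *upsampled)
```

A 20 ms frame of 8 kHz mu-law is 160 bytes. After this step it becomes 320 samples, or 640 bytes of 16-bit PCM.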
2.0 PREREQUISITES
3.0 INSTALL SDK
4.0 CREATE SESSION
Describe what you want. HZRelay handles how.
[CODEC_NOTE]
Twilio sends mulaw 8kHz. Deepgram expects PCM 16kHz. HZRelay transcodes at every adapter boundary — you never specify encoding, sample rate, or chunk size.
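Chunk size is the least obvious of the three. A sketch of what re-chunking at an adapter boundary could look like: buffer whatever arrives, emit fixed-size frames. The `Rechunker` class and the 160-byte frame size are assumptions for illustration, not part of the HZRelay SDK.

```python
from typing import List


class Rechunker:
    """Hypothetical buffer that turns arbitrary-size input into fixed-size frames."""

    def __init__(self, frame_bytes: int) -> None:
        if frame_bytes <= 0:
            raise ValueError("frame_bytes must be positive")
        self.frame_bytes = frame_bytes
        self._buf = bytearray()

    def push(self, data: bytes) -> List[bytes]:
        """Add incoming bytes and return every complete frame now available."""
        self._buf.extend(data)
        frames: List[bytes] = []
        while len(self._buf) >= self.frame_bytes:
            frames.append(bytes(self._buf[: self.frame_bytes]))
            del self._buf[: self.frame_bytes]
        return frames

    @property
    def pending(self) -> int:
        """Bytes held back because they do not yet fill a frame."""
        return len(self._buf)


# Usage: 160 bytes = 20 ms of 8 kHz mu-law (assumed frame size).
rechunker = Rechunker(frame_bytes=160)
first = rechunker.push(b"\xff" * 100)    # not enough for a frame yet
second = rechunker.push(b"\xff" * 250)   # 350 buffered -> two frames, 30 left
```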
5.0 CONFIGURE TWILIO
Point your Twilio phone number's voice webhook here. When a call arrives, Twilio opens a Media Stream WebSocket to HZRelay, and the session_id links that stream to your SDK session.
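For context, a Twilio voice webhook answers with TwiML, and `<Connect><Stream>` is the TwiML that opens a bidirectional Media Stream. The sketch below builds such a response. The WebSocket URL and the choice to pass session_id as a `<Parameter>` are assumptions; how HZRelay actually carries the session_id is not shown in this quickstart.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(stream_url: str, session_id: str) -> str:
    """Build a TwiML response that connects the call to a Media Stream.

    stream_url and the "session_id" parameter name are placeholders.
    """
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    stream = ET.SubElement(connect, "Stream", {"url": stream_url})
    # Twilio forwards <Parameter> values as customParameters in the
    # stream's "start" message; query strings on the url are not supported.
    ET.SubElement(stream, "Parameter", {"name": "session_id", "value": session_id})
    return ET.tostring(response, encoding="unicode")


# Hypothetical endpoint and session id, for illustration only.
twiml = build_stream_twiml("wss://relay.example.invalid/twilio/media", "sess_123")
```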
6.0 OPTIONAL: AGENT MODE
Add an agent: block and HZRelay runs transcript → LLM → TTS automatically (Depth 2). No event handlers needed.
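Roughly, agent mode replaces a loop like the one below, which you would otherwise wire up in your own event handlers. This is a conceptual sketch with stand-in callables. `run_agent_turns`, `llm`, and `tts` are illustrative names, not SDK functions.

```python
from typing import Callable, Iterable, Iterator


def run_agent_turns(
    transcripts: Iterable[str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> Iterator[bytes]:
    """For each final transcript: ask the LLM for a reply, synthesize it, yield audio."""
    for text in transcripts:
        text = text.strip()
        if not text:
            continue  # skip empty transcripts (silence, noise)
        reply = llm(text)
        yield tts(reply)


# Stand-ins so the sketch runs without any provider account.
def fake_llm(prompt: str) -> str:
    return "You said: " + prompt


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


audio_chunks = list(run_agent_turns(["hello", "   ", "book a table"], fake_llm, fake_tts))
```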
7.0 EVENTS REFERENCE
8.0 LATENCY METRICS
Every session records millisecond timestamps per stage. Call session.getMetrics() at any time, or hit the REST endpoint.
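Per-stage timestamps turn into latencies by subtracting neighbours. A small sketch of that arithmetic follows. The stage names and the dictionary shape are assumptions, since the payload returned by the metrics call is not shown in this quickstart.

```python
from typing import Dict, List


def stage_latencies(timestamps_ms: Dict[str, int], order: List[str]) -> Dict[str, int]:
    """Return the elapsed milliseconds between each pair of consecutive stages.

    Pairs with a missing stage are skipped rather than guessed.
    """
    deltas: Dict[str, int] = {}
    for prev, cur in zip(order, order[1:]):
        if prev in timestamps_ms and cur in timestamps_ms:
            deltas[prev + "->" + cur] = timestamps_ms[cur] - timestamps_ms[prev]
    return deltas


# Hypothetical stage names and made-up example values.
order = ["audio_in", "transcript", "llm_first_token", "tts_first_audio"]
sample = {
    "audio_in": 1000,
    "transcript": 1180,
    "llm_first_token": 1530,
    "tts_first_audio": 1710,
}
latencies = stage_latencies(sample, order)
total_ms = sample["tts_first_audio"] - sample["audio_in"]
```

With these example values the stage deltas are 180, 350, and 180 ms, which sum to the 710 ms end-to-end figure.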
→ NEXT STEPS
TWILIO_GUIDE: Full inbound call + TwiML setup.
ELEVENLABS_GUIDE: Barge-in, voice config, latency tuning.
TOKEN_FAN_OUT: Stream LLM output to multiple subscribers.
WEBHOOK_ROUTING: Normalize + fan-out any provider webhook.
AGENT_CONFIG: Full agent: block reference and examples.
LATENCY_TUNING: Provider swap impact and benchmark data.