Unit Testing for the Voice AI Era

Simulate thousands of concurrent calls. Detect hallucinations instantly. Score latency down to the millisecond. Works with any voice agent.

$ pip install git+https://github.com/unforkopensource-org/decibench.git
10 Built-in Evaluators
8 Connectors Shipped
3 Testing Modes
0 Telemetry Calls

Write YAML. Run tests. Ship.

Decibench synthesizes caller audio, calls your agent over any protocol, transcribes the response, and scores it across 10 metrics — all from one command.

# 1. Install
$ pip install git+https://github.com/unforkopensource-org/decibench.git

# 2. Test the built-in demo agent (zero config)
$ decibench run target=demo suite=quick

# 3. Test YOUR agent
$ decibench run target=ws://localhost:8080/ws suite=standard

# 4. View results in the dashboard
$ decibench serve
✓ Dashboard running at http://localhost:8100

Three levels of depth.

Deterministic

Exact string matching, regex, keyword checks. Sub-millisecond. Runs entirely locally with zero API costs.

FREE · ~ms per test
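
To make the deterministic level concrete, here is a minimal Python sketch of the three check types named above (exact match, regex, keyword). The function names are hypothetical illustrations, not Decibench's API; the normalization rules (lowercasing, collapsing whitespace) are assumptions.

```python
import re


def exact_match(transcript: str, expected: str) -> bool:
    """Compare two strings after lowercasing and collapsing whitespace (assumed normalization)."""
    def normalize(text: str) -> str:
        return " ".join(text.lower().split())

    return normalize(transcript) == normalize(expected)


def regex_match(transcript: str, pattern: str) -> bool:
    """Return True if the pattern occurs anywhere in the transcript, ignoring case."""
    return re.search(pattern, transcript, flags=re.IGNORECASE) is not None


def contains_keywords(transcript: str, keywords: list[str]) -> bool:
    """Return True only if every keyword appears as a substring, ignoring case."""
    lowered = transcript.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


if __name__ == "__main__":
    reply = "Your appointment is confirmed for 3:30 PM on Friday."
    print(exact_match(reply, "your appointment is  confirmed for 3:30 pm on friday."))
    print(regex_match(reply, r"\b\d{1,2}:\d{2}\s?(am|pm)\b"))
    print(contains_keywords(reply, ["appointment", "confirmed"]))
```

Checks like these need no network access or model calls, which is why this level can run locally at no cost.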

Semantic

LLM-as-Judge scores accuracy, compliance, and hallucination rates. Works with GPT-4o, Claude, Gemini, or Ollama.

~$0.01/call · ~2s per test

RAG-Augmented

Upload your knowledge base. Decibench auto-generates adversarial test suites that actively try to break your agent.

~$0.03/call · ~5s per test
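
For budgeting, the per-test figures quoted for the three levels can be turned into a rough estimate. The sketch below is plain arithmetic on those numbers. The mode keys, the `estimate` helper, the 0.001 s stand-in for "~ms per test", and the assumption that tests divide evenly across parallel workers are all illustrative, not part of Decibench.

```python
import math

# Approximate per-test figures from the three levels above.
# 0.001 s is an assumed stand-in for "~ms per test".
MODES = {
    "deterministic": {"usd_per_test": 0.00, "seconds_per_test": 0.001},
    "semantic": {"usd_per_test": 0.01, "seconds_per_test": 2.0},
    "rag_augmented": {"usd_per_test": 0.03, "seconds_per_test": 5.0},
}


def estimate(mode: str, num_tests: int, concurrency: int = 1) -> tuple[float, float]:
    """Return (approximate cost in USD, approximate wall-clock seconds).

    Assumes tests run in batches of `concurrency` and each batch takes
    one per-test duration; real runs will vary.
    """
    if num_tests < 0 or concurrency < 1:
        raise ValueError("num_tests must be >= 0 and concurrency >= 1")
    figures = MODES[mode]
    cost = figures["usd_per_test"] * num_tests
    batches = math.ceil(num_tests / concurrency)
    return cost, batches * figures["seconds_per_test"]


if __name__ == "__main__":
    for mode in MODES:
        cost, seconds = estimate(mode, num_tests=1000, concurrency=10)
        print(f"{mode}: about ${cost:.2f}, about {seconds:.0f}s")
```

Under these assumptions, a 1,000-test semantic run costs on the order of ten dollars, so it is worth reserving the deeper levels for cases the deterministic checks cannot cover.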

10 metrics. Every call.

Every call is automatically scored across all applicable metrics. No configuration needed.

Latency: p50 / p90 / p95 / TTFB
WER / CER: Word & character error rates
Hallucination: LLM-graded factual accuracy
Task Completion: Did the agent achieve the goal?
Compliance: Mandatory disclosures & disclaimers
Interruption: Barge-in handling robustness
Silence: Dead air detection
MOS / STOI: Audio quality & intelligibility
Composite Score: Weighted aggregate, single number
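
Two of these metrics have standard textbook definitions that are easy to show in code. The sketch below computes WER and CER as edit distance divided by reference length, and latency percentiles with the nearest-rank method. It illustrates the definitions only; how Decibench normalizes text or computes percentiles internally is not stated here, and the function names are hypothetical.

```python
import math
from typing import Sequence


def edit_distance(reference: Sequence, hypothesis: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_item == hyp_item else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.lower().split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    print(wer("book a table for two", "book the table for two"))
    latencies_ms = [120, 80, 200, 150, 100]
    print(percentile(latencies_ms, 50), percentile(latencies_ms, 95))
```

With only a handful of samples, p90 and p95 collapse onto the slowest call, so tail percentiles become meaningful only over larger runs.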

Works with your stack.

No SDK to install in your agent. Decibench connects to your agent — not the other way around.

WebSocket: ws://
ElevenLabs: elevenlabs://
Twilio Mock: twilio://
HTTP: http://
Process: exec:"…"
Vapi: vapi:// 🧪
Retell: retell:// 🧪
LiveKit: 📋

Stop testing manually.
Start shipping with confidence.

Decibench is free, open source, and ready for production.
