Unit Testing for the
Voice AI Era
Simulate thousands of concurrent calls. Detect hallucinations instantly. Score latency down to the millisecond. Works with any voice agent.
Write YAML. Run tests. Ship.
Decibench synthesizes caller audio, calls your agent over any protocol, transcribes the response, and scores it across 10 metrics — all from one command.
Three levels of depth.
Deterministic
Exact string matching, regex, keyword checks. Sub-millisecond. Runs entirely locally with zero API costs.
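To make the deterministic tier concrete, here is a minimal Python sketch of the three check types applied to one transcribed reply. It is an illustration only, not Decibench's implementation: `CheckResult`, `run_deterministic_checks`, and their parameters are hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str      # which check produced this result
    passed: bool   # True if the transcript satisfied the check


def run_deterministic_checks(transcript, exact=None, pattern=None, keywords=()):
    """Apply exact-match, regex, and keyword checks to one transcript.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    # Lowercase and collapse whitespace so spacing and casing do not cause false failures.
    normalized = " ".join(transcript.lower().split())
    results = []

    if exact is not None:
        expected = " ".join(exact.lower().split())
        results.append(CheckResult("exact", normalized == expected))

    if pattern is not None:
        found = re.search(pattern, transcript, re.IGNORECASE) is not None
        results.append(CheckResult("regex", found))

    for word in keywords:
        results.append(CheckResult(f"keyword:{word}", word.lower() in normalized))

    return results


if __name__ == "__main__":
    reply = "Your appointment is confirmed for 3:30 PM."
    for result in run_deterministic_checks(
        reply,
        pattern=r"\b\d{1,2}:\d{2}\s?(AM|PM)\b",
        keywords=["confirmed", "refund"],
    ):
        print(result.name, "PASS" if result.passed else "FAIL")
```

Checks like these use only the standard library, so they need no network access and no API key.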
FREE · ~ms per test
Semantic
LLM-as-Judge scores accuracy, compliance, and hallucination rates. Works with GPT-4o, Claude, Gemini, or Ollama.
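As a rough picture of the LLM-as-Judge pattern, the Python sketch below builds a grading prompt and parses a JSON verdict. Everything here is an assumption made for illustration: `build_judge_prompt`, `judge`, the prompt wording, and the verdict fields are hypothetical, and `call_model` stands in for whichever model client you use.

```python
import json


def build_judge_prompt(transcript, facts):
    """Assemble a grading prompt from known facts and the agent's reply (hypothetical format)."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return (
        "You are grading a voice agent's reply.\n"
        f"Known facts:\n{fact_lines}\n"
        f"Agent reply:\n{transcript}\n"
        'Answer with JSON: {"accurate": true|false, "hallucinated": true|false}'
    )


def judge(transcript, facts, call_model):
    """Ask a model for a verdict; call_model is any callable mapping prompt text to a JSON string."""
    raw = call_model(build_judge_prompt(transcript, facts))
    verdict = json.loads(raw)
    # Missing fields are treated as failures so an incomplete verdict never passes silently.
    return {
        "accurate": bool(verdict.get("accurate", False)),
        "hallucinated": bool(verdict.get("hallucinated", True)),
    }


if __name__ == "__main__":
    # A canned response stands in for a real model call in this sketch.
    def fake_model(prompt):
        return '{"accurate": true, "hallucinated": false}'

    print(judge("We open at 9 AM.", ["Store opens at 9 AM"], fake_model))
```

Passing the model in as a callable keeps the sketch independent of any one provider.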
~$0.01/call · ~2s per test
RAG-Augmented
Upload your knowledge base. Decibench auto-generates adversarial test suites that actively try to break your agent.
~$0.03/call · ~5s per test
10 metrics. Every call.
Every call is automatically scored across all applicable metrics. No configuration needed.
Works with your stack.
No SDK to install in your agent. Decibench connects to your agent — not the other way around.
Stop testing manually.
Start shipping with confidence.
Decibench is free, open source, and ready for production.