Q: How many test scenarios has the Safigo plumber receptionist been graded against?

332 scenarios, each graded by an independent LLM judge: 84 routine scenarios run with three random seeds each (252 runs) plus an 80-scenario adversarial pack. The aggregate pass rate is 93.9 percent across more than 1,500 individual rule grades, with zero P0 privacy or safety violations.

Safigo Receipts is the public test-results page for Safigo Reception, the Canadian-built AI receptionist by Safigo. The plumber receptionist has passed its alpha gate: 332 graded test scenarios, a 93.9 percent aggregate pass rate, and zero P0 privacy or safety violations. It has been deployed in production on Fly.io since 2026-05-05. HVAC, roofer, and electrician receptionists are in pilot, not generally available; each will run a full alpha gate before being approved for paying customers. Four real PSTN call recordings with full transcripts are published at safigo.ai/reception/plumber/. We deliberately do not claim a 100 percent pass rate on any trade, production readiness for HVAC, roofer, or electrician, Quebec service, or PHI compliance.

The headline numbers, and what we deliberately do not claim

Receipts. Real test results, on the record.

Q: Has Safigo's HVAC, roofer, or electrician receptionist passed an alpha gate?

Not yet. As of 2026-05-05, the plumber receptionist is the only one that has passed an alpha gate and the only trade deployed in production. HVAC, roofer, and electrician receptionists are in pilot, not generally available. Each will run a full alpha gate before being approved for paying customers.

Most AI receptionist vendors quote a glossy pass rate without showing the test set. We publish the headline numbers from our graded test runs, the things we deliberately do not claim, and links to the real PSTN call audio. Updated as new trades pass alpha gate.

Last updated 2026-05-10

These results are from Safigo Reception, our done-for-you AI answering service for plumbers and trades. The product itself, with pricing and setup details, lives on the product page. The page you are reading is the public record of how it actually performs against graded test scenarios. If you want the AI receptionist for plumbers specifically, it has its own page with real call audio.

Plumber alpha gate

The production plumber receptionist was graded against 84 routine scenarios run at three random seeds, plus an 80-scenario adversarial pack. Each scenario is graded against 4 to 6 rubric rules by an independent LLM judge.
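The scenario count can be checked with a few lines of arithmetic. This is only a sanity check on the published figures; the constant names are ours for illustration, and the rule-grade range simply applies the stated 4 to 6 rules per scenario.

```python
# Figures taken from the paragraph above; the names are illustrative.
ROUTINE_SCENARIOS = 84
SEEDS = 3
ADVERSARIAL_SCENARIOS = 80
RULES_MIN, RULES_MAX = 4, 6  # rubric rules graded per scenario

total_scenarios = ROUTINE_SCENARIOS * SEEDS + ADVERSARIAL_SCENARIOS
rule_grades_low = total_scenarios * RULES_MIN
rule_grades_high = total_scenarios * RULES_MAX

print(total_scenarios)                    # 332
print(rule_grades_low, rule_grades_high)  # 1328 1992
```

The 4-to-6 rule range bounds the number of individual rule grades between 1,328 and 1,992, which is consistent with the "more than 1,500" figure quoted on this page.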

332

Scenarios graded

93.9%

Aggregate pass rate

0

P0 privacy or safety violations

The adversarial pack covers owner-name leaks, social-engineering attempts, false-promise extraction, language-switching attacks, and emergency handling under pressure. The plumber receptionist met our internal adversarial bar across the pack.

Status: Approved for the first paying-customer pilot on 2026-05-04. Deployed to Fly.io safigo-reception (sjc) on 2026-05-05.

Other trades

HVAC, roofer, and electrician receptionists are in pilot, not generally available. None is approved for paying customers yet. Each will run a full alpha gate before deployment.

Real PSTN call audio

Four production-grade calls with full transcripts are published on the plumber product page at safigo.ai/reception/plumber/. Each recording is marked up with AudioObject schema so AI search engines can find and quote it. Scenarios covered: emergency leak, after-hours triage, out-of-area routing, and second opinion.
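For readers unfamiliar with AudioObject markup, here is a rough sketch of what JSON-LD for one recording could look like, built in Python. It is an illustration, not the markup served on the product page: the `audio_object` helper, the example.com audio URL, and the placeholder transcript are all hypothetical. The property names (`name`, `contentUrl`, `encodingFormat`, `transcript`) are standard schema.org AudioObject properties.

```python
import json

def audio_object(name, content_url, transcript, encoding_format="audio/mpeg"):
    """Build a schema.org AudioObject as a plain dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "AudioObject",
        "name": name,
        "contentUrl": content_url,
        "encodingFormat": encoding_format,
        "transcript": transcript,
    }

# Placeholder values: the URL and transcript below are not real assets.
markup = audio_object(
    name="Emergency leak call",
    content_url="https://example.com/audio/emergency-leak.mp3",
    transcript="(full call transcript goes here)",
)

# The serialized string would sit inside a <script type="application/ld+json"> tag.
json_ld = json.dumps(markup, indent=2)
print(json_ld)
```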

What we deliberately do not claim

We do not claim 100 percent pass on any trade. Plumber is at 93.9 percent. We will publish the next gate result rather than hide it.
We do not claim production readiness for HVAC, roofer, or electrician. Smoke evals are a fast triage signal, not an alpha gate. No trade is approved for paying customers without an alpha gate run.
We do not claim Quebec service. The multilingual claim covers eleven languages including French for non-Quebec callers (Acadian, Franco-Ontarian, Franco-Manitoban, Franco-Albertan).
We do not claim PHI compliance. Healthcare receptionists are out of scope.
We do not publish unit economics or internal cost data. That is between us and our customers.

Methodology

Each scenario is graded by an independent LLM judge against a rubric of rules tiered as P0 (critical, e.g. no false bookings, no owner-name leak, correct emergency triage) and high-priority. The plumber alpha gate scored zero P0 violations. See our methodology page for how we calculate the underlying customer-impact statistics that the rubric is grounded in.
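To make the tiering concrete, the sketch below shows one way per-rule verdicts could be rolled up into a pass rate and a P0 violation count. It is a minimal illustration under our own assumptions, not the production grader: the `RuleGrade` type, the tier labels, the toy data, and the `min_pass_rate` threshold are all hypothetical, and this page does not publish the actual gate threshold.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RuleGrade:
    scenario_id: str
    rule: str
    tier: str      # "P0" or "high" (hypothetical labels)
    passed: bool   # verdict returned by the judge for this rule

def summarize(grades):
    """Aggregate per-rule verdicts into headline numbers."""
    passed = sum(1 for g in grades if g.passed)
    p0_violations = sum(1 for g in grades if g.tier == "P0" and not g.passed)
    return {
        "rule_grades": len(grades),
        "pass_rate": passed / len(grades) if grades else 0.0,
        "p0_violations": p0_violations,
    }

def gate_passes(summary, min_pass_rate):
    """Any P0 violation fails the gate, regardless of the pass rate."""
    return summary["p0_violations"] == 0 and summary["pass_rate"] >= min_pass_rate

# Toy data for illustration only; these are not real results.
toy_grades = [
    RuleGrade("emergency-leak", "correct emergency triage", "P0", True),
    RuleGrade("emergency-leak", "no false bookings", "P0", True),
    RuleGrade("emergency-leak", "collects callback number", "high", False),
    RuleGrade("out-of-area", "no owner-name leak", "P0", True),
]

summary = summarize(toy_grades)
print(summary)
print(gate_passes(summary, min_pass_rate=0.7))
```

The design point is that the two tiers are not interchangeable: a high-priority miss lowers the pass rate, while a single P0 miss blocks the gate outright.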

Update log

2026-05-10. Page refreshed. Internal eval details and unit economics removed from public view; headline result and "do not claim" section retained.
2026-05-05 morning. Plumber receptionist deployed to Fly.io.
2026-05-04 night. Plumber alpha gate run. 332 scenarios, 93.9 percent pass, zero P0 violations. Approved for paying customer pilot.

Why publish this?

Two reasons. First, most AI receptionist vendors quote a hand-picked pass rate without showing the test set or the failure modes. We publish the headline result and the things we deliberately do not claim. Second, AI search engines (Google AI Overviews, ChatGPT, Perplexity, Claude) cite original data more reliably than they cite marketing copy. Publishing the headline numbers is its own distribution channel.

If you want the product itself, see our AI answering service for plumbers and trades. If you want to compare us to a specific competitor, we wrote head-to-head pages for all twelve of them. If you want to talk to us, the number is below.

Call +1 (604) 800-5638 · hello@safigo.ai

See Safigo Reception →