Proof & Results — DeadheadDesk

Desk telemetry · what the dashboard tracks

Median draft time

Inbound timestamp → priced draft sitting in the rep approval queue.

SAMPLE DATA

Items processed this week

Quote emails read + carrier check-ins handled. Same unit as your invoice.

SAMPLE DATA

Eval-suite accuracy (nightly)

Field-by-field grading against a corpus of real freight emails, every night.

SAMPLE DATA

How we measure this — the eval harness, in plain English

Median draft time. The clock starts when the email hits your quote inbox and stops when a priced draft lands in the approval queue. Rep approval time is reported separately — we don't hide the human gate inside the headline number.
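A minimal sketch of that stopwatch, assuming each item is logged as a pair of timestamps (inbound email, draft queued). The function name and the three sample rows are made up for illustration; they are not measured values and not the production code.

```python
from datetime import datetime
from statistics import median

def median_draft_seconds(events):
    """events: iterable of (inbound_at, queued_at) datetime pairs.

    Rep approval time is deliberately not part of the pair, so it
    cannot leak into the headline number.
    """
    durations = [(queued - inbound).total_seconds() for inbound, queued in events]
    if not durations:
        return None  # no traffic yet, nothing to report
    return median(durations)

# Illustrative timestamps only.
sample = [
    (datetime(2024, 1, 8, 9, 0, 0), datetime(2024, 1, 8, 9, 0, 40)),
    (datetime(2024, 1, 8, 9, 5, 0), datetime(2024, 1, 8, 9, 6, 30)),
    (datetime(2024, 1, 8, 22, 15, 0), datetime(2024, 1, 8, 22, 15, 55)),
]
print(median_draft_seconds(sample))  # 55.0
```

A median is used rather than a mean so that one slow draft does not drag the headline number.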

Items processed. Every quote email read and every carrier check-in handled counts as one item. It's the exact unit on the invoice: 500/month included, $1.50 each after. No creative accounting between the dashboard and the bill.
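The overage arithmetic can be sketched in a few lines. The constants come from the paragraph above; the function name is hypothetical, and any base subscription fee is left out because this page does not state one.

```python
INCLUDED_ITEMS = 500        # items included per month
OVERAGE_PER_ITEM = 1.50     # dollars per item beyond the included volume

def overage_charge(items_processed):
    """Dollars owed for items beyond the included monthly volume."""
    extra_items = max(0, items_processed - INCLUDED_ITEMS)
    return extra_items * OVERAGE_PER_ITEM

print(overage_charge(430))  # 0.0   (under the included volume)
print(overage_charge(620))  # 180.0 (120 extra items at $1.50)
```

The item count passed in is the same count the dashboard tile shows, which is the point of using one unit for both.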

Eval-suite accuracy. Every night, the agents re-run a corpus of real, permissioned, redacted freight emails with known correct answers — origin, destination, equipment, dates, price band — and get graded field by field. A scenario passes only if every load-critical field is right. If the pass rate sags, a drift alarm pages us. You find out from the weekly report, not from a wrong quote.
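Roughly, the grading rule looks like the sketch below. It assumes all five fields named above count as load-critical and that a field is right only on an exact match; the function names, the sample scenarios, and the alarm floor are hypothetical, not the harness itself.

```python
# Assumption: these five fields are the load-critical set.
LOAD_CRITICAL_FIELDS = ("origin", "destination", "equipment", "dates", "price_band")

def grade_scenario(expected, extracted):
    """Return (per-field results, overall pass) for one email scenario."""
    per_field = {
        field: extracted.get(field) == expected[field]
        for field in LOAD_CRITICAL_FIELDS
    }
    # A scenario passes only if every load-critical field is right.
    return per_field, all(per_field.values())

def pass_rate(scenarios):
    """scenarios: list of (expected, extracted) dict pairs."""
    if not scenarios:
        return None
    passed = sum(1 for exp, ext in scenarios if grade_scenario(exp, ext)[1])
    return passed / len(scenarios)

def drift_alarm(rate, floor):
    """True when the nightly pass rate sags below the chosen floor."""
    return rate is not None and rate < floor

# Illustrative corpus entry, not real data.
expected = {"origin": "Dallas, TX", "destination": "Atlanta, GA",
            "equipment": "dry van 53", "dates": "2024-01-10",
            "price_band": "1800-2100"}
wrong_equipment = dict(expected, equipment="reefer 53")

rate = pass_rate([(expected, expected), (expected, wrong_equipment)])
print(rate)                     # 0.5
print(drift_alarm(rate, 0.9))   # True
```

One wrong field fails the whole scenario, so the pass rate is stricter than a per-field average would be.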

THE TILES ABOVE SHOW SAMPLE VALUES FROM OUR TEST HARNESS — PRE-LAUNCH, THERE IS NO CUSTOMER TRAFFIC TO SHOW. PILOT CLIENTS SEE THEIR OWN TILES, LIVE, FROM WEEK ONE.

The industry response-time benchmark

How long does a quote request sit?

We mystery-shop brokerages with realistic spot-quote emails and time the replies. Before the full 50-brokerage study publishes, here's the shape of the early data — and a question for you.

MYSTERY-SHOP · REPLY-LATENCY DISTRIBUTION

ILLUSTRATIVE, PENDING OUR PUBLISHED 50-BROKERAGE STUDY

Step one: guess where you land

On a typical weekday quote request, how fast does your desk send a priced reply?

METHOD: IDENTICAL DRY-VAN QUOTE REQUESTS SENT TO BROKERAGE QUOTE INBOXES DURING AND AFTER BUSINESS HOURS. BUCKET = TIME TO FIRST PRICED REPLY. "NEVER" = NO REPLY WITHIN 5 BUSINESS DAYS. EARLY-SAMPLE FIGURES SHOWN; THE FULL STUDY (N=50, METHODOLOGY AND ANONYMIZED RAW BUCKETS) PUBLISHES ON THIS PAGE.
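A sketch of how one reply could be bucketed under that method. The bucket edges below are hypothetical, since this page does not list the real ones. The business-day count also makes two assumptions: holidays are ignored, and "within 5 business days" means the reply lands no later than the fifth weekday after the send date.

```python
from datetime import date, datetime, timedelta

# Hypothetical bucket edges in minutes (upper bound, label).
BUCKET_EDGES = [
    (15, "under 15 min"),
    (60, "15-60 min"),
    (240, "1-4 hr"),
    (1440, "4-24 hr"),
]

def business_days_elapsed(start: date, end: date) -> int:
    """Count weekdays after `start` up to and including `end`."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days

def bucket_reply(sent_at, first_priced_reply_at):
    """Bucket = time from send to first priced reply, or "never"."""
    if first_priced_reply_at is None:
        return "never"
    if business_days_elapsed(sent_at.date(), first_priced_reply_at.date()) > 5:
        return "never"
    minutes = (first_priced_reply_at - sent_at).total_seconds() / 60
    for limit, label in BUCKET_EDGES:
        if minutes < limit:
            return label
    return "over 24 hr"

sent = datetime(2024, 1, 8, 10, 0)  # a Monday
print(bucket_reply(sent, datetime(2024, 1, 8, 10, 10)))   # under 15 min
print(bucket_reply(sent, datetime(2024, 1, 16, 10, 0)))   # never
```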

Case files · three slots, zero fabrications

The before/afters we will publish.

We could mock up glowing case studies tonight. Instead, here are the actual one-pagers: blank, waiting on three design partners. The metric rows are already locked. Each partner fills them in at pilot day 30, wins and misses both, and nothing publishes without their sign-off.

CASE FILE 01 · BEFORE / AFTER

Design-partner slot — reserved

Slot 01 · 5–12 seats, quote desk drowning after 17:00

Median response latency	→
Loads covered / mo
After-hours capture rate	→
FTE-hours redeployed / wk

PUBLISHES AT PILOT DAY 30 · SAME LANES, SAME STOPWATCH · MISSES INCLUDED

Claim this slot →

CASE FILE 02 · BEFORE / AFTER

Design-partner slot — reserved

Slot 02 · 10–20 seats, track-&-trace eating two FTEs

Median response latency	→
Loads covered / mo
After-hours capture rate	→
FTE-hours redeployed / wk

PUBLISHES AT PILOT DAY 30 · SAME LANES, SAME STOPWATCH · MISSES INCLUDED

Claim this slot →

CASE FILE 03 · BEFORE / AFTER

Design-partner slot — reserved

Slot 03 · 20–30 seats, night coverage without a night shift

Median response latency	→
Loads covered / mo
After-hours capture rate	→
FTE-hours redeployed / wk

PUBLISHES AT PILOT DAY 30 · SAME LANES, SAME STOPWATCH · MISSES INCLUDED

Claim this slot →

DESIGN PARTNERS GET PRIORITY INTEGRATION AND A SAY IN THE ROADMAP. IN EXCHANGE, THEIR NUMBERS GO ON THIS PAGE — GOOD OR BAD. THAT'S THE DEAL.

The honest-limitations log

Where it breaks. And how we catch it.

Every agent fails somewhere. The product isn't "it never fails" — the product is failure that gets caught, measured, and fixed before it costs you a load. These are real failure classes from our test corpus, with the monitoring response for each.

FAILURE REGISTER · DD-05/F · MONITORED CONTINUOUSLY

Failure class	What it looks like	How the harness catches it
Sender format drift	A regular shipper updates their TMS; their quote emails change layout and extraction confidence sags.	Per-sender confidence tracking + nightly regression. Drift alarm inside 24 hours; corpus updated; suite re-run green before the next shift.
Ambiguous equipment	"Need a van" — dry or reefer? 53' or sprinter?	Never guessed. The draft asks the confirming question in plain text, and the ambiguity is flagged to the approving rep.
Stale or outlier market rate	The rate feed hiccups and returns a number 40% off your lane history.	Sanity band against your own TMS lane history. Outside the band, the draft is held and flagged — it does not price.
Garbage carrier replies	A driver answers a check-in with a photo of a BOL, a voicemail transcript, or just "yes."	Parse-confidence threshold. Below it, the raw message routes to a human with the full load context attached. No status gets written on a guess.
TMS / API breakage	An auth token expires at 2 AM and writes silently stop.	Heartbeats on every connector. Silent-failure alerting pages us, not you — and any gap shows up in the weekly report either way.
Model behavior shift	An upstream model upgrade quietly changes how drafts read or price.	Model versions are pinned. Candidates run in shadow against the full eval suite before any swap; the diff is reviewed by a human first.
Double-brokering red flag	A carrier's MC, callback number, and dispatch email don't line up.	Hard escalation, always. The agent never green-lights a suspect carrier — it packages the evidence and hands it to your rep.

THIS LOG GROWS. EVERY NEW FAILURE CLASS FOUND IN PRODUCTION GETS A REGRESSION TEST IN THE NIGHTLY SUITE AND A ROW ON THIS PAGE.
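Two of the guards in the register, the rate sanity band and the parse-confidence threshold, can be sketched as plain checks. The function names, the median as the band's center, the hold-by-default rule for lanes with no history, and every number in the example are assumptions for illustration.

```python
from statistics import median

def rate_decision(feed_rate, lane_history, tolerance):
    """Return "price" or "hold" for a market rate from the feed.

    lane_history: past rates on this lane from the client's own TMS.
    tolerance: fractional band width, e.g. 0.25 for plus or minus 25%.
    """
    if not lane_history:
        return "hold"  # assumption: no history means nothing to check against
    center = median(lane_history)
    low, high = center * (1 - tolerance), center * (1 + tolerance)
    # Outside the band the draft is held and flagged; it does not price.
    return "price" if low <= feed_rate <= high else "hold"

def route_carrier_reply(parse_confidence, threshold):
    """Write a status only when the parse clears the threshold."""
    if parse_confidence >= threshold:
        return "write_status"
    return "route_to_human"  # raw message goes to a rep with load context

history = [2000, 2100, 1900]                 # illustrative lane rates, dollars
print(rate_decision(1200, history, 0.25))    # hold  (40% below lane median)
print(rate_decision(2050, history, 0.25))    # price
print(route_carrier_reply(0.42, 0.80))       # route_to_human
```

Both checks fail closed: when the input is doubtful, the work goes to a human instead of going out on a guess.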

The Monday 06:00 artifact

One email a week. Denominated in dollars.

Pilot clients get this every Monday before coffee. Here's the condensed template, filled with sample data so you can see exactly what gets counted — including the week something broke. The full walkthrough lives on the How It Works page.

WEEKLY DESK REPORT · WK 24 · SAMPLE DATA

Line item	Count	Δ vs prior wk
Quote requests read
Drafts held for rep approval
Approved unchanged
Approved with edits
Rejected by reps
After-hours drafts (18:00–08:00)
Carrier check-ins completed
Exceptions escalated to humans
Nightly eval suite
Margin on approved quotes that booked

ALL FIGURES SAMPLE DATA FROM THE TEST HARNESS. "MARGIN" = QUOTE DRAFTED BY THE DESK → APPROVED BY A REP → SHIPPER BOOKED. EVERY LINE IS A COUNT OR A DOLLAR — NO VIBES.

Where the numbers will come from

Every claim gets a before and an after.

01 · Baseline · day 0

Measure the desk you have

Before anything switches on: your current reply latency, pulled from your own sent-mail timestamps plus an outside mystery shop. That's your "before," on paper. You keep it even if you walk.

02 · Shadow grading · wks 2–3

Grade against your reps

The agent drafts silently on your real inbox while reps work as usual. Every draft is graded against what the rep actually sent — extraction, price, tone. Misses become regression tests.

03 · Before/after · day 30

Read the delta, decide

Same inbox, same lanes, same stopwatch. The pilot's go/no-go reads off this delta — and design-partner deltas publish on this page with sign-off, wins and misses both.

NOTHING ON THIS SITE BECOMES A MARKETING CLAIM UNTIL IT'S A MEASURED DELTA WITH A DATE ON IT.

Measured or it didn't happen

Your "before" number is free.

The response-time audit mystery-shops your own quote inbox and plots your desk on the distribution above. You keep the number either way — it's the "before" column of your case file, whether or not we ever fill in the "after."

Get your free response-time audit

Start the 30-day pilot: $2,500

PILOT FEE CREDITS AGAINST THE $7,500 SETUP. EVERY SEND GATED BY YOUR REPS FROM DAY ONE.