Most "AI for brokers" comes in two flavors, and both fail the same way: nobody is watching after the demo. DeadheadDesk is the third option — a narrow service that is tested nightly, maintained forever, and gated by your reps on every send.
Option A · software you babysit
You buy a seat-licensed AI tool. It prices clean emails fine — and quietly mangles the messy ones. In published industry surveys, 61% of teams running off-the-shelf AI tools report accuracy complaints. The vendor's answer is a knowledge-base article.
Option B · the agency build
An automation shop wires up a workflow and disappears. Six months later a TMS API version bumps and quotes silently stop logging. Roughly 74% of companies report rolling back or stalling at least one AI initiative — most after the builder was gone.
FIGURES: PUBLISHED INDUSTRY RESEARCH ON ENTERPRISE AI ADOPTION, 2024–25. AGGREGATED AND DIRECTIONAL — NOT OUR DATA. THAT'S THE POINT: NOBODY WAS WATCHING.
A managed desk: two narrow agents, a nightly eval harness, a human approval gate on everything, and a named person on the hook when it breaks. The rest of this page is the contract, in plain English.
The pilot is $2,500, credited in full against the $7,500 setup if you continue. Each week ends with a deliverable you can hold in your hand.
We pick a single quotes inbox and a single corridor with real volume. Then we sit with your best rep and write down how they actually price — floors, targets, fuel posture, customers who get the friendly number. If a rule lives only in someone's head, this is the week it gets written down.
| Item | Detail |
|---|---|
| Scope locked | 1 INBOX · 1 LANE |
| Margin-rule interviews | 2–3 SESSIONS |
| Historical email pull | 90 DAYS, SANITIZED |
| TMS access | READ-ONLY TO START |
| Baseline measured | YOUR CURRENT REPLY TIME |
Deliverable · end of week 1
CONTENTS ILLUSTRATIVE — YOUR MAP USES YOUR RULES.
The plumbing week. Inbound email starts flowing to the agent through webhooks. The TMS connector (Aljex, Tai, McLeod, Turvo, or Ascend) gets wired and tested against sandbox loads. DAT or Greenscreens rate pulls come online. Your historical emails are seeded into the eval corpus — the test set everything is graded against, forever.
| Item | Detail |
|---|---|
| Email webhooks | LIVE, LISTEN-ONLY |
| TMS connector | ALJEX / TAI / McLEOD / TURVO / ASCEND |
| Rate feed | DAT · GREENSCREENS |
| Eval corpus seeded | FROM YOUR REAL EMAILS |
| Customer contact | NONE. ZERO. |
Deliverable · end of week 2
CHECKLIST FORMAT — ACTUAL ITEMS VARY BY TMS.
Every real quote request that hits the inbox gets a silent draft. Your reps keep working exactly as before — and grade the drafts: would have sent, would have edited, dead wrong. Every correction goes back into tuning. The nightly regression suite starts running against the corpus this week and never stops.
| Item | Detail |
|---|---|
| Drafts produced | ON REAL INBOUND, SILENT |
| Rep grading | ~2 MIN/DAY PER REP |
| Tuning loop | CORRECTIONS → RULES → RE-TEST |
| Nightly regression | STARTS · RUNS FOREVER |
Deliverable · end of week 3
ILLUSTRATIVE SCORECARD, SAMPLE DATA. YOURS SHOWS YOUR GRADES.
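The shadow-mode grades reduce to a simple tally. A minimal sketch, assuming each draft gets exactly one of the three grades named above; the label strings and the `scorecard` function are hypothetical, not the product's schema.

```python
from collections import Counter

# Hypothetical label names for the three grades reps give in shadow mode.
GRADES = ("would_send", "would_edit", "dead_wrong")

def scorecard(grades):
    """Return the share of drafts that received each grade."""
    unknown = set(grades) - set(GRADES)
    if unknown:
        raise ValueError(f"unknown grades: {sorted(unknown)}")
    counts = Counter(grades)
    total = len(grades)
    # An empty week reports 0.0 for every grade instead of dividing by zero.
    return {g: (counts[g] / total if total else 0.0) for g in GRADES}

# Illustrative input only: 100 graded drafts, split chosen for the example.
week3 = ["would_send"] * 79 + ["would_edit"] * 18 + ["dead_wrong"] * 3
print(scorecard(week3))
```

The `would_send` share is the number that later shows up as the approved-without-edit rate once drafts move into the approval queue.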
Drafts now land in an approval queue. A rep clicks approve, edit, or reject; nothing reaches a customer without that click. At day 30 we sit down with the before/after numbers against the Week 1 baseline and make a go/no-go call together. If it's no-go, you walk — the pilot ends at $2,500 and no setup fee is billed.
| Item | Detail |
|---|---|
| Sending | REP-APPROVED, EVERY TIME |
| TMS logging | EVERY QUOTE, AUTOMATIC |
| Go / no-go | METRICS VS. BASELINE, NOT FEELINGS |
| If go | $2,500 CREDITS TO $7,500 SETUP |
Deliverable · day 30
THE TEMPLATE IS THE PROMISE: HARD NUMBERS DECIDE.
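The approval gate in week 4 can be pictured as a function with exactly three outcomes. This is a sketch under assumptions, not the production queue: `Draft`, `Decision`, and `resolve` are hypothetical names, and the real system also logs to the TMS, which is left out here.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    APPROVE = "approve"
    EDIT = "edit"
    REJECT = "reject"

@dataclass(frozen=True)
class Draft:
    quote_id: str
    body: str

def resolve(draft, decision, edited_body=None):
    """Return the text cleared for sending, or None when nothing goes out."""
    if decision is Decision.APPROVE:
        return draft.body
    if decision is Decision.EDIT:
        if not edited_body:
            raise ValueError("an edit decision needs the rep's edited text")
        return edited_body
    # REJECT (or anything else): nothing reaches the customer.
    return None

draft = Draft("Q-1", "Rate for the lane: $1,850 all-in.")
print(resolve(draft, Decision.APPROVE))
print(resolve(draft, Decision.REJECT))
```

There is no code path that sends without a `Decision` supplied by a rep, which is the property the gate is meant to guarantee.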
This is the part nobody else sells, and the part that matters most. A regression suite built from real freight emails runs nightly. Drift alarms watch confidence trends. If extraction quality slips, we get paged — you get a line item in the weekly report, not a surprise in your P&L.
| Suite | Cases | Pass | Status |
|---|---|---|---|
| Quote extraction | 412 | 99.0% | PASS |
| Margin-rule compliance | 268 | 100% | PASS · HARD GATE |
| Carrier reply parsing | 390 | 97.7% | PASS |
| Escalation triggers | 145 | 99.3% | PASS |
| Double-broker red flags | 61 | 100% | PASS · HARD GATE |
| New: TriHaul dispatch format | 12 | 83.3% | ⚑ PATCHING |
MARGIN-RULE AND FRAUD SUITES ARE HARD GATES: ANYTHING UNDER 100% BLOCKS THE AGENT ON AFFECTED PATTERNS UNTIL FIXED.
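The hard-gate rule can be sketched in a few lines. Everything here is illustrative: `SuiteResult` and `status` are hypothetical names, the 0.95 soft threshold is an assumption (the page states only the 100% rule for hard gates), and the pass counts are back-computed from the sample percentages in the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SuiteResult:
    name: str
    cases: int
    passed: int
    hard_gate: bool = False
    threshold: float = 0.95  # assumed soft threshold, not stated on the page

def status(result):
    """Classify one nightly suite run."""
    if result.cases <= 0:
        raise ValueError("a suite needs at least one case")
    if result.hard_gate:
        # Hard gates block on anything under 100%.
        return "PASS" if result.passed == result.cases else "BLOCK"
    rate = result.passed / result.cases
    return "PASS" if rate >= result.threshold else "PATCHING"

results = [
    SuiteResult("Quote extraction", 412, 408),
    SuiteResult("Margin-rule compliance", 268, 268, hard_gate=True),
    SuiteResult("New: TriHaul dispatch format", 12, 10),
]
for r in results:
    print(r.name, status(r))
```

A single failing margin-rule case would flip that suite to `BLOCK`, regardless of how high the overall pass rate is.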
MEAN EXTRACTION CONFIDENCE ON LIVE TRAFFIC. A DIP IS A TICKET, NOT A MYSTERY. ALL VALUES SAMPLE DATA.
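One simple way to turn a confidence trend into a ticket is to compare the latest window against the one before it. This is a sketch of that idea only; the 7-day window, the 0.03 drop limit, and the `confidence_dip` function are assumptions chosen for illustration, not the alarm actually deployed.

```python
from statistics import mean

def confidence_dip(history, window=7, max_drop=0.03):
    """True when the latest window's mean confidence falls more than
    max_drop below the mean of the window immediately before it."""
    if len(history) < 2 * window:
        return False  # not enough data to compare two full windows
    prior = mean(history[-2 * window:-window])
    recent = mean(history[-window:])
    return prior - recent > max_drop

steady = [0.95] * 14
slipping = [0.95] * 7 + [0.90] * 7
print(confidence_dip(steady), confidence_dip(slipping))
```

Comparing window means rather than single days keeps one noisy night from opening a ticket.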
Day one, your reps approve everything. Months later — if the numbers hold and you say so — routine lanes can earn auto-send. You control the ladder, rung by rung, and you can step back down any time.
| Rung | What the agent does | When |
|---|---|---|
| 0 | Approve everything. Every draft held for a rep's click. This is day one and stays default. | DAY 1 |
| 1 | One-click flow. Same gate, faster queue. The agent learns from every edit your reps make. | WEEKS IN |
| 2 | Earned auto-send on specific routine lanes you designate — only after sustained accuracy on those lanes, only with your signature, revocable instantly. | MONTHS IN |
| × | Full autonomy across the desk. Not a rung. We don't sell it. | NEVER |
| Action | Why |
|---|---|
| Sending quotes on day one | Trust is measured first, granted second. |
| Pricing for a new shipper | No history, no pattern, no auto-send. A rep owns the first impression. |
| Multi-stop pricing | Stop-off fees and liability need human judgment. The agent escalates with full context instead. |
| Carrier onboarding | Vetting a carrier is a fraud surface. Humans only. |
| Anything fraud-adjacent | Double-brokering signals, identity mismatches, changed remit-to details — these page a person, always. |
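Taken together, the ladder and the exclusions above amount to a short eligibility check. A minimal sketch, assuming a request carries these four fields; `QuoteRequest`, `may_auto_send`, and the lane strings are hypothetical, and carrier onboarding is omitted because it is not a quote request at all.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QuoteRequest:
    lane: str
    new_shipper: bool = False
    multi_stop: bool = False
    fraud_flag: bool = False

def may_auto_send(request, rung, signed_lanes):
    """True only for routine quotes on lanes the customer has signed at Rung 2."""
    # Exclusions win over everything else: these always go to a person.
    if request.fraud_flag or request.new_shipper or request.multi_stop:
        return False
    return rung == 2 and request.lane in signed_lanes

signed = {"CHI-ATL"}  # illustrative lane name
print(may_auto_send(QuoteRequest("CHI-ATL"), rung=2, signed_lanes=signed))
print(may_auto_send(QuoteRequest("CHI-ATL", multi_stop=True), rung=2, signed_lanes=signed))
```

Revoking auto-send is then just removing a lane from `signed_lanes` or dropping `rung` back to 0 or 1.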
Every Monday you get this document. Not dashboards, not vibes: what the desk did last week, in loads and dollars. The two sample weeks below show how it reads as trust is earned.
Summary · week of (sample)
| Metric | This week | Baseline (week 1) |
|---|---|---|
| Quote requests received | 186 | — |
| Quotes answered (rep-approved) | 171 | ~140 TYPICAL |
| Escalated to reps (correctly out of scope) | 15 | — |
| Median response time (draft → approved → sent) | 11 MIN | 4 HR 05 MIN |
| After-hours requests answered from the next-morning queue | 23 | MOSTLY UNANSWERED |
| Loads under track & trace | 94 | — |
| Check-ins completed without a phone call | 86 (91%) | 0 |
| Exceptions escalated | 8 | — |
| Approved-without-edit rate | 81% ↑ | 79% (WK 3 SHADOW) |
Dollars · illustrative model
| Item | Value |
|---|---|
| After-hours quotes that converted to covered loads | 6 |
| Est. margin on those loads (your avg $250/load) | $1,500 |
| Rep hours not spent on check calls (~12 min ea × 86) | ~17 HRS |
ILLUSTRATIVE MODEL — MARGIN ATTRIBUTION USES YOUR TMS NUMBERS, FLAGGED WHERE CAUSALITY IS SOFT.
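The arithmetic behind the illustrative dollars table is two multiplications. The function names below are hypothetical; the inputs are the sample figures from the table above.

```python
def after_hours_margin(converted_loads, avg_margin_per_load):
    """Estimated margin on after-hours quotes that became covered loads."""
    return converted_loads * avg_margin_per_load

def hours_saved(check_ins, minutes_each):
    """Rep hours not spent on check calls."""
    return check_ins * minutes_each / 60

print(after_hours_margin(6, 250))      # 6 loads at $250 average margin
print(round(hours_saved(86, 12), 1))   # 86 check-ins at ~12 minutes each
```

With the sample inputs this gives $1,500 and about 17.2 hours, which the report rounds to ~17 HRS.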
Exceptions caught this week
Billing · included items
| Item | Value |
|---|---|
| Items processed (quotes + check-ins) | 280 / WK · ~1,120 MO PACE |
| Plan includes | 500 ITEMS / MO |
| Projected overage @ $1.50/item | ~620 × $1.50 = $930 |
OVERAGE IS SHOWN WEEKLY SO THE INVOICE IS NEVER A SURPRISE.
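The overage projection can be reproduced with a few lines. A sketch, assuming the monthly pace is simply four times the weekly count (which is what 280 per week and ~1,120 per month imply); `projected_overage` is a hypothetical name.

```python
def projected_overage(items_per_week, included=500, rate=1.50, weeks_per_month=4):
    """Return (monthly pace, items over the plan, projected overage in dollars)."""
    monthly = items_per_week * weeks_per_month
    extra = max(0, monthly - included)
    return monthly, extra, extra * rate

print(projected_overage(280))
print(projected_overage(100))  # under the included 500: no overage
```

For 280 items per week this yields a pace of 1,120, an overage of 620 items, and $930. The report rounds the monthly pace before multiplying, so for other weeks the exact result can differ slightly from the printed figure.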
Maintenance notes
Nightly regression: 7/7 green nights. One parser patch (carrier reply format) deployed Thursday, re-tested same day. No action needed from you.
STATUS: RUNG 0 — EVERY SEND REP-APPROVED. GO/NO-GO MEETING SCHEDULED DAY 30.
Summary · week of (sample)
| Metric | This week | Trend |
|---|---|---|
| Quote requests received | 214 | ↑ |
| Quotes answered | 203 | ↑ |
| · rep-approved sends | 141 | — |
| · auto-sent on your 3 designated routine lanes (Rung 2) | 62 | YOU SIGNED THESE LANES |
| Median response time (auto-sent lanes) | 96 SEC | ↓ |
| Median response time (gated lanes) | 9 MIN | ↓ |
| After-hours requests answered | 31 (14 AUTO ON SIGNED LANES) | ↑ |
| Loads under track & trace | 128 | ↑ |
| Check-ins completed without a call | 119 (93%) | ↑ |
| Exceptions escalated | 9 | — |
| Approved-without-edit rate (gated lanes) | 91% | ↑ |
| Auto-send accuracy audit (weekly spot-check, 20 sampled) | 20/20 WITHIN RULES | HARD GATE |
Dollars · illustrative model
| Item | Value |
|---|---|
| After-hours quotes converted to covered loads | 11 |
| Est. margin on those loads | $2,750 |
| Rep hours returned (check calls + drafting) | ~26 HRS |
ILLUSTRATIVE MODEL — SAME CAVEATS, EVERY WEEK, IN WRITING.
Exceptions caught this week
Billing · included items
| Item | Value |
|---|---|
| Items processed | 331 / WK · ~1,320 MO PACE |
| Plan includes | 500 ITEMS / MO |
| Projected overage @ $1.50/item | ~820 × $1.50 = $1,230 |
Maintenance notes
Rate feed schema change (vendor-side) absorbed Tuesday — caught by the 2 AM suite, patched before your morning. Rung-2 lane audit: all clear. Your standing right to revoke auto-send is one email away.
STATUS: RUNG 2 ON 3 LANES · RUNG 0/1 EVERYWHERE ELSE · FRAUD PATHS ALWAYS HUMAN.
The $2,000/month isn't a license fee — it's a maintenance crew. Email formats change, TMS vendors ship breaking API versions, models get upgraded. All of it is our problem, on our clock, covered by the retainer. Here's exactly what that means.
| When this happens | What we do | Target |
|---|---|---|
| A carrier changes their email or SMS reply format | Nightly regression flags it; affected pattern routes to humans (no silent guessing); parser patched and re-tested | NEXT BUSINESS DAY |
| Your TMS ships a breaking API version | Connector updated on our side; sandbox-verified before it touches live loads | PRIORITY · ON US |
| DAT / Greenscreens changes a feed schema | Absorbed in the integration layer; you never see it except as a line in the weekly report | SAME WEEK |
| The underlying model gets upgraded | Full regression re-baseline against your corpus before any swap; rollback if scores dip | NO SWAP WITHOUT GREEN SUITE |
| You change your margin rules | Rules updated, re-tested against history, deployed — with a diff in the weekly report | 2 BUSINESS DAYS |
| Processed-item volume drops abnormally (the classic silent failure) | Volume-anomaly alarm pages us; we call you — not the other way around | ALARM < 1 HR |
| Anything breaks at 2 AM | Monitoring pages the founder. Your reps find out from us, with a fix timeline, in the morning | PAGED, NOT POSTED |
ALL OF THE ABOVE IS INSIDE $2,000/MO (INCL. 500 PROCESSED ITEMS; $1.50/ITEM AFTER). MAINTENANCE IS NOT AN UPSELL. MAINTENANCE IS THE PRODUCT.
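The volume-anomaly alarm in the table above guards against the failure where everything looks fine because nothing is being processed. One way to sketch it, assuming hourly item counts and a trailing-median baseline; the `volume_drop` function and the 0.5 ratio are assumptions for illustration, not the deployed alarm.

```python
from statistics import median

def volume_drop(recent_hourly_counts, current_count, min_ratio=0.5):
    """True when the current hour's volume is below min_ratio of the
    trailing median, which suggests items have silently stopped flowing."""
    if not recent_hourly_counts:
        return False  # no baseline yet
    baseline = median(recent_hourly_counts)
    return baseline > 0 and current_count < min_ratio * baseline

trailing = [10, 12, 11, 9, 10]
print(volume_drop(trailing, 0))   # inbox went quiet
print(volume_drop(trailing, 9))   # normal variation
```

A median baseline is used here so that one unusually busy hour does not raise the bar for what counts as normal.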
A narrow service stays reliable by staying narrow. These are the things we turn down — including the ones that would be easy money.
Aljex, Tai, McLeod, Turvo, Ascend. That's the list. A connector we can't regression-test nightly is a silent-failure factory, so we don't wing it — even for a signed check.
Some prospects ask for autonomous sending on day one. The answer is no, even when it costs us the deal. The trust ladder exists because your customers can't tell our mistake from yours.
Phone calls are harder to parse, harder to test, and harder to audit than email and SMS. We don't ship what we can't put through the nightly suite.
No "and also insurance, and also recruiting." Freight brokerage email, two agents, done well. Depth is the moat; sprawl is how option B happens.
SAYING NO IS CHEAPER THAN CHURN — FOR BOTH OF US.
Thirty seconds, three questions, an honest answer. We qualify hard because a bad-fit pilot wastes your $2,500 and our month.
$2,500 pilot, credited to setup if you continue. If the before/after sheet doesn't earn the next month, walk — we'll hand you the data on the way out.
NO AUTONOMOUS SENDING ON DAY ONE. NO EXCEPTIONS — SEE THE "NO" LIST ABOVE.