What show rates did AI-booked meetings achieve compared to human-booked meetings?

In the 90-day test, AI-booked meetings had show rates of 40-60%, while human-booked meetings ran 70-85%. That gap inverted the cost-per-meeting math once someone divided total cost by actual shows at the end of the quarter.
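
The division described above can be sketched in a few lines of Python. This is a minimal illustration, not the founder's spreadsheet: the function name, the spend figures, and the booking counts are hypothetical, and only the show rates (the low end of the AI range and the high end of the human range) come from the episode.

```python
def cost_per_held_meeting(total_cost: float, booked: int, show_rate: float) -> float:
    """Divide spend by meetings that actually happened, not by meetings booked."""
    if not 0.0 < show_rate <= 1.0:
        raise ValueError("show_rate must be in (0, 1]")
    if booked <= 0:
        raise ValueError("booked must be positive")
    held = booked * show_rate
    return total_cost / held


# Hypothetical spend and volume; only the show rates are taken from the episode.
ai = cost_per_held_meeting(total_cost=6_000, booked=60, show_rate=0.40)
human = cost_per_held_meeting(total_cost=10_000, booked=15, show_rate=0.85)

print(f"AI:    ${6_000 / 60:,.0f} per booked, ${ai:,.0f} per held")
print(f"Human: ${10_000 / 15:,.0f} per booked, ${human:,.0f} per held")
```

With these made-up inputs the AI still looks cheaper, but its advantage shrinks from roughly 6.7x per booked meeting to roughly 3.1x per held meeting, which is the direction of the effect the episode describes.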

How does the SQL conversion rate differ between AI and human SDRs?

According to the Apollo data cited in the episode, AI-booked meetings converted to qualified opportunities at 15-20%, compared to around 25% for human-booked meetings. Combined with the lower show rate, this compounded the cost disadvantage of the AI outbound stack.
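
To see how the two rates compound, here is a hedged sketch that multiplies show rate by SQL conversion rate. It assumes the conversion rate applies to held meetings (the episode says "fewer show, and fewer of those convert"); the function name is hypothetical, and the rates are the ranges quoted in the episode.

```python
def qualified_per_100_booked(show_rate: float, sql_rate: float) -> float:
    """Qualified opportunities produced by 100 booked meetings."""
    return 100 * show_rate * sql_rate


# (show_rate, sql_rate) at the low and high end of each quoted range.
ranges = {
    "AI": ((0.40, 0.15), (0.60, 0.20)),
    "Human": ((0.70, 0.25), (0.85, 0.25)),
}

for label, (low, high) in ranges.items():
    low_yield = qualified_per_100_booked(*low)
    high_yield = qualified_per_100_booked(*high)
    print(f"{label}: {low_yield:.2f} to {high_yield:.2f} qualified per 100 booked")
```

On those ranges, 100 AI-booked meetings yield about 6 to 12 qualified opportunities, against about 17.5 to 21.25 for 100 human-booked meetings.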

What is the 30-minute handoff rule for a hybrid AI and human SDR model?

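The fix was a strict SLA: the AI owns touches one through five and acknowledges any inbound reply within 60 seconds, and any reply containing a question mark, the word "pricing," or another signal of real intent is routed to a human within 30 minutes. After about six weeks of running the rule consistently, show rates on hybrid-booked meetings recovered to the mid-seventies. SDR spend fell from roughly $260,000 a year for two fully independent human SDRs to one oversight SDR at $60,000 to $80,000 fully loaded, plus $900 to $1,500 a month for the AI platform.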
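
A minimal sketch of how such a routing rule might be encoded follows. Everything here is an assumption for illustration: the function names, the regular expression, and the choice to check only for a question mark or the word "pricing". The episode's third trigger, "anything that signals real intent," is not modeled.

```python
import re
from datetime import datetime, timedelta

# The 30-minute window is from the episode; the pattern is a hypothetical encoding.
HANDOFF_SLA = timedelta(minutes=30)
INTENT_PATTERN = re.compile(r"\?|\bpricing\b", re.IGNORECASE)


def needs_human(reply_text: str) -> bool:
    """True if the reply contains a question mark or the word 'pricing'."""
    return bool(INTENT_PATTERN.search(reply_text))


def handoff_deadline(received_at: datetime) -> datetime:
    """Latest time a human should be in the thread."""
    return received_at + HANDOFF_SLA


def sla_met(received_at: datetime, human_replied_at: datetime) -> bool:
    """True if the human reply landed inside the 30-minute window."""
    return human_replied_at - received_at <= HANDOFF_SLA


received = datetime(2024, 1, 8, 9, 0)
for text in ["Can you send pricing?", "Not interested, thanks."]:
    route = "human" if needs_human(text) else "AI sequence"
    print(f"{text!r} -> {route}")
print("Human due by:", handoff_deadline(received))
```

Tracking `sla_met` per reply would give a weekly handoff compliance number, which matters because the hosts describe the system as only as good as its worst handoff week.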

What are the three conditions a company should meet before deploying a hybrid AI SDR setup?

ACV must be under $20K, the ICP must be specific enough to write five distinct campaigns for distinct segments (not one campaign blasted at everyone), and someone must be able to commit at least 10 hours per week to managing the system.
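
The three conditions can be expressed as a simple checklist. This is a sketch under stated assumptions: the class and field names are hypothetical, and the thresholds ($20,000 ACV, five distinct campaigns, ten hours a week) are the ones quoted in the episode.

```python
from dataclasses import dataclass


@dataclass
class OutboundProfile:
    acv_usd: float                 # average contract value
    distinct_campaigns: int        # campaigns written for distinct segments
    manager_hours_per_week: float  # time someone can spend managing the system


def unmet_conditions(profile: OutboundProfile) -> list[str]:
    """Return the episode's conditions that the profile does not yet meet."""
    unmet = []
    if profile.acv_usd >= 20_000:
        unmet.append("ACV is not under $20K")
    if profile.distinct_campaigns < 5:
        unmet.append("ICP does not support five distinct campaigns")
    if profile.manager_hours_per_week < 10:
        unmet.append("no one has 10+ hours a week to manage the system")
    return unmet


candidate = OutboundProfile(acv_usd=12_000, distinct_campaigns=2, manager_hours_per_week=10)
print(unmet_conditions(candidate) or "All three conditions met")
```

An empty list means all three conditions hold; anything else names what to fix before spending on hybrid infrastructure.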

What single metric should founders track weekly to know if their AI SDR system is healthy?

Revenue per held meeting, not meetings booked, is the weekly dashboard number that signals whether the hybrid system is working. If it drops while volume climbs, the system is quietly degrading.
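
One way to watch that number week over week is sketched below. The data structure, field names, and sample figures are hypothetical, and how revenue is attributed to a given week's meetings is an assumption the episode does not specify.

```python
from dataclasses import dataclass


@dataclass
class WeekStats:
    booked: int      # meetings booked that week
    held: int        # meetings that actually took place
    revenue: float   # revenue attributed to those held meetings (assumed attribution)


def revenue_per_held(week: WeekStats) -> float:
    """Revenue per held meeting; 0.0 when nothing was held."""
    return week.revenue / week.held if week.held else 0.0


def quietly_degrading(previous: WeekStats, current: WeekStats) -> bool:
    """Flag the pattern from the episode: volume climbing while revenue per held meeting drops."""
    volume_up = current.booked > previous.booked
    quality_down = revenue_per_held(current) < revenue_per_held(previous)
    return volume_up and quality_down


last_week = WeekStats(booked=40, held=24, revenue=48_000)
this_week = WeekStats(booked=55, held=26, revenue=39_000)
print(f"Last week: ${revenue_per_held(last_week):,.0f} per held meeting")
print(f"This week: ${revenue_per_held(this_week):,.0f} per held meeting")
print("Quietly degrading:", quietly_degrading(last_week, this_week))
```

Comparing adjacent weeks is the simplest possible check; a rolling average would be less noisy, but that is a design choice beyond what the episode covers.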

Should founders under $500K ARR use an autonomous AI outbound stack?

No. The episode advises founders below $500K ARR to start with a strong human SDR augmented by AI on research and sequencing rather than building a fully autonomous outbound stack. At that stage the ICP and messaging are usually not proven, so the AI ends up automating noise, and a full autonomous stack adds deliverability risk to the sending domain.

AI SDR vs. Human SDR: One Founder's 90-Day Controlled Test — ARR Autopsy

Derek Simmons: Forty to sixty percent. That's what your AI SDR's show rate looks like. Your human reps, seventy to eighty-five. Nobody ran the math until end of quarter.

Elena Reyes: Mm. Wait, wait, wait. The whole cost argument flips once you divide by show rate.

Derek Simmons: Completely flips. And that is exactly where today's episode starts.

Elena Reyes: Welcome to ARR Autopsy. I'm Elena Reyes.

Derek Simmons: Derek Simmons. We dissect the revenue moves that worked, the ones that blew up, and the ones that looked great until someone finally checked the actual math.

Elena Reyes: Today's guest hit $1.4M ARR with two human SDRs, pipeline that looked fine on paper, and a nagging feeling something was wrong. So they ran a real 90-day AI versus human test on identical lists.

Derek Simmons: Controlled experiment. Same lists. That detail matters.

Elena Reyes: Right. And the scoreboard is a split verdict. AI won on volume and follow-up consistency. It lost on reply quality, deliverability, and cost per qualified opportunity once SQL conversion rates hit the picture.

Speaker 3: Mm-hmm.

Derek Simmons: Okay, so get this: the hybrid fix that solved the quality problem created three new fires. Resentful human SDRs letting CRM data rot, a warm domain burned by volume, and no handoff route.

Elena Reyes: Wow. Love when the solution has its own bugs. Classic. And we close out with the decision framework, $200K ARR to $5 million: what conditions have to be true before you build hybrid infrastructure, and the one weekly metric that tells you if the system is quietly degrading.

Derek Simmons: So what does this mean for the person listening right now? Probably a lot. Let's get into it. The cost-per-meeting number looked incredible on paper, and then someone divided by show rate.

Elena Reyes: Oh no, what happened when they divided?

Derek Simmons: The math inverted. Here's the setup.
Founder at 1.4 million ARR, two human SDRs, one AI SDR agent running in parallel for 90 days. Identical ICP lists, identical copy briefs. Clean test.

Elena Reyes: Okay, so actually controlled. That matters.

Derek Simmons: It really does, because the AI looked great right up until Q2. Cost per meeting, way down. Volume, way up. Salesmotion published benchmarks showing AI SDRs deliver outreach at 20 to 60 percent of the cost of a human. The headline is real.

Elena Reyes: Yeah, the vendor deck math always checks out, until it doesn't.

Derek Simmons: Until it doesn't. So here's where it gets good. The show rate on AI-booked meetings was running 40 to 60 percent. Human-booked meetings, 70 to 85 percent.

Elena Reyes: Wait, so you're telling me nearly half the AI meetings were no-shows?

Derek Simmons: At the bottom of that range, yes.

Elena Reyes: Wow.

Derek Simmons: Your AE's calendar fills up, looks totally healthy, and roughly half of the slots are ghosts.

Elena Reyes: That is a silent budget leak, the kind that doesn't show up until someone pulls the actual held-meeting number.

Derek Simmons: Which nobody pulled until end of quarter.

Elena Reyes: Right, right, right. And that's the whole thing. Cost per meeting is a vanity stat if you don't divide by show rate. Great. You're really buying cost per held meeting. Exactly. And once you do that math, the gap between AI and human narrows a lot faster than the pitch deck suggested. So help me stress-test this for a second. Was the show rate problem the AI, or was it the ICP, the copy, the sequence? Because those feel like separable problems. Great question. And honestly, that's what the next 90 days became. But to even get there, first you need to understand where this founder was before the test started. Right. What was the actual state of the business, the ARR, the team? What had already broken?

Derek Simmons: Because 1.4 million ARR with two human SDRs and a pipeline problem is a very specific situation.
And the reason they ran a controlled test instead of just switching everything over, that's worth sitting with.

Elena Reyes: Yeah, why not just flip the switch?

Derek Simmons: That's the question. So here's the actual situation before the test started. At 1.4 million ARR, the founder had two human SDRs. Fully loaded, each rep was running about $130K a year. That's salary, benefits, tools, the whole stack.

Elena Reyes: And what was the meeting-to-opportunity rate? Like, what were those reps actually producing?

Derek Simmons: That's the thing. On paper, solid. About 15 booked meetings per rep per month. But the underlying pipeline felt thin. SQLs weren't closing at the rate the ARR growth required.

Elena Reyes: Okay, but let me stress-test that a little bit. Was the pipeline actually thin, or were they looking at the wrong number? Because 15 meetings sounds fine on the surface.

Derek Simmons: Right. And that's exactly what they couldn't answer cleanly. Nobody had broken it down to cost per qualified opportunity yet. They were watching cost per meeting. Which, as we covered, is where the trouble starts.

Elena Reyes: So the reps were hitting activity targets. The math just hadn't been done on what those meetings were actually worth downstream.

Derek Simmons: Exactly. And then you layer on the tenure problem. According to Bridge Group data, your average SDR stays about 14 to 16 months. Three of those months are ramp. So you're getting roughly a year of peak output, then you're restarting. A year.

Elena Reyes: You pay to recruit, pay to ramp for a quarter, get twelve months of real production, then do it all over again.

Derek Simmons: I've seen this movie before, and it's brutal on pipeline predictability. Every time someone walks, you're three or four months from getting back to steady state.

Elena Reyes: Yeah. So what had they already tried before reaching for AI?
Because I doubt the first instinct was "let's automate this."

Derek Simmons: Oh no, they'd tried the standard playbook: third-party lead lists, a fractional SDR, even ran one rep on a pure cold-call motion for a quarter. Nothing got the meeting-to-SQL rate they needed.

Elena Reyes: And the fractional SDR, specifically, what happened there?

Derek Simmons: Three months, eight meetings booked, two showed. That's your whole problem in miniature.

Elena Reyes: Oh, man, two showed. So by the time they're looking at AI SDRs, they're not doing it because a vendor had a good deck. They're doing it because the human motion keeps resetting, and they can't afford another ramp cycle eating into runway. Which is why the controlled test matters so much. They didn't just flip the switch.

Derek Simmons: No!

Elena Reyes: And this is where it gets good. Same lists, same ICP, same sequences handed to both the AI and the two humans.

Derek Simmons: 90 days, clean split. The whole point was to make sure the data would actually mean something at the end.

Elena Reyes: That's the setup that makes this worth talking about, because without the controlled structure, you're just getting a vendor case study.

Derek Simmons: Exactly. And 90 days later, they had a spreadsheet. That's where we're going next. The actual numbers, side by side: what the AI booked, what the humans booked, and what happened when someone finally checked who actually showed up. So the scoreboard is open. Ninety days, identical lists. Walk me through the raw numbers first.

Elena Reyes: Okay, so the AI booked more meetings, full stop. Volume was up roughly three to four times what the human reps produced over the same window. And the follow-up cadence? The AI hit every single touchpoint. Humans skipped follow-ups constantly.

Derek Simmons: Shocker.

Elena Reyes: Right? But here's where it gets interesting.
Auto Interview AI's benchmarks have human SDRs responding to inbound leads in 42 to 47 hours on average. The AI: sub-60 seconds, every single time.

Derek Simmons: The founder tracked that explicitly?

Elena Reyes: Explicitly. And on inbound, that speed advantage translated directly to booked meetings. No debate on that one.

Derek Simmons: Okay, so volume up, speed up. Now give me the number that actually matters.

Elena Reyes: Cost per qualified opportunity. That's the verdict number, not cost per meeting. And when you divide all the way down to SQL, the gap narrows dramatically.

Derek Simmons: Because the show rate hurt them earlier, but the SQL conversion rate hurt them again at the next stage.

Elena Reyes: Exactly. According to data from Apollo, AI-booked meetings convert to qualified opportunities at fifteen to twenty percent. Human-booked meetings run around twenty-five percent. So you book more, fewer show, and fewer of those convert. That math compounds fast.

Derek Simmons: That's the volume-up, quality-down pattern showing up at every funnel stage. But let me ask the uncomfortable one: what did the AI actually say when a prospect pushed back on pricing?

Elena Reyes: Ooh, that's where it got ugly. The auto-reply handling was... not good.

Derek Simmons: Define not good.

Elena Reyes: The AI responded to a pricing objection with a generic feature dump. No context, no acknowledgement of the objection, just a wall of capability bullets. The prospect replied back asking who they were actually talking to.

Derek Simmons: And that's not an edge case. According to Autopsy AI's research, AI handles the top 10 common objections acceptably, but humans handle the other 90% that need real thinking. Pricing questions live in that 90%.

Elena Reyes: And the deliverability problem compounded everything. AI emails got spam-flagged at 8% versus 3% for human-written outreach on a 100,000-email analysis.

Speaker 3: Wow.

Elena Reyes: That's nearly three times the spam rate on the same domain.

Derek Simmons: So the reply gap was narrowing, which is the story the vendors lead with, but the domain was quietly burning underneath it.

Elena Reyes: That's the part that decided the verdict. Not cost per meeting, not even show rate. It was the cost per qualified opportunity once you factored in conversion and domain damage.

Derek Simmons: Bridge Group's data puts hybrid pods at 54% lower cost per qualified opportunity versus human-only, but pure AI, worse.

Elena Reyes: Worse. The founder's own spreadsheet matched that directionally. Pure AI motion failed at quality at every downstream stage.

Derek Simmons: And that is exactly the question the next 90 days had to answer: if the pure AI motion broke at quality, what's the specific fix? Because they didn't shut it down.

Elena Reyes: No, they didn't, and the hybrid structure they built is the piece worth unpacking.

Derek Simmons: So the scoreboard said hybrid, but building the hybrid, that's where it got messy.

Elena Reyes: Fast. Because the first problem wasn't the AI, it was the humans.

Derek Simmons: Of course it was.

Elena Reyes: The two SDRs on staff saw the AI tool as a threat, and when people feel threatened, they're not exactly rushing to clean up the CRM data that feeds the system they resent.

Derek Simmons: This is the part that never shows up in vendor case studies. Dirty CRM means the AI targets worse over time. Apollo actually flags this explicitly: duplicate records, missing job titles, stale company data, all degrade AI output quality. The garbage-in problem compounds at volume. And volume was exactly what made it worse. The AI was running cold outbound at scale on their main sending domain before anyone had a real warm-up protocol. DevCommX has written about this. Sender reputation, once damaged, can take months to recover.

Elena Reyes: You don't notice it happening until open rates are already on the floor.

Derek Simmons: So the CRM's dirty, the domain's getting cooked, and the humans are quietly routing around the system. Three fires at once.

Elena Reyes: Classic scaling debt: pay now or pay later. And they paid later.

Derek Simmons: Okay, so walk me through the actual fix, because I don't want vague process talk here. What specifically changed? The rule they wrote was simple enough to put on a sticky note.

Elena Reyes: AI owns touches one through five and handles any inbound response within 60 seconds. The moment a reply comes in with a question mark or the word "pricing" or anything that signals real intent, it routes to a human within 30 minutes.

Derek Simmons: Hold on, thirty minutes, not sixty seconds?

Elena Reyes: For the handoff, yeah. The AI fires an instant acknowledgement, buys the human time, but the human has to be in the thread within thirty minutes with actual context, not a template.

Derek Simmons: Okay, and did that move the needle on show rates?

Elena Reyes: That's fair to push on. The show rate improvement on hybrid-booked meetings came after about six weeks of running that rule consistently. They got back to the mid-seventies range, which is closer to what the human-only meetings had been doing.

Derek Simmons: So that one handoff rule basically closed the show rate gap, the thing that was bleeding them out at the start of the episode.

Speaker 4: Yes.

Elena Reyes: Most of it. The other piece was restructuring the pod: one human SDR per roughly two to three AI sequences running simultaneously. The human's job title basically became reply triage and relationship escalation. No cold list work at all.

Derek Simmons: What did that cost versus the old model?

Elena Reyes: One human SDR running oversight on AI sequences costs somewhere in the $60,000 to $80,000 fully loaded range.

Derek Simmons: Compare that to the $260,000 they were burning on two fully independent human SDRs. The AI stack itself, a mid-range platform, was running around $900 to $1,500 a month on top.

Elena Reyes: So, real talk. The math works if the handoff rule holds. Break the rule, the show rates slip, and the whole cost argument falls apart.

Derek Simmons: Exactly. The SLA isn't a nice-to-have, it's the load-bearing wall.

Elena Reyes: The system is only as good as the worst handoff week. Which sets up the real question: what does a founder do if they're watching this from $500,000 ARR, not $1.4M? Is this even worth building yet? So, what does all this mean for the person sitting at $500K ARR right now, genuinely wondering if they should build this whole hybrid pod?

Derek Simmons: Honestly, the answer is probably not yet. And here's why: at $500K ARR, you likely haven't locked in your ICP tightly enough to clone the playbook. The AI clones what works. If you're still figuring out what works, you're automating noise.

Elena Reyes: That's the thing nobody says in the vendor demo. They show you volume, they don't ask whether your messaging is actually proven.

Derek Simmons: Okay, but let me stress-test that assumption with some specifics. The ACV question is real. The AI-SDR playbook piece puts it pretty bluntly: $1 to $20,000 ACV is the sweet spot. AI qualifies and books, humans close. Above $20,000, you want a human doing most of the work, with AI on first touch and research only.

Elena Reyes: And the math backs that up: the productgrowth.blog piece also flags that if revenue per meeting from your AI pipeline is below 50% of your human pipeline, you need to fix segmentation before you add any volume. Which is exactly the trap our founder walked into in segment three: high volume, low SQL conversion. Those two numbers compounded badly. Turns out multiplying bad math by 10 just gives you more.
More bad math, faster, at scale. So the decision framework I'd hand someone right now? Three conditions have to be true before you spend a dollar on AI SDR infrastructure: (one) your ACV is under twenty thousand; (two) your ICP is specific enough that you can write five distinct campaigns for distinct segments, not one campaign blasted at everyone; (three) you have someone who can actually manage the system.

Derek Simmons: Minimum ten hours a week.

Elena Reyes: That last one is the silent killer. The productgrowth.blog playbook cites SaaStr spending 15 to 20 hours weekly just managing their AI SDR deployment, and performance dipped when that person got busy with other work.

Derek Simmons: Wow. Right. This isn't a set-it-and-forget-it channel.

Elena Reyes: Real talk for a second. If you're below $500K ARR and you're thinking about spinning up an AI SDR tool, the better move is probably a strong human SDR with AI augmentation on the research and sequencing side. You get the signal without burning your domain. And Landbase had data on this. Sales tech companies running AI-augmented human reps are hitting pipeline velocity gains without the deliverability risk of a full autonomous outbound stack. One number to watch on your dashboard weekly, whatever stage you're at: revenue per held meeting, not meetings booked. If that number's dropping while volume is climbing, the system is quietly degrading.

Derek Simmons: That's the one metric that doesn't lie. Cost per booked meeting is a vanity stat without it.

Elena Reyes: I've seen this movie before: the dashboard looks great, pipeline feels thin, and nobody divided by show rate until end of quarter.

Derek Simmons: Which is exactly how this whole episode started. Full circle.

Elena Reyes: And that's your Autopsy right there. That's a wrap on this one.
And honestly, Elena, if there's one thing from today that I keep coming back to, it's the calendar that looks full but half the slots are ghosted.

Derek Simmons: Mm-hmm.

Speaker 3: Right. Cost per meeting as a vanity stat. The moment you divide by show rate, the whole math flips.

Elena Reyes: That reframe alone is worth a listen: cost per held meeting. Write it on your whiteboard. And the fix wasn't "ditch the AI"; it was a 30-minute SLA and a clean handoff rule. Boring answer, but it works.

Speaker 3: Boring answers usually do.

Elena Reyes: If this one saved you from a bad bet, do us a favor: share it with one founder who needs it.

Derek Simmons: Subscribe on YouTube or wherever you're listening, and drop a review. That's what keeps real operators talking to us.

Elena Reyes: We'll see you next time on ARR Autopsy.

Speaker 3: Take care, everyone.
