What is the visibility gap in enterprise AI agent security?

The CSA 'Autonomous but Not Controlled' survey found that 68% of organizations claim high visibility into their AI agents, yet 82% discovered unknown agents in their environments in the past year. Every organization that experienced an AI agent incident reported real business impact, including data exposure, operational disruption, and financial losses.

How are shadow AI agents different from shadow SaaS?

Shadow agents are structurally more dangerous than shadow SaaS because they connect over OAuth under the employee's own identity, so their API calls look completely legitimate. They operate autonomously with persistent credential access to CRM, email, and cloud infrastructure, and non-human identities now outnumber human users by more than 100-to-1 in some enterprise environments.

What was the n8n supply chain attack and how did it work?

The n8n attack, tracked as CVE-2026-21858 and GHSA-77g5-qpc3-x24r, involved malicious npm packages disguised as community nodes, specifically a fake Google Ads integration, that silently exfiltrated OAuth tokens. Community nodes run at n8n's own privilege level with full file system, network, and credential access, making the attack especially severe.
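
Because a community node inherits everything n8n can reach, one practical follow-up is to audit which community packages an instance has installed and compare them against a reviewed list. The sketch below is a minimal illustration, not n8n tooling. It assumes community nodes follow the `n8n-nodes-` package naming convention (optionally under an npm scope) and that you can read the `package.json` of the directory where they are installed. The function names and the allowlist are hypothetical.

```python
import json

# Assumption: community node packages are named "n8n-nodes-<name>"
# or "@<scope>/n8n-nodes-<name>".
COMMUNITY_PREFIX = "n8n-nodes-"


def is_community_node(package_name: str) -> bool:
    """Return True if the npm package name looks like an n8n node package."""
    # Strip an npm scope such as "@acme/" before checking the prefix.
    if package_name.startswith("@") and "/" in package_name:
        package_name = package_name.split("/", 1)[1]
    return package_name.startswith(COMMUNITY_PREFIX)


def unreviewed_community_nodes(package_json_text: str, allowlist: set[str]) -> list[str]:
    """List installed node packages that nobody has reviewed yet.

    package_json_text: contents of the package.json that records installed nodes.
    allowlist: package names your team has already vetted (hypothetical input).
    """
    dependencies = json.loads(package_json_text).get("dependencies", {})
    return sorted(
        name
        for name in dependencies
        if is_community_node(name) and name not in allowlist
    )
```

Note that the official `n8n-nodes-base` package matches the same prefix, so it belongs in the allowlist. Anything the function returns is a candidate for manual review, not proof of compromise.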

What does Microsoft Agent 365 actually cover and where does it fall short?

Microsoft Agent 365, which went GA on May 1 at $15 per user per month, covers managed Windows endpoints inside the Microsoft identity perimeter and is built on Defender, Intune, Entra, and Purview. Its core gap is personal account access: LayerX research shows 71.6% of GenAI tool access flows through non-corporate accounts, which are completely invisible to Agent 365.

What is the zombie agent problem in AI governance?

Only 21% of organizations have a formal agent decommissioning process. When an agent is deleted from a platform, its OAuth tokens and API keys often remain live, creating zombie agents that retain valid access credentials long after they are supposed to be gone. This retirement debt is one of the most underreported risks in the CSA report.
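
The cross-referencing this implies can be sketched in a few lines. The example below assumes you have already exported three things from your own systems: a list of credential grants, the set of currently active users, and the set of agents that still exist on the platform. The record fields (`grant_id`, `agent_id`, `granted_by`, `revoked`) and the function name are hypothetical and do not correspond to any specific identity provider's schema.

```python
def find_zombie_grants(grants, active_users, live_agents):
    """Return grants that are still valid but have lost their owner or agent.

    grants: iterable of dicts with hypothetical keys
            'grant_id', 'agent_id', 'granted_by', 'revoked'.
    active_users: set of user identifiers who are still employed and active.
    live_agents: set of agent identifiers that still exist on the platform.
    """
    zombies = []
    for grant in grants:
        # A revoked grant can no longer be used, so it is not a zombie.
        if grant.get("revoked"):
            continue
        reasons = []
        if grant["granted_by"] not in active_users:
            reasons.append("owner inactive")
        if grant["agent_id"] not in live_agents:
            reasons.append("agent deleted")
        if reasons:
            zombies.append((grant["grant_id"], reasons))
    return zombies
```

Each returned tuple names a grant and why it looks orphaned, which gives whoever owns the automation environment a concrete revocation list rather than a general warning.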

What are the concrete detection steps recommended for finding unauthorized AI agents?

Pull the Entra ID app consent report filtered for high-privilege OAuth scopes created in the last 90 days, and hunt DNS and proxy logs for outbound LLM API traffic originating from unapproved automation hosts. The LiteLLM supply chain compromise affecting versions 1.82.7 and 1.82.8 confirms this attack surface is actively exploited.
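
Both checks can be prototyped offline against exported data before wiring them into a SIEM. The sketch below is a minimal illustration under stated assumptions: the grant records and DNS events are plain dictionaries whose field names (`scope`, `granted_at`, `source_host`, `query`) are hypothetical, not any vendor's export format, and how you obtain a grant's creation time depends on your identity provider. The scope names and LLM hostnames are the ones mentioned in the episode. CRM-level scopes vary by vendor and would need to be added to the set.

```python
from datetime import datetime, timedelta, timezone

# Scopes named in the episode; extend with your CRM's scopes as needed.
HIGH_PRIVILEGE_SCOPES = {"Mail.ReadWrite", "Files.ReadWrite"}

# LLM API hostnames named in the episode.
LLM_API_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}


def risky_recent_grants(grants, now, max_age_days=90):
    """Query one: recent OAuth grants that carry a high-privilege scope.

    grants: iterable of dicts with a space-separated 'scope' string and a
            timezone-aware 'granted_at' datetime (hypothetical export shape).
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        grant
        for grant in grants
        if set(grant["scope"].split()) & HIGH_PRIVILEGE_SCOPES
        and grant["granted_at"] >= cutoff
    ]


def unapproved_llm_callers(dns_events, approved_hosts):
    """Query two: hosts resolving LLM API names without an approved workflow.

    dns_events: iterable of dicts with 'source_host' and 'query' keys
                (hypothetical log shape).
    approved_hosts: set of hosts that are allowed to call LLM APIs.
    """
    callers = set()
    for event in dns_events:
        # Normalize case and the trailing dot of fully qualified names.
        queried = event["query"].rstrip(".").lower()
        if queried in LLM_API_HOSTS and event["source_host"] not in approved_hosts:
            callers.add(event["source_host"])
    return sorted(callers)
```

Passing `datetime.now(timezone.utc)` as `now` gives the rolling 90-day window. Taking `now` as a parameter keeps the function easy to test against a fixed date.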

Shadow Agents at Scale — Autonomous Autopsy

Derek: Sixty-eight percent of organizations say they have strong visibility into their AI agents. Eighty-two percent found unknown agents running in their own infrastructure. Those two numbers are from the same survey. Max: Same respondents. Derek: Same respondents. The Cloud Security Alliance dropped that in their Autonomous but Not Controlled report. And Max, the gap between what security teams believe and what's actually running is exactly where this episode lives. Max: And it gets real fast. Derek: Sixty-five percent of those orgs had an actual incident in the past year. Data exposure, operational disruption, financial losses. Zero respondents said zero business impact. Zero. Not one. So, today on Autonomous Autopsy, we're doing a full dissection. I'm walking the technical kill chain on the n8n supply chain attack, CVE-2026-21858: malicious npm packages disguised as community nodes, quietly draining OAuth tokens out of no-code automation stacks. And I'm pressure testing the defensive side. Microsoft Agent 365 just went generally available on May 1st. $15 a user. We'll get into what it actually covers versus what's still a gap. And spoiler, there's a structural gap no current tool handles. We're also hitting the governance debt problem. Only 21% of organizations have a formal process to decommission agents. The rest? Zombie agents, live OAuth tokens, keys that never got rotated. Retirement debt. That's the phrase, and it's accurate. And we close with two concrete detection queries you can run Monday morning. Not "review your security posture." Actual queries. Okay, let's get into it. First up, the numbers that broke this thing open. Sixty-eight percent of enterprise security teams say they have high visibility into the AI agents running on their networks. That's the CSA survey number. Sixty-eight percent confidence. Sounds reasonable.
And eighty-two percent of those organizations discovered AI agents they didn't know existed in the past year. Max: Wait, eighty-two percent? Derek: Yeah. And forty-one percent found unknown agents multiple times. Multiple times. This is from CSA's Autonomous but Not Controlled report. Four hundred eighteen IT and security professionals, January 2026. So two thirds of the room is raising their hand saying we've got great visibility, and then four fifths of the room found surprise agents anyway. The confidence is doing a lot of heavy lifting there, truly carrying the team. Here's the thing though, where are these agents showing up? Not some obscure corner of the stack. Internal automation and scripting environments: fifty-one percent of orgs. LLM platforms with custom plug-ins and assistants: forty-seven percent. These are the exact environments where sanctioned agent adoption is also accelerating right now. So you can't just quarantine the problem. The same tooling that's producing your legitimate agents is producing the shadow ones. Max: Right. Derek: That's exactly it. And Max, the CSA data on actual incidents is where this stops being an abstract governance complaint. Okay, hit me. Sixty-five percent of these organizations had an AI agent security incident in the past 12 months. Of those, 61% reported data exposure, 43% operational disruption, 35% financial losses, and the CSA blog noted every single respondent who had an incident reported real business impact. No impact: zero. Reported nothing: zero. Not one company said, eh, minor inconvenience. Not one. And this is happening while security leaders are sitting there genuinely believing they have the picture. That's the part that keeps me up. The visibility gap is already producing real damage. So the natural question is, if these agents are slipping past everything we have, what exactly are they?
What makes a shadow AI agent fundamentally different from the shadow SaaS problem security teams have been managing for the last decade? So here's what separates 2026 from the chatbot era. Structurally, these are different threat classes. Max: Right, and that's the reframe we need. Shadow SaaS was a user copying data into a browser tab. Shadow agents are OAuth-connected autonomous systems carrying persistent credentials to your CRM, your email, SharePoint, cloud infrastructure, running on the employee's own identity. Derek: On their own identity. So the API calls look... Max: Look completely legitimate. Standard SaaS monitoring sees nothing wrong. Nothing, because the agent authenticated correctly. It was authorized. The problem is no human is actually behind the wheel. And here's the scale problem, Derek. According to a CSA research note from April, a 2026 survey of 235 large enterprise CISOs found that 92% lack full visibility into their AI agent identities. Ninety-two. And wait for it, ninety-five percent say they doubt they could detect or contain a compromised agent. That's not a vibe. Those are named CISOs in a named survey. So we're basically saying one in five security leaders thinks they could actually handle this? Derek: One in five, yeah. Max: Okay, and there's a number that lands even harder for me. Omada Identity's State of Identity Governance 2026 discussion cites non-human identities (service accounts, automation bots, agents) outnumbering human users by more than a hundred to one in some enterprise environments. Derek: A hundred to one. Max: Your perimeter used to be human logins. That's not the perimeter anymore; the non-human identity surface is where defenders are actually flying blind. Derek: A separate CSA survey, the one commissioned with Aembit (two hundred and twenty-eight respondents), found that sixty-eight percent of organizations can't clearly distinguish between human and AI agent activity.
Derek: Seventy-four percent say agents routinely get more access than they need, and thirty-one percent let agents operate directly under human user identities, so the agent is you, legally and technically, from the system's perspective. Which is why the CIO piece framed it perfectly: the real risk has shifted from what users tell an AI to what autonomous agents are permitted to do. The attack surface moved, and most security teams are still guarding the old one. So the next question is mechanical. Max: How do these agents actually acquire credentials in the first place? Because there's a supply chain attack that answers that question in painful detail. Yeah, and it involves a workflow automation tool you've definitely got running somewhere in your stack. Derek: So here's where the attack surface gets concrete. The n8n kill chain from January is the clearest example of how these things actually get exploited. Walk me through it. Right, so n8n is no-code workflow automation. Attackers uploaded a malicious npm package to the community node registry disguised as a Google Ads integration. According to Endor Labs, the package collected OAuth credentials through a form that looked completely legitimate, then silently exfiltrated them during workflow execution. And the hook was just, it looked real. That's the whole play. The package is tracked under GHSA-77g5-qpc3-x24r. Endor Labs called it a new escalation in supply chain threats. That framing is accurate. Over 100,000 n8n servers were running versions vulnerable to CVE-2026-21858 at the time. 100,000 servers. Yeah, and here's the structural problem Endor Labs flagged: community nodes in n8n run with the same privilege level as n8n itself. Full file system access, outbound network requests, decrypted API keys and OAuth tokens at runtime.
No sandboxing, no audit. So you install what you think is a Google Ads connector and you've handed the attacker the keys to everything that n8n can reach. Everything. CRM tokens, cloud credentials, whatever that n8n instance was connected to. Okay, now flip that to Zapier. Different structural problem entirely. Yeah, Zapier's the one that keeps me up at night from a governance angle. Because the attack surface isn't a malicious npm package, it's just the product working as intended. According to upstanding hackers' research on shadow agents in Slack and Teams, an employee can connect a personal Zapier account to the corporate Slack workspace without touching any managed infrastructure whatsoever. No endpoint control, no DLP, nothing. A Zap wired to the finance channel on Monday can be summarizing confidential messages and forwarding them to a personal Gmail by Friday. Your network logs show nothing because the traffic never hits anything you own. It's just a cloud service talking to another cloud service. Exactly, and that's the gap Microsoft saw. VentureBeat reported Agent 365 going GA specifically to address this class of threat, starting with OpenClaw detection on managed Windows devices. Which is a reasonable first step, but that Zapier personal account vector? Agent 365 can't see it either. We'll get to what that product actually covers and where it stops. Right after this. So Microsoft finally shipped something. Agent 365 went GA on May 1st, $15 per user per month, or bundled into the new M365 E7 Frontier Suite at $99. According to VentureBeat, it positions itself as a unified control plane built on Defender, Intune, Entra, and Purview. Which tells you something. When the biggest enterprise software company ships a product specifically for this problem, that's not a roadmap slide anymore. Right, and the shadow AI page in the M365 admin center is where it starts, with OpenClaw detection on managed Windows endpoints via Defender and Intune.
It finds local agents running on devices you already manage. Okay, so walk me through what that actually covers, because managed Windows devices is doing a lot of work in that sentence. Yeah, it is. The honest version is, if it's on a managed Windows endpoint and you've got Intune deployed, you've got visibility. Everything else, thinner. BYOD, non-Windows, personal machines, gone. And there's more. According to Microsoft's security blog, context mapping and runtime blocking don't even land until June in public preview. So the MCP-to-identity graph, where Defender maps agent relationships to devices, configured MCP servers, associated identities, and the cloud resources those identities can reach, that's the genuinely interesting piece, but it's not shipping today. Okay, that MCP graph is worth calling out, because with an agent wired to an MCP server that has access to, say, Exchange or SharePoint, you can actually see the blast radius before something goes wrong. That's useful. Agreed, when it ships. But here's where the product hits a structural wall, Derek. LayerX put out research showing 71.6% of GenAI tool access in enterprise environments flows through non-corporate accounts. So nearly three quarters of the traffic Agent 365 is supposed to govern is invisible to it entirely. Personal Gmail logging into Claude, personal account on ChatGPT, you can be fully Intune-managed, all policies in place, and that session is completely ungoverned. Entra can't see it. Purview can't touch it. Correct, and AWS Bedrock and Google Cloud registry sync, that's in public preview too, so for organizations not deep in the Microsoft stack, enforcement depth is shallow at best. So to be clear, what Agent 365 actually covers today is sanctioned agents on managed Windows endpoints inside the Microsoft identity boundary, and the rest of the world is still on its own. That's the accurate summary. Look, the June context mapping is worth watching.
The MCP-to-identity blast radius mapping could genuinely close some gaps for Microsoft-shop defenders. But the life cycle gap is a whole other problem. You can detect an agent on day one. What about the one running on credentials from an employee who left the company six months ago? Yeah, that's where it gets uncomfortable. And that's exactly what the data is showing. So here's the number that nobody in the CSA report is talking about loudly enough: only twenty-one percent of organizations have a formal agent decommissioning process. Twenty-one percent! So seventy-nine percent of companies are just accumulating agents with zero structured way to retire them. The CSA Autonomous but Not Controlled survey called it retirement debt. Agents linger past their intended use, holding on to permissions and credentials long after the person who built them has moved teams or left the company entirely. And here's the mechanism that makes it actually dangerous: deleting an agent from the platform is not the same as revoking its credentials. Right. OAuth grants, API keys, those stay active. The agent's gone from the UI, but the token still works. You've got a ghost with a valid badge. Derek: A ghost with a valid badge that no one is looking for, because your governance framework was built for static software assets. Non-human identities don't have off-boarding checklists. Exactly, and most orgs don't have the tooling to even surface those orphaned grants. You'd have to manually pull the Entra ID connected apps report and start cross-referencing against departed employees. Nobody's doing that weekly. Nobody's doing it quarterly. Fair. Switching gears to the standards side, because this is where it gets genuinely interesting. The CSA's CSAI Foundation got authorized as a CVE numbering authority through MITRE, announced at the April 29th Agentic AI Security Summit.
That's a real signal. So they can now formally assign CVE IDs for agentic AI vulnerabilities? That's the direction. According to the CSA's own announcement, the scope starts with vulnerabilities in their software tools, but the stated goal is agentic-specific vulnerability coordination, and the STAR for AI Catastrophic Risk Annex rolls out June 2026, aligned with NIST AI RMF and ISO/IEC 42001. Max: Hmm. Derek: How long does phase one actually run? June through September 2026. Phase one translates catastrophic risk scenarios into auditable control language. Those auditable controls don't exist as published standards yet, so right now there's no certified audit path for agentic AI catastrophic risk. You're flying without instruments. Correct, and those zombie agents with live credentials are accumulating the entire time the standards get written, which is exactly why the next step isn't waiting for a framework. It starts with a credential inventory, and we'll get into exactly how to run one. So, the hunt starts Monday morning. First thing, pull your Entra ID app consent report, filter for OAuth grants created in the last 90 days carrying Mail.ReadWrite, Files.ReadWrite, or CRM-level scopes. And you're specifically looking at grants tied to service accounts or created by users who are no longer active. Those are your first hits. Right, and don't stop at Entra. If you're running Google Workspace or Okta alongside it, run the same query there. Shadow agents don't care which identity provider you're on. That list you generate? That's your shadow agent starting inventory. Not a project, not a roadmap item. A list you can hand to whoever owns your automation environments before lunch. Before lunch. I like the ambition. Okay, second query. Second one is the DNS and proxy logs. You're hunting outbound connections to known LLM API endpoints.
Think api.openai.com, api.anthropic.com, generativelanguage.googleapis.com. And here's the signal you care about: traffic originating from a CI/CD runner, a build server, or an automation host that has no approved agent workflow attached to it. Exactly. That's not a dev testing something. That's a rogue process calling home to an LLM. And after the LiteLLM supply chain compromise back in March? March, versions 1.82.7 and 1.82.8 on PyPI, a credential harvester hitting over fifty secret categories. You know this attack surface is real. A popular Python package, downloaded 3.4 million times a day, compromised to steal cloud credentials and LLM API keys simultaneously. One package, all keys. Derek: That's the concentration of risk problem in one sentence. So yeah, DNS logs aren't optional anymore. To tie it together, the CSA data showed sixty-five percent of orgs had an actual AI security incident last year. The governance tools are catching up, but they're not there yet. You cannot wait. So here's your single action item coming out of this episode. Pull the Entra ID app consent report this week. Filter OAuth grants from the last ninety days with Mail.ReadWrite, Files.ReadWrite, or CRM-level scopes. Send that list to the person who owns your automation environments. That's query one. It takes maybe twenty-one minutes, and it will almost certainly surface something nobody knew was there, which is kind of the whole story we've been telling today, isn't it? Yeah, 68% visibility confidence, 82% already found unknown agents. Go find yours. Okay, so if there's one thing that stuck with me from today, it's that 95% number. Nearly every CISO in that survey saying they couldn't contain a compromised agent. Derek: That's not a confidence issue. That's a structural one. And the thing Max laid out about OAuth? The agent authenticated correctly, it was authorized, and still nobody was actually driving.
That framing really lands when you think about what Monday morning looks like. Right. So pull that Entra ID app consent report, filter for high-privilege scopes from the last ninety days, start there. Derek: Don't wait for a policy to catch up to you. That's the move. Derek, this one was good. It really was. All right. New episodes drop every Tuesday. Subscribe wherever you're listening, and if this saved you from a bad deployment, drop us a review. We read them. We genuinely do. Thanks for being here, everyone. Stay sharp out there.
