What happens when an analyst decides to stress-test Maze with real cloud vulnerabilities, zero instructions, and the camera running? James Berthoty fired up his own AWS environment, plugged in Maze, and let our AI agents go to work. The result? A raw, unfiltered look at how Maze investigates what's actually exploitable and explains exactly why you can ignore what isn't.
This video is an independent analyst's take on Maze from James Berthoty of Latio. We do partner with Latio for analyst feedback, but they were not paid for this video and received no guidance on what to investigate. All examples come from his own AWS environment and are shared with his permission.
We can tell you all day that Maze changes how security teams (and dev teams) handle vulnerabilities. But there's something more convincing than our word: watching someone test it themselves, in their own environment, without a script.
James Berthoty, a former cloud security engineer who now runs a leading analyst firm, Latio, recently recorded an unscripted demo of Maze. We didn't tell him what to click. We didn't coach him on which vulnerabilities to investigate. He used his own AWS environment, picked CVEs at random, and walked through what he found.
"I'm Generally Disillusioned with Vulnerability Management"
Security teams are burned out on chasing CVEs that never seem to matter. The dashboards keep screaming. The pressure keeps rising. And the real risks still slip through.
James feels the same way: “If you’ve looked at any of my content over the last couple of years, it’s clear I’m generally disillusioned with vulnerability management.”
The problem is that security teams spend enormous amounts of time hunting for CVEs across their cloud environments, with little ROI from actual patching. The result is a cycle of panic. A CISO sees a report listing 100 critical vulnerabilities. Everyone scrambles. Developers get pressured. And those dynamics create friction between security and engineering that makes everything harder.
Reachability has emerged as a way to cut through the noise, but it's complicated. Every CVE is unique, and figuring out whether something is a true positive or a false positive means pulling together many different factors, often technical knowledge that lives only in a developer's head or that takes deep investigation to surface. That complexity is exactly why James sees AI as potentially useful here, though he's skeptical of how most tools use it: "In many tools it's nothing more than a chatbot, and it's honestly not moving the needle much at all."
Traditional Vulnerability Workflows Are Painful
To set the stage, James walks through AWS Inspector in his test environment. [1:21] He's running a small personal setup: three EC2 instances, a small Kubernetes cluster with four pods. And even in this small environment, he's still looking at thousands of vulnerabilities.
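To make that scale concrete, the raw numbers behind that kind of report are easy to pull yourself. Here is a minimal sketch that tallies AWS Inspector v2 findings by severity using boto3; it's illustrative only, and not how Maze or James generated the report:

```python
import boto3
from collections import Counter

# Minimal sketch: tally AWS Inspector v2 findings by severity, i.e. the
# raw numbers behind a "hundred criticals" report. Assumes AWS
# credentials and a region are already configured in the environment.
inspector = boto3.client("inspector2")
paginator = inspector.get_paginator("list_findings")

counts = Counter(
    finding["severity"]
    for page in paginator.paginate()
    for finding in page["findings"]
)

for severity, total in counts.most_common():
    print(f"{severity:10} {total}")
```

Even a three-instance, four-pod environment like this one produces a severity histogram long enough to trigger the panic cycle described above.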
As he puts it: "My CISO's gonna see a report with a hundred critical vulnerabilities on it and go, oh God, we're gonna get attacked. What are we gonna do? Why are we so far behind? Everyone is scrambling and panicking all the time."
He pulls up a critical example: improper access control in an Intel Ethernet controller RDMA driver that could allow privilege escalation via network access. The vulnerability appears in a pod, which immediately raises questions. How does that pod interact with the networking driver? Could someone even exploit it that way? What's the real likelihood?
These aren't simple questions to answer. And this is just the first critical on his list.
How Maze Investigates a Potential False Positive
Having seen how painful this triage is inside the scanner itself, James pivots to Maze and searches for the same CVE (CVE-2023-25775). [3:34] The platform has already determined it's not exploitable, but what stands out is the reasoning.
Maze's investigation found that the target system runs on an AWS instance type using the AWS Elastic Network Adapter, which doesn't support RDMA capabilities. That's a fundamental architectural limitation. The hardware required for this exploit simply isn't present.
This kind of explanation matters for compliance. As James explains: "The challenge with a lot of reachability tools is, let's say that even there's one that figured out this is not exploitable. I've been through different FedRAMP environments, and you have to give an explanation to the auditor as to why this is then a false positive."
Maze provides that audit trail: the commands the AI agents ran, where they ran them, and the logic connecting the findings to the conclusion. That's what auditors need to feel comfortable marking something off their list.
How Maze Identifies True Positives and Accurately Adjusts Severity
Filtering out noise is only half the job. The other half is finding the fire when there’s so much smoke.
James picks a random vulnerability that Maze escalated (CVE-2019-1010266): a denial-of-service issue with a medium base CVSS score that Maze adjusted to high. [5:36] He happens to know this one is real because it lives in his intentionally insecure test application, which uses the Lodash library. James shared that he actually uses this exact vulnerability to demo reachability detection in his own work, and the AI agents picked up exactly what an expert security engineer would look for.
Maze flagged it as high priority because the asset is internet-exposed, a public exploit is available, and the exploit executes with a single request. The analysis confirmed the application is listening, running Node.js, and has the vulnerable version installed and in use.
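Each of those facts is independently verifiable. As one example, confirming "vulnerable version installed" comes down to reading the dependency's manifest on the host. A minimal sketch follows; the application path is hypothetical, and 4.17.11 is assumed as the fixed release per the public advisory:

```python
import json
from pathlib import Path

APP_DIR = Path("/srv/app")  # hypothetical application root

def installed_lodash_version(app_dir: Path) -> str | None:
    # npm writes the resolved version into the installed package's manifest.
    pkg = app_dir / "node_modules" / "lodash" / "package.json"
    if not pkg.exists():
        return None
    return json.loads(pkg.read_text())["version"]

def is_vulnerable(version: str, fixed: str = "4.17.11") -> bool:
    # Naive semver compare; fine for lodash's plain x.y.z version strings.
    def as_tuple(v: str) -> tuple[int, ...]:
        return tuple(int(part) for part in v.split("."))
    return as_tuple(version) < as_tuple(fixed)

version = installed_lodash_version(APP_DIR)
if version:
    print(version, "vulnerable" if is_vulnerable(version) else "patched")
```

"Installed" is only one of the three conditions; the agents also confirmed the application is listening and that the vulnerable code path is actually in use, which is what separates a reachable finding from a library sitting idle on disk.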
The fact that he picked this at random is the point. "This was not a planned demo," he notes. "I just picked the first one that was reprioritized up to a high. And it's a true positive."
Beyond the detection, there's practical value for working with engineering. Instead of just telling developers "the tool said it's high, figure it out," you can walk through the steps to reproduce it, explain why it's vulnerable, and discuss compensating controls like restricting internet exposure or adding WAF rules.
The future of AI isn't just another wrapper or chatbot. It's AI embedded into how applications actually work. And James says it best after taking our AI agents for a test drive: "I'm so impressed with how AI can handle how contextual every individual vulnerability is, as opposed to so many other tools that have to rely on general data about the environment."
The Proof is in the AI Agents
Throughout the demo, James keeps coming back to the same theme: defensibility. When you're in a regulated environment or even just working against strict SLAs, you can't trust a severity score alone. You need to be able to show why you made the decisions you made.
This is especially true when you're pushing back. If you want to deprioritize a critical or argue that something doesn't need to be patched within the standard SLA window, you need evidence. Not a gut feeling, not a vendor's word, but a clear record of what was analyzed and what was found.
As James puts it: "I required audit trails for all of these vulnerability changes that were happening and to actually get a severity score that I can trust, with an audit trail along the way to allow me to just export this and send it off to the auditor on a monthly basis. And it's not some massive challenge to explain what's going on."
That's the difference between tools that apply generic rules and AI that actually investigates your environment to understand what's exploitable, what's not, and how severe the real risks actually are. Maze's AI agents don't just give you a score. They show their work, explain their reasoning, and give you something you can stand behind when it's time to justify your decisions.
When you combine that contextual analysis with clear visibility into how it was done, you get something security and engineering teams have been missing: confidence. Confidence that what you're prioritizing actually matters, and that when you deprioritize something, you can explain exactly why. That's what breaks the panic cycle. Instead of scrambling to justify thousands of criticals or hoping no one checks your exceptions, you have AI agents that investigate, document, and give you the receipts.
See the AI Agents in Action
This brief demo only scratched the surface of how Maze helps teams take control of their vulnerability management program. If you want to try Maze in your own environment or get a deeper dive, schedule a demo with our team at https://mazehq.com/contact.