
March 12, 2026 Product Security

Exploitability: The Fastest Way to Fewer False Positives


SANTIAGO CASTIÑEIRA

If you run anything non-trivial in the cloud, your scanners are probably telling you the same story as everyone else: tens or hundreds of thousands of "critical" vulnerabilities across VMs and containers.

Finding vulnerabilities is no longer the hard part. The hard part is vulnerability prioritization: figuring out which of these are actually exploitable vulnerabilities in your environment today.

At Maze, we built AI agents that investigate vulnerabilities the way your best security engineer would. The only difference? We do it at scale. Our agents research each CVE, pull live context from your cloud, and decide what is actually exploitable and worth fixing.

A lot of that hinges on two ideas:

Exploitability: can this vulnerability really be (technically) exploited in your environment, given how things are configured and deployed?

Reachability: are there realistic paths in code, on the network, or at runtime that let an attacker reach the vulnerable behavior?

We think exploitability is the right lens for solving the problem. But true exploitability has to take reachability into account, so we layer in components of reachability where they add the most value, and we'll keep adding more over time, but only where it actually helps the agents deliver the right answer.

What exploitability actually means and why we care about it

Have you ever investigated a tedious vulnerability marked "super urgent" only to realize it doesn't even apply to you because some config setting is disabled? That's a not-exploitable finding. You should never have to look at it.

A lot of the industry treats "exploitable" as "there's an exploit in the wild" or "someone tagged this as exploited in a feed." That's useful context, but it's not how vulnerability prioritization should work.

The question that actually matters is simpler: given how this specific asset runs in your environment, is exploitation technically possible at all?

Every CVE comes with conditions attached. Sometimes they're clearly described in the advisory; sometimes they only show up in a PoC or a long blog post; and sometimes we have to piece them together from multiple sources:

  • "This only works if a particular kernel subsystem is enabled"
  • "The process must run with this flag or capability"
  • "This feature must be turned on in the configuration"
  • "The vulnerable code is only reachable in a certain mode or role"
  • "The attacker needs to come from a specific network location"

Those details determine whether the vulnerability is a real problem on that system or just background noise.
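As a minimal sketch (the `Prerequisite` and `Finding` names are ours, invented for illustration, not Maze's actual data model), prerequisites like these reduce to conditions that either hold or don't on a given asset:

```python
from dataclasses import dataclass

@dataclass
class Prerequisite:
    description: str   # e.g. "mod_lua is loaded"
    present: bool      # did we observe this condition in the environment?

@dataclass
class Finding:
    cve_id: str
    prerequisites: list  # list of Prerequisite

    def exploitable(self) -> bool:
        # A single missing prerequisite is enough to rule exploitation out.
        return all(p.present for p in self.prerequisites)
```

The interesting work, of course, is in discovering the prerequisites and observing whether they're present; the final decision itself is this simple.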

Our AI agents do what your best security engineer would do if they had all the time in the world. They research the vulnerability, work out these prerequisites, and then check your environment to see whether those conditions actually exist.

They do this agentlessly, without installing anything. They pull data from cloud provider APIs, inspect containers and workloads through the control plane, and collect runtime context like which images run where and with which configurations.

From there, they put each CVE and asset pair into one of two buckets. Either at least one key prerequisite is missing, and the finding is simply not exploitable in this environment. Or the technical conditions line up, and an attacker could realistically exploit this asset as it's currently configured.

In real customer environments, most scanner findings aren't actually exploitable vulnerabilities. The CVSS score might be high, but the bug can't actually be exploited the way things are deployed today.

That's the core of what we believe: if a vulnerability can't be exploited in your environment, it shouldn't compete for attention with the ones that can. (This doesn't mean you shouldn't fix it at some point; it just means it's not the top priority.)

A couple of real examples we've seen that help make this clearer:

In 2024, CVE-2024-38541 was reported in the Linux kernel. Its CVSS vector has a network attack vector and a 9.8 critical severity. Most tools that do network reachability would highlight this vulnerability and tell you to fix it ASAP.

However, if you are running on AWS's lightweight Nitro hypervisor (as most AWS workloads do nowadays), your instance uses the Advanced Configuration and Power Interface (ACPI) instead of Device Tree for hardware enumeration, which makes this vulnerability not exploitable on your server. You don't need to build a code reachability graph of the whole server to know this, either; our agents can figure it out by collecting the right context and doing some light reasoning.
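To make that concrete: on Linux, sysfs exposes which firmware interface the host booted with, so a check along these lines (a simplified sketch, not our actual agent code) can distinguish ACPI from Device Tree without installing anything:

```python
import os

def firmware_interface(sysfs_root="/sys/firmware"):
    """Report how this Linux host enumerates hardware: ACPI or Device Tree.

    /sys/firmware/devicetree/base exists on Device Tree systems;
    /sys/firmware/acpi exists on ACPI systems.
    """
    if os.path.isdir(os.path.join(sysfs_root, "devicetree", "base")):
        return "device-tree"
    if os.path.isdir(os.path.join(sysfs_root, "acpi")):
        return "acpi"
    return "unknown"
```

On a Nitro-based instance this kind of check comes back "acpi", which rules out the Device Tree code path the vulnerability lives in.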

Another example is CVE-2022-28615 on Apache HTTP Server 2.4.53. Again, this vulnerability has a network attack vector and a 9.1 critical severity. However, our agents are great at understanding your Apache HTTP Server configuration.

There are only two situations in which this vulnerability can be exploited. The first is if you are using the Lua module; in one customer environment, the only trace our agents found was a commented-out line, #LoadModule lua_module. The second is if you are using any third-party module; in that same environment, our agents analyzed the modules being loaded and found none that was third-party. It was a plain vanilla configuration.

With that information, the AI agents correctly reasoned that this finding is a false positive, something reachability analysis alone would either miss entirely or fail to flag as a false positive.
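As an illustration of the config reasoning involved, here's a simplified Python sketch (the function names and the stock-module subset are ours, purely illustrative) that checks whether either prerequisite for CVE-2022-28615 is met in an httpd config:

```python
import re

# Modules shipped with a stock Apache httpd build; anything outside this
# set is treated as third-party here. Illustrative subset, not the full list.
STOCK_MODULES = {"mpm_event_module", "unixd_module", "authz_core_module",
                 "dir_module", "mime_module", "log_config_module"}

def loaded_modules(conf_text):
    """Return module names from active (non-commented) LoadModule directives."""
    modules = set()
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # a commented-out LoadModule loads nothing
        m = re.match(r"LoadModule\s+(\S+)", line)
        if m:
            modules.add(m.group(1))
    return modules

def cve_2022_28615_prereq_met(conf_text):
    # Exploitable only if mod_lua or some third-party module is loaded.
    mods = loaded_modules(conf_text)
    return "lua_module" in mods or bool(mods - STOCK_MODULES)
```

A config whose only Lua reference is a commented-out line, and whose active modules are all stock, fails both prerequisites.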

Why 'reachability only' falls short

In the last few years, "reachability" has become a popular answer to alert fatigue, but people mean very different things when they say it.

Sometimes it's static code reachability: you scan the code, build call graphs, and see whether any path leads to a vulnerable function. On paper, that looks precise. In reality, production behavior depends on configs, feature flags and deployment modes. Code can be "reachable" in the repository while the paths are dead in production or only used in test setups.

Sometimes it's runtime reachability: you instrument the system, trace execution, and record which functions are actually called. That's closer to reality, but it usually requires sensors, eBPF or similar plumbing in production. It's often intrusive, adds moving parts, carries performance and stability risk, and is hard to roll out cleanly across all environments.

And sometimes "reachability" just means network reachability: the service is on a reachable port, the security groups allow traffic, the load balancer points at it. That's a useful signal, but very surface-level. A service can be reachable on the network and still not exploitable because the kernel is hardened, a process flag changes behavior, a feature is disabled, or strong authentication sits in front of the vulnerable path.

All three can help with vulnerability prioritization, but they share the same blind spot. They answer "is there a path?" and often skip the question that actually decides risk: can this vulnerability be exploited in your environment, with your architecture, configuration, and controls?

You can have a clean reachability story in code and on the network and still have a vulnerability that's impossible to exploit because one prerequisite is missing. And if you go deep on code and runtime reachability, you often pay for it with sensors and ongoing operational work, while still not fully answering that exploitability question.

That's why we treat reachability as supporting evidence for our agents, not the primary signal (and why we don't make you install any sensors).

The reachability spectrum (and what Maze does today)

Exploitability answers, "Given how things are configured, could someone exploit this?" Reachability answers, "Can anything realistically get to the vulnerable behavior?"

We treat reachability as a spectrum. At one end, you have signals that come straight from APIs and control planes. At the other end, you have techniques that rely on deep hooks into your systems, like sensors with broad visibility and permissions.

Maze focuses on getting a robust and reliable signal while staying as unobtrusive as possible. We only ask for more access when it delivers real additional value, and when there's no other way to get there.

Static code reachability

We look at whether the dependency or library is actually used by the application.

In many modern stacks, especially with interpreted languages and large dependency trees, simple signals already help a lot:

  • Is the vulnerable package version the one that ends up resolved at runtime?
  • Is it imported in the application code, or only in tests or tooling?
  • Is the vulnerable feature a dormant part of a larger library that no one calls?

When we can get these signals without taking over your repos, our agents use them as another input. If a vulnerable component isn't loaded or referenced in any meaningful way, it shouldn't be treated like a hot path on an internet-facing service.
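For Python codebases, for instance, a first pass at that import signal can be computed from the AST alone. This is a simplified sketch (the function name and the `vulnpkg` package are hypothetical), not a full dependency resolver:

```python
import ast
from pathlib import Path

def files_importing(package, source_root):
    """Find .py files under source_root that import the given top-level package."""
    hits = []
    for path in Path(source_root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # skip files that don't parse
        for node in ast.walk(tree):
            names = []
            if isinstance(node, ast.Import):
                names = [a.name for a in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                names = [node.module]
            if any(n == package or n.startswith(package + ".") for n in names):
                hits.append(path)
                break
    return hits
```

A package that only shows up under tests/ or tooling/ is a much weaker signal than one imported from the application's own modules.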

Runtime reachability without eBPF

The next question: is the vulnerable component actually present in the running system?

Even without sensors on your hosts, you can figure out a lot about runtime state:

  • Which container images are deployed and running
  • Which versions and packages are present in those images
  • Whether specific services or processes are up and running
  • What the process configuration is, and what libraries are linked

This lets us distinguish between:

  • Vulnerabilities in code that's actually running
  • Vulnerabilities that only exist in unused images, old snapshots, unlinked libraries or tooling containers

In practice, we see plenty of findings where the vulnerable package is technically "in the environment" but never part of any running workload. That's not the same risk.
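Assuming pod data has already been fetched from the Kubernetes API (a list of pod objects shaped like the items in `kubectl get pods -o json`), a sketch of that "vulnerable and actually running" intersection might look like:

```python
def running_images(pod_specs):
    """Collect container images that are actually running, given pod objects
    shaped like the Kubernetes API's Pod resources."""
    images = set()
    for pod in pod_specs:
        if pod.get("status", {}).get("phase") != "Running":
            continue  # skip completed, pending, or failed pods
        for container in pod.get("spec", {}).get("containers", []):
            images.add(container["image"])
    return images

def vulnerable_and_running(vulnerable_images, pod_specs):
    # Intersect scanner findings (keyed by image) with what is actually live.
    return set(vulnerable_images) & running_images(pod_specs)
```

Everything in `vulnerable_images` that falls outside the intersection is present in the environment but not part of any running workload, a very different risk.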

Network reachability

The question is simple: can an attacker even talk to the vulnerable thing?

We already get a lot of this from cloud and platform APIs:

  • Whether an asset is exposed to the public internet or only reachable inside the network
  • Security groups, firewall rules, WAF, and load balancer configuration
  • Which services are allowed to talk to which others

Because this data already lives in control planes, we can gather and interpret it without installing sensors.
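For example, given ingress rules already fetched via the EC2 DescribeSecurityGroups API, a minimal internet-exposure check over the `IpPermissions` shape (a sketch, not production logic; it ignores NACLs, WAFs, and routing) could be:

```python
def open_to_internet(ingress_rules, port):
    """True if any ingress rule allows 0.0.0.0/0 (or ::/0) to reach `port`.

    `ingress_rules` is shaped like the IpPermissions list returned by the
    EC2 DescribeSecurityGroups API. Rules with protocol "-1" omit the port
    fields, which the defaults below treat as all ports.
    """
    for rule in ingress_rules:
        from_port = rule.get("FromPort", 0)
        to_port = rule.get("ToPort", 65535)
        if not (from_port <= port <= to_port):
            continue
        for r in rule.get("IpRanges", []) + rule.get("Ipv6Ranges", []):
            cidr = r.get("CidrIp") or r.get("CidrIpv6")
            if cidr in ("0.0.0.0/0", "::/0"):
                return True
    return False
```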

How it works in practice: Exploitability with reachability

Here's what this looks like when you put it all together.

A reachability-only mindset asks:

"Does any code path lead to this vulnerable function?"

"Is the vulnerable method called in production?"

"Can the vulnerable service be reached from the internet?"

If the answer is "yes" to any of these, the finding is often treated as important, without digging into whether the exploit could actually work in your environment as deployed.

The Maze approach starts from a different question:

"Could an attacker actually exploit this vulnerability, on this asset, in this environment?"

Our AI agents investigate each CVE and asset pair with that in mind. Exploitability is the first gate: if the technical prerequisites for exploitation aren't present in your environment, the finding is not exploitable and we move on.

When the prerequisites are there, reachability signals act as multipliers:

  • Network exposure tells us who can realistically reach the asset
  • Runtime and image-level data shows what's actually live
  • Import and usage hints reveal whether a dependency is really in play
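Put as a toy Python function (the keys and weights are purely illustrative, not Maze's actual scoring), the gate-then-multipliers flow looks like:

```python
def prioritize(finding):
    """Exploitability first, reachability signals as multipliers.

    `finding` is a dict with hypothetical keys; the weights are invented
    for illustration.
    """
    if not finding["prerequisites_met"]:
        return 0.0  # not exploitable: drops out before any reachability math
    score = finding.get("base_severity", 5.0)
    if finding.get("internet_exposed"):
        score *= 2.0   # anyone can reach it
    if finding.get("running_workload"):
        score *= 1.5   # the vulnerable code is actually live
    if finding.get("dependency_in_use"):
        score *= 1.2   # the vulnerable dependency is really in play
    return score
```

The key property: no amount of reachability evidence rescues a finding whose technical prerequisites aren't met.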

And yes (did we mention this already?), no sensors to deploy. No changes to your production images or VMs. No need to hand over full repo access just to cut down false positives.

Put together, this lets us safely throw away the big slice of your backlog that's both technically impossible to exploit in your environment and not meaningfully reachable on the network or at runtime.

What's left is a much smaller set of truly exploitable vulnerabilities that need your team's attention.

From exploitability to risk

Exploitability already gets you much closer to reality than the raw CVE lists that most vulnerability management programs rely on. But what you really care about is risk: the impact and likelihood of a specific vulnerability being exploited on a specific asset, in your environment.

At Maze, we take it back to the basics with the CIA model, based on what actually runs in your cloud, not in theory. Here's a taste of what that looks like in practice:

Confidentiality: which data can the vulnerable workload actually see, and how is it classified? An ECS task handling customer PII is not the same as a sidecar scraping metrics.

Integrity: what could an attacker change if they walk through this vulnerability? Can this EC2 instance write to a critical S3 bucket? The question isn't just "can they run code?" but "on what data and systems would that code have write access?"

Availability: how resilient is the service, really? A replicated EKS deployment with health checks and multiple pods is not the same as a one-off batch job. When we say "this vulnerability could take the service down," that assessment already accounts for things like redundancy, autoscaling, and failover.

On the likelihood side, we're just as context-heavy.

Exploit maturity matters: is there a stable, public exploit, is this being used in the wild, or are we still at the "theoretical PoC on GitHub" stage?

The attack path matters: is this sitting on an internet-facing entry point or three hops deep behind internal services and strong auth?

Compensating controls matter too. They can turn "exploitable in principle" into "only exploitable if several other things go wrong at the same time."

Our agents pull all of this together for each CVE and asset. They don't just stop at "exploitable and reachable." They reason about the impact if someone used this path, and how likely it is that an attacker could actually get there.
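One way to picture how impact and likelihood combine (the levels and weights here are invented for illustration; the agents reason over far richer context than a lookup table) is a toy scoring function:

```python
# Illustrative levels only.
IMPACT = {"low": 1, "medium": 2, "high": 3}
MATURITY = {"theoretical": 0.2, "poc": 0.5, "weaponized": 1.0}

def risk(confidentiality, integrity, availability,
         maturity, hops, compensating_controls):
    """Toy risk score: worst-case CIA impact, discounted by how hard
    the attack path is to traverse.

    hops: network hops from an attacker-reachable entry point.
    compensating_controls: count of controls sitting in front of the path.
    """
    impact = max(IMPACT[confidentiality], IMPACT[integrity], IMPACT[availability])
    likelihood = MATURITY[maturity] / (1 + hops + compensating_controls)
    return impact * likelihood
```

Even in this crude form, the shape is right: a weaponized exploit on an internet-facing, PII-handling workload dominates a theoretical PoC buried three hops behind strong auth.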

That's where we're heading: not more graphs, but clearer answers to three simple questions for each finding in your backlog:

  • What could break or be exposed on this specific asset?
  • How likely is it that someone could pull that off here?
  • Given the rest of your environment, is this really worth fixing before everything else?

Exploitability is the foundation of real vulnerability prioritization. Impact and likelihood, in context, turn it into real risk.

FAQ

What's the difference between "exploitable" and "exploit available"?

These get conflated constantly, but they're very different things as we've defined them here. "Exploit available" means someone has published a working exploit or it's been observed in the wild. That's what feeds like KEV and scores like EPSS track. "Exploitable" means this vulnerability can actually be exploited on this asset, in your environment, given your specific configuration and controls. A vulnerability can have a public exploit and still not be exploitable in your environment because a prerequisite is missing. And a vulnerability with no known public exploit can still be exploitable if the conditions line up.

How is exploitability different from CVSS?

CVSS gives you a generic severity score based on the characteristics of the vulnerability itself. It doesn't know anything about your environment. A 9.8 CVSS vulnerability might be completely not exploitable in your infrastructure because of how things are configured, what's enabled, or what controls are in place. Exploitability takes that environment context into account and asks whether exploitation is actually possible here.

Does exploitability replace EPSS or KEV?

No. They answer different questions. EPSS predicts the likelihood that a vulnerability will be exploited somewhere in the wild over the next 30 days. KEV tells you which vulnerabilities have confirmed active exploitation. Exploitability asks whether a specific vulnerability can be exploited on a specific asset in your environment. They're complementary: EPSS and KEV give you useful threat intelligence, exploitability gives you ground truth about your actual risk.

Do I still need to fix vulnerabilities that aren't exploitable?

Probably, eventually. In fact, we suggest you do, to avoid building up tech debt. Not exploitable doesn't mean "ignore forever." It means this isn't your top priority right now. The conditions that make something not exploitable today could change: a config update, a new deployment, a removed control. But when you're staring at thousands of findings, knowing which are actually exploitable right now lets you focus your team's time on what matters. And because things can change, Maze continually re-checks whether the conditions have shifted in a way that makes a vulnerability exploitable.

What's the difference between exploitability and reachability?

Reachability asks "is there a path to the vulnerable code, function, or service?" Exploitability asks "given everything about this environment, could an attacker actually exploit this vulnerability?" You can have a fully reachable service with a vulnerability that's impossible to exploit because, for example, a kernel setting is disabled or a feature isn't turned on. Reachability is one useful input into an exploitability assessment, but it doesn't answer the full question on its own.
