
Threat Simulation – How real does it have to be?

By Augusto Barros | January 09, 2018

pentest and red teams

We are starting our research on “Testing Security”. So far we’ve been working with a fairly broad scope, as Anton’s post on the topic explained. One of the things we are looking at is the group of tools that has been called “breach and attack simulation tools”.

Tools that automate exploitation have been around for years; we can mention things like Metasploit, Core Impact, CANVAS and others. Those are tools used by pentesters so they don’t need to rewrite their exploits for each specific condition they find during their test. So what’s different in the new wave of tools?

The idea of using these tools is to have a consistent way to continuously test your controls, from prevention to detection (and even response). They are not focused on making exploitation easier, but on running an entire intrusion scenario, end to end, to check how the controls in the attacked environment react to each step. They go beyond exploitation and automate the many steps in the attack chain, including command and control, lateral movement, resource access and exfiltration. They also add a layer of reporting and visualization that lets users see how each attack is performed and what the tool was (or was not) able to accomplish.
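To make the "end to end" idea concrete, here is a minimal sketch of how the result of one simulated scenario could be recorded, stage by stage. It is only an illustration and not any vendor's data model: the `Outcome` and `StepResult` names, the `summarize` helper and the hard-coded outcomes are all assumptions made for this example. The stage names come from the attack-chain steps listed above.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """How the environment's controls reacted to one simulated step."""
    PREVENTED = "prevented"
    DETECTED = "detected"
    MISSED = "missed"


@dataclass
class StepResult:
    stage: str        # attack-chain stage, e.g. "lateral movement"
    outcome: Outcome  # reaction observed for that stage


def summarize(results):
    """Return a per-stage report and the first stage that went unnoticed."""
    report = {r.stage: r.outcome.value for r in results}
    first_missed = next(
        (r.stage for r in results if r.outcome is Outcome.MISSED), None
    )
    return report, first_missed


# Hypothetical outcomes, hard-coded for illustration only.
results = [
    StepResult("exploitation", Outcome.DETECTED),
    StepResult("command and control", Outcome.MISSED),
    StepResult("lateral movement", Outcome.DETECTED),
    StepResult("resource access", Outcome.MISSED),
    StepResult("exfiltration", Outcome.PREVENTED),
]

report, first_missed = summarize(results)
for stage, outcome in report.items():
    print(f"{stage}: {outcome}")
print("first unnoticed stage:", first_missed)
```

The value of a per-stage record like this is that it shows where in the chain the controls stopped reacting, rather than reducing the whole scenario to a single pass or fail.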

We are just starting to talk to some of the vendors in this space, but I noticed there’s one point they seem to argue about: how real should these tests be? Some vendors noticed that many organizations strongly resist running automated exploitation tools in their production environments, so they built their tools to only simulate the more aggressive steps of the attacks. Some of these tools even take the approach of “assuming compromise”, bypassing the exploitation phase and focusing on the later stages of the attack chain.

It is an interesting strategy to provide something acceptable to the more conservative organizations, but there are some limitations that come with that approach. In fact, I see two:

First, many prevention and detection technologies are focused on the exploitation actions. If there are no exploitation actions, they just won’t be tested. So, if the tool replicates a “lateral movement scenario” on an endpoint using an arbitrary fake credential that mimics the outcome of successfully running mimikatz, no tool or approach that looks for signs of (or prevents) that attack technique (like this) will be tested. If the organization uses deception breadcrumbs, for example, they wouldn’t be touched, so there’s no way to check whether they would actually be extracted and used during an attack. The same goes for controls that monitor for signs of an exploit, or that stop exploits from working through exploitation mitigation technologies. So, the testing scenarios would be, in a certain way, incomplete.

Second, exploitation is not necessarily something that happens only at the beginning of the attack chain. It is often used as one of the first steps to get code running in the target environment, but many exploitation actions can come later in the attack chain, for privilege elevation, lateral movement and resource access. So, assuming that exploitation has only a small role, at the beginning of the attack chain, is a very risky approach when you are deciding what needs to be tested across the entire control set.
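The two limitations above amount to a coverage question: which controls never see the action they are built to catch? The sketch below shows one way to reason about it. The control inventory, the action names and the `untested_controls` helper are hypothetical and only loosely based on the examples in this post; a real inventory would be specific to each environment.

```python
# Hypothetical inventory: each control mapped to the attack actions it can observe.
CONTROLS = {
    "exploit mitigation": {"exploitation"},
    "credential-theft detection": {"credential extraction"},
    "deception breadcrumbs": {"credential extraction"},
    "network anomaly detection": {
        "command and control",
        "lateral movement",
        "exfiltration",
    },
    "data loss prevention": {"exfiltration"},
}


def untested_controls(controls, exercised_actions):
    """Return the controls that see none of the actions a scenario performs."""
    return sorted(
        name
        for name, actions in controls.items()
        if not actions & exercised_actions
    )


# An "assume compromise" scenario skips exploitation and real credential theft.
assume_compromise = {
    "command and control",
    "lateral movement",
    "resource access",
    "exfiltration",
}

# A full-chain scenario also performs those two actions for real.
full_chain = assume_compromise | {"exploitation", "credential extraction"}

print(untested_controls(CONTROLS, assume_compromise))
print(untested_controls(CONTROLS, full_chain))
```

With this made-up inventory, the "assume compromise" scenario leaves the exploit mitigation, credential-theft detection and deception breadcrumb controls unexercised, while the full-chain scenario touches every control listed.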

Looking at these two points in isolation suggests that breach and attack simulation tools should perform real exploitation to properly test the existing controls. But apart from the concerns about disrupting production systems, there are other challenges with incorporating exploits in the automated tests. The vendor or the organization using the systems now needs the ability to incorporate new exploits as new vulnerabilities come up, and to check whether each one is safe or whether it could damage the ability of the real systems to protect themselves after the test is completed (some exploits disable security controls permanently, so using them during a test could actually reduce the security of the environment). The approach of avoiding exploitation eliminates those concerns.

If both approaches are valid, it is important for the organization to understand the limitations of the tests and what still needs to be tested manually or through alternative means (such as good old checklists). This also brings up another question we should look at during this research: how do we integrate the findings of all these testing approaches to provide a single, unified view of the state of the security controls? That’s something for another post.

The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.

Comments


  • This market is definitely defined by the ability to run it as often as you want in a production network without fear of crashing systems and causing outages.

    Here’s my perspective on the market though. I think of BAS as “second-stage” products. There’s a good reason that it’s okay to skip exploits: they’re rarely needed. As a pentester, I almost always found it easier to steal credentials or some other form of access control token. Real attackers do it and it attracts a LOT less attention. Who alerts on a valid user logging on?

    So, by assuming our systems will get compromised, either by exploit, stolen creds or some other attack vector, we focus on detecting their activities and movement in this next stage of the attack. Had we ever focused on detective controls, this next stage wouldn’t be such a safe haven for attackers. It is the entire reason dwell time exists as a metric!

    Once in, most attackers can be as noisy as they want and no one notices them. In addition to dwell time, this is why, even today, 3rd parties are still more likely to detect the attack than the victim organization.

  • Albert says:

    Adrian, you are quite right!

    I understand how important it is to be able to detect exploitation attempts, keep vulnerability management programs up to speed and so on. However, as you mention, detecting threat-like activities and anomalies is key to trigger detection (or at the very least reduce detection time).

    Your first sentence is right too. And that’s unfortunate because availability should be part of a security program. I’d like to see more companies adopting methodologies such as those provided by tools like Netflix’s Chaos Monkey.

    Interesting post!