As we continue to work on our research about security monitoring use cases, a few interesting questions around the technology implementation and optimization arise. Any threat detection system designed to generate alerts (new “analytics” products such as UEBA tools have been moving away from simple alert generation to using “badness level” indicators – that’s an interesting evolution and I’ll try to write more about that in the future) will have an effectiveness level that indicates how precise it is, in terms of false positives and false negatives. Many people believe that getting those rates to something like “lower than 1%” would be enough, but the truth is that the effectiveness of an alert generation system includes more than just those numbers.
One thing that makes this analysis more complicated than it looks is something known as “base rate fallacy”. There are many interesting examples that illustrate the concept. I’ll reproduce one of those here:
“In a city of 1 million inhabitants let there be 100 terrorists and 999,900 non-terrorists. To simplify the example, it is assumed that all people present in the city are inhabitants. Thus, the base rate probability of a randomly selected inhabitant of the city being a terrorist is 0.0001, and the base rate probability of that same inhabitant being a non-terrorist is 0.9999. In an attempt to catch the terrorists, the city installs an alarm system with a surveillance camera and automatic facial recognition software.
The software has two failure rates of 1%:
- The false negative rate: If the camera scans a terrorist, a bell will ring 99% of the time, and it will fail to ring 1% of the time.
- The false positive rate: If the camera scans a non-terrorist, a bell will not ring 99% of the time, but it will ring 1% of the time.
Suppose now that an inhabitant triggers the alarm. What is the chance that the person is a terrorist? In other words, what is P(T | B), the probability that a terrorist has been detected given the ringing of the bell? Someone making the ‘base rate fallacy’ would infer that there is a 99% chance that the detected person is a terrorist. Although the inference seems to make sense, it is actually bad reasoning, and a calculation below will show that the chances they are a terrorist are actually near 1%, not near 99%.
The fallacy arises from confusing the natures of two different failure rates. The ‘number of non-bells per 100 terrorists’ and the ‘number of non-terrorists per 100 bells’ are unrelated quantities. One does not necessarily equal the other, and they don’t even have to be almost equal. To show this, consider what happens if an identical alarm system were set up in a second city with no terrorists at all. As in the first city, the alarm sounds for 1 out of every 100 non-terrorist inhabitants detected, but unlike in the first city, the alarm never sounds for a terrorist. Therefore 100% of all occasions of the alarm sounding are for non-terrorists, but a false negative rate cannot even be calculated. The ‘number of non-terrorists per 100 bells’ in that city is 100, yet P(T | B) = 0%. There is zero chance that a terrorist has been detected given the ringing of the bell.
Imagine that the city’s entire population of one million people pass in front of the camera. About 99 of the 100 terrorists will trigger the alarm—and so will about 9,999 of the 999,900 non-terrorists. Therefore, about 10,098 people will trigger the alarm, among which about 99 will be terrorists. So, the probability that a person triggering the alarm actually is a terrorist, is only about 99 in 10,098, which is less than 1%, and very, very far below our initial guess of 99%.
The base rate fallacy is so misleading in this example because there are many more non-terrorists than terrorists.”
What makes this extremely important to our security monitoring systems is that almost all of them are analyzing data, such as log events, network connections, files, etc, that have a very low base rate probability of being related to malicious activity. Consider all your web proxy logs, for example. You can find requests there related to malware activity from your users computers, such as C&C traffic. However, the number of those events, comparing to the overall number of requests, is extremely low. For a security system to detect that malicious activity only based on those logs it must have extremely low FP and FN rates in order to be usable by a SOC.
You don’t need to do a full statistical analysis of every detection use case to make use of this concept. Here are three things you can do to avoid being caught in the base rate fallacy:
- Be conservative with the data you send to your detection system, such as your SIEM. Apply the “output driven SIEM” concept and try to ingest only the data you know is relevant for your use cases.
- At the design phase of each use case, do a ballpark estimate of the base rate probability of the condition you are trying to detect. When possible, try to combine more than one condition to leverage the power of Bayesian probability (e.g. “the chance of an individual http request being malicious is 0.0001%, but the chance of a request being malicious given it is to an IP listed in a Threat Intelligence feed is 0.1%”).
- During tuning and optimization of use cases, evaluate each use individually and according to its own parameters. As mentioned before, a 0.01% false positives rates can mean something very different for each use case depending on how much data is being analyzed. Some people try to fix a golden rate or number of acceptable false positives, what could be too strict for one use case and too lax to another.
That was all about base rates; there are other things to take into account when designing and optimizing use cases, such as the importance of the event being detected and the operational processes triggered by the alerts. But that’s something for another post (and, of course, for that research report coming soon!)
Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.