Gartner Blog Network

Better Data or Better Algorithms?

by Anton Chuvakin  |  October 4, 2016  |  1 Comment

An eternal question of this big data age is: what to choose, BETTER DATA or BETTER ALGORITHMS?

So far, most [but not all!] of the deception users we interacted with seem to be using their deception tools as “a better IDS.” Hence our discussion of the business case for deception (here and here) was centered on detecting threats.

Naturally, there are many detection tool categories (SIEM, UEBA / UBA, EDR, NTA, and plenty of other yet-unnamed ones) that promise exactly that – better threat detection and/or detection of “better” threats!

During one of the recent “deception calls” it dawned on us what separates “deception as detection” from those other tools:

  • DECEPTION TOOLS rely on “better source data”, such as the attacker’s authentication logs, the attacker’s traffic, files that the attacker touched, etc.
  • MOST OTHER TOOLS rely on “better data analysis” of data such as all logins, all traffic, all files touched, etc.
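To make the contrast concrete, here is a minimal sketch (all names and thresholds are hypothetical, not from any particular product): the deception approach reduces detection to a membership test on inherently suspicious data, while the analytics approach must score every event against a learned baseline.

```python
from dataclasses import dataclass

@dataclass
class LoginEvent:
    user: str
    host: str
    count_last_hour: int

# Hypothetical decoy inventory -- in a real deployment this would come
# from the deception tool's configuration.
DECOY_HOSTS = {"decoy-fileserver-01", "decoy-db-02"}

def deception_detect(event: LoginEvent) -> bool:
    """'Better source data': any touch of a decoy is suspicious by
    construction, so detection is a trivial membership test."""
    return event.host in DECOY_HOSTS

def analytics_detect(event: LoginEvent, baseline_mean: float,
                     baseline_std: float, z_threshold: float = 3.0) -> bool:
    """'Better data analysis': score ALL events against a learned
    baseline (here, a toy z-score on hourly login counts)."""
    if baseline_std == 0:
        return False
    z = (event.count_last_hour - baseline_mean) / baseline_std
    return z > z_threshold

events = [
    LoginEvent("alice", "fileserver-07", 4),     # ordinary activity
    LoginEvent("mallory", "decoy-db-02", 1),     # touched a decoy
    LoginEvent("mallory", "fileserver-07", 90),  # anomalous volume
]

for e in events:
    print(e.user, e.host,
          "deception:", deception_detect(e),
          "analytics:", analytics_detect(e, baseline_mean=5, baseline_std=3))
```

Note that the decoy check needs no baseline at all, which is one reason for the “low friction” and low false-positive reports mentioned below; the trade-off is that it only sees attackers who touch a decoy.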

So, can we say which one is better? Until we can have a cage match of a deception vendor with, say, a UEBA vendor, we probably won’t know for sure. The largest enterprises (the proverbial “security 1%-ers”) will “buy one of each” (as usual) and the smaller ones will wait for a product that combines both feature sets with a firewall 🙂

For example, one of the interviewees outlined an elegant scenario where a deception tool and a UBA / UEBA tool are used together. We hesitate to say that this is the future for everybody, but it was an interesting example of the “strength-based” approach to tools…

Still, “detection by better source data” has unique appeal to people who are just not willing to “explore all data.” Our contacts report “low friction”, better signal/noise, low/no “false positives” and low operational burden for deception tools [used for detection].

Hence, unlike the “all data + smart algorithms” approach that may be philosophically superior (since looking at ALL data will theoretically allow you to detect all threats, but … can we really have ALL data?), some organizations are choosing “decoy-sourced data” and seem happy with their decisions…


Category: analytics  data-and-analytics-strategies  deception  security  

Anton Chuvakin
Research VP and Distinguished Analyst
8 years with Gartner
19 years IT industry

Anton Chuvakin is a Research VP and Distinguished Analyst at Gartner's GTP Security and Risk Management group. Before Mr. Chuvakin joined Gartner, his job responsibilities included security product management, evangelist…

Thoughts on Better Data or Better Algorithms?

  1. Andre Gironda says:

    You’re talking about the difference between threat intelligence and friendly intelligence; between capabilities in counter-deception and counterintelligence versus run-of-the-mill data-science techniques.

    Data science is important because it reduces the size of the haystack while still allowing for a ridiculously large, fast-growing haystack. Yet most of the UEBA, UBA, NTA, EDR, and SIEM platforms utilize either non-complex relationships (N.B., I’ve yet to see a cyber intelligence platform outside of the free, open-source BloodHound platform from the Veris Group Adaptive Threat Division utilize a graph db) or, worse, a subset of machine learning and statistical inference techniques such as confidence intervals.

    When comparing controls, you will likely want to first model cyber risk (i.e., OpenGroup FAIR) correctly. Deception systems are controls in the avoidance or deterrence categories, not typical infosec vulnerability or response controls. Thus, they have an entirely different effect on the variables and boundaries that the model rests on when compared to UBA, UEBA, NTA, EDR, and SIEM — which are mostly focused on those also-critical responsive controls. Thus, an org would want to utilize EP curves and risk-tolerance curves (à la How to Measure Anything in Cybersecurity Risk) with comparatives inside each control boundary, i.e., for deception systems these would be compared against other platforms in the avoidance and deterrence control sets (e.g., anti-fraud systems, anti- cred stuffing, responsive IPS, reduction, segmentation, segregation, anti-tampering, et al.) while pitting the responsive controls against each other to determine how to prioritize spend, resource allocation, project-delivery timelines, etc.
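    The exceedance-probability (EP) curve comparison the commenter describes can be sketched with a toy Monte Carlo simulation in the FAIR spirit: annual loss = Poisson event frequency × severity per event, compared with and without a control. All the frequencies and severities below are made-up illustrative numbers, not calibrated estimates.

```python
import math
import random

def poisson_sample(lam: float, rng: random.Random) -> int:
    """Knuth's algorithm for one Poisson draw (assumes lam > 0)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= rng.random()
    return k - 1

def annual_losses(freq_per_year: float, sev_lo: float, sev_hi: float,
                  years: int, rng: random.Random) -> list[float]:
    """Simulate total loss per year: Poisson event count x uniform
    severity. (FAIR practitioners would use calibrated distributions;
    these are toy stand-ins.)"""
    return [sum(rng.uniform(sev_lo, sev_hi)
                for _ in range(poisson_sample(freq_per_year, rng)))
            for _ in range(years)]

def exceedance_prob(losses: list[float], threshold: float) -> float:
    """Point on the EP curve: P(annual loss >= threshold)."""
    return sum(l >= threshold for l in losses) / len(losses)

rng = random.Random(42)
# Hypothetical effect of a control (e.g., a deception deployment) that
# cuts the frequency of damaging intrusions from 4/yr to 1.5/yr.
baseline = annual_losses(4.0, 50_000, 500_000, years=10_000, rng=rng)
with_ctrl = annual_losses(1.5, 50_000, 500_000, years=10_000, rng=rng)

for t in (250_000, 1_000_000):
    print(f"P(loss >= ${t:,}): baseline={exceedance_prob(baseline, t):.2f}  "
          f"with control={exceedance_prob(with_ctrl, t):.2f}")
```

    Plotting `exceedance_prob` across many thresholds yields the EP curve; overlaying the organization's risk-tolerance curve is what lets you compare controls within the same category, as the comment suggests.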


Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.