Gartner Blog Network


Broadening Big Data Definition Leads to Security Idiotics!

by Anton Chuvakin  |  September 18, 2013  |  10 Comments

One of the mysteries I am planning to explore in my research on using big data approaches for security is this: why do so many surveys and media reports (no links here!) seem to show that 20%-40% of organizations use big data approaches for security today, when in reality this is not the case – by a long shot?

Let’s see. Here is the canonical definition of “big data”:

“Big data” is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. (source)

Notice something interesting: the 3Vs are described as volume, velocity AND variety! If you have a small pile of varied data, say, 10 MB of it, you are definitely not in the big data realm. A huge RDBMS of structured (not varied) records is not big data either. The idea is AND, not OR!
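To make the AND-versus-OR point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the definition above gives no numeric cut-offs, so the three thresholds, the `DataSet` fields and the function names are all hypothetical. It only shows how the two readings of the 3Vs diverge on the two examples above.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, invented for this sketch only.
# The quoted definition does not specify any numbers.
MIN_VOLUME_BYTES = 10 * 1024**4   # 10 TiB
MIN_EVENTS_PER_SECOND = 10_000
MIN_DISTINCT_FORMATS = 3


@dataclass
class DataSet:
    name: str
    volume_bytes: int
    events_per_second: float
    distinct_formats: int


def three_vs(ds: DataSet) -> tuple:
    """Return (volume, velocity, variety) as three booleans."""
    return (
        ds.volume_bytes >= MIN_VOLUME_BYTES,
        ds.events_per_second >= MIN_EVENTS_PER_SECOND,
        ds.distinct_formats >= MIN_DISTINCT_FORMATS,
    )


def is_big_data(ds: DataSet) -> bool:
    """Strict reading: volume AND velocity AND variety."""
    return all(three_vs(ds))


def is_big_data_loose(ds: DataSet) -> bool:
    """Broadened reading: any single V is enough (the OR reading)."""
    return any(three_vs(ds))


# The two examples from the paragraph above, plus one that meets all three Vs.
small_varied = DataSet("small pile of varied data", 10 * 1024**2, 1, 12)
huge_rdbms = DataSet("huge structured RDBMS", 50 * 1024**4, 50_000, 1)
all_three = DataSet("large, fast, varied feed", 50 * 1024**4, 50_000, 12)

for ds in (small_varied, huge_rdbms, all_three):
    print(ds.name, "| AND:", is_big_data(ds), "| OR:", is_big_data_loose(ds))
```

Under the OR reading both the small varied pile and the structured RDBMS get labeled big data; under the AND reading only the last data set does.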

On the other hand, see how some other people define big data and “big data tools”:

[Image: big-data-idiotics-excel (a chart of “big data tools” that includes Excel)]

Sorry, guys, but this is SECURITY IDIOTICS, not security analytics. Real use of big data for security is much rarer – and much more precious….

Category: analytics  big-data  security  

Anton Chuvakin
Research VP and Distinguished Analyst
5+ years with Gartner
17 years IT industry

Anton Chuvakin is a Research VP and Distinguished Analyst at Gartner's GTP Security and Risk Management group. Before Mr. Chuvakin joined Gartner, his job responsibilities included security product management, evangelist…


Thoughts on Broadening Big Data Definition Leads to Security Idiotics!


  1. […] Broadening Big Data Definition Leads to Security Idiotics! […]

  2. Rob Bird says:

    I could not possibly agree more! Part of the problem is how many completely unrelated technologies have abused Big Data terminology in their marketing. Excel is apparently Big Data. SIEMs are apparently Big Data. Log Management is apparently Big Data. Anything involving a backend cloud feed is apparently Big Data.

    Velocity is the least understood, or applied, aspect of those V’s in security. It implies *streaming* data, and a necessity for *streaming* analytics, not evolution of content, or some other marketing nonsense.

    I’d go so far as to say that if it’s not using machine learning pervasively, it’s not Big Data analytics. That’s over-restrictive, for sure, but it cuts out a bunch of Excel / R / Queries & Searches / Manual Manipulation that just doesn’t belong in the category!

  3. I couldn’t make out all the fine print, but that chart may as well add Google Search if they are going to include Excel.

  4. Matthew Gardiner says:

    Like most things, what is “Big Data” is in the eyes of the beholder. I always go back to what one is trying to accomplish, not the tools used. In the context of Anton’s blog, what organizations are presumably trying to accomplish in the category is the distillation of enterprise-scale flows of data (logs, events, network traffic, threat intelligence, vulnerability information, identity information, etc.) from which security-relevant anomalies can be detected, prioritized, investigated, understood, and remediated before the bad guys are successful. Our argument at RSA is that this isn’t possible with traditional SIEM tools, let alone Excel.

  5. @matt

    >what is “Big Data” is in the eyes of the beholder

    Dude, sadly, you just proclaimed yourself to be part of the problem.

    Try these for size:
    – “what is a car is in the eyes of the beholder”
    – “what is black and white is in the eyes of the beholder”
    – “what is a program is in the eyes of the beholder”

    IT terms should have specific definitions – and so does big data. I can call your golf cart a car, but it won’t magically become that since it is “in the eye of the beholder”…

  6. @gene

    Based on their definition, Google is definitely big data :-)

  7. Matt says:

    “IT terms should have specific definitions – and so does big data”

    Anton – that should be ‘and so should big data’. In the meantime the vendors and marketeers are free to run their grubby little hands all over it as per Rob’s comment at the top. BD went mainstream with TV documentaries (BBC’s Horizon) and adverts on TV and the marketeers are having a field day. Substitute the word Big with Medium or Modest and you can see why they won’t let go.

    Couldn’t agree more though, especially ‘Excel’….

  8. @matt

    Yes, of course you are right: and so SHOULD big data

    Hopefully we’ll arrive at a precise definition soon – and leave the media to argue about their fuzzy definition.

  9. […] “Data sampling is dead, and the success of a big data analytics initiative cannot be shown with a small-scale pilot.” (and, no, Excel is not a big data tool) […]

  10. Melodi Kern says:

    I do not even know how I ended up here, but I thought this post was great. I don’t know who you are but certainly you are going to be a famous blogger if you aren’t already 😉 Cheers!



Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.