Gartner Blog Network

Big Data Analytics for Security: Having a Goal + Exploring

by Anton Chuvakin  |  October 3, 2013

“There are two, seemingly conflicting, views on how to formulate a hypothesis for big data analysis: via data exploration or by having a goal. Exploration within the frame of having a goal is an expected work pattern with big data.” (source: “No Data Scientist Is an Island in the Ocean of Big Data”, another excellent GTP piece on big data)

This thinking, as applied to information security, is a drastic departure from the “which reports do I run?” and “how do I tweak my correlation rules?” mindset that dominates the land of SIEM, the most analytics-heavy security product category today.


So, ask yourself two questions:

  1. Are you ready to explore your data?
  2. Do you have clear goals for delving into the data?

If your answers are “no, I want to be told what matters for me” and “no, I just want it all in Hadoop”, then sadly, big data approaches are likely not for you. You can still try, but prepare to be sorely disappointed after spending a lot of money and time.

Let’s tackle these one by one:

  • We touched on the subject of security data exploration when talking about network forensics tools (NFT) and endpoint threat detection and response (ETDR) tools (see “Use Cases for Network Forensics Tools”, “Endpoint Visibility Tool Use Cases” and “Alert-driven vs Exploration-driven Security Analysis”). Indeed, organizations are starting to explore their log stores, packet stores and endpoint trace stores in order to discover malware and other indicators of attacker activity. Exploring unstructured big data piles, however, is much harder and may involve text analytics, hardcore statistical methods and other esoteric disciplines far removed from traditional security skill sets (it is not all about keyword search, you know).
  • Regarding the goals, the same research reminds us that “analysis designing consists of formulating viable business [security, in this case] and analytical hypotheses” and then iterating using the available data. Are you collecting data just in case? Because “Hadoop is cheap”? Start by setting clear goals, phrasing them as hypotheses and then testing them against the data. Can I find out who touches my sensitive applications maliciously? Is there any way to mine my web logs to find early recon? Do I have traces of phishing “backscatter” and how do I find them?

And of course, “having a goal and opportunistic exploration are not mutually exclusive. Exploration eventually leads to formulating concrete hypotheses through multiple iterations of honing a goal.” (same document)
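
To make the “goal first, then explore” loop a bit more concrete, here is a minimal Python sketch for just one of the questions above: mining web logs for early recon. It is only an illustration under stated assumptions, not a method taken from the cited research. The hypothesis it tests (a client that requests many distinct non-existent paths may be probing the site), the Common Log Format input, the function name find_recon_candidates and the default threshold of 20 are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumed input: access log lines in Common Log Format, for example
# 203.0.113.7 - - [03/Oct/2013:10:00:00 +0000] "GET /admin HTTP/1.1" 404 512
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def find_recon_candidates(lines, min_distinct_404_paths=20):
    """Return {ip: count} for clients that hit many distinct missing paths.

    Hypothetical helper: the threshold is a starting guess that is meant
    to be tuned over several passes through the data.
    """
    missing_paths = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if match.group("status") == "404":
            missing_paths[match.group("ip")].add(match.group("path"))
    return {
        ip: len(paths)
        for ip, paths in missing_paths.items()
        if len(paths) >= min_distinct_404_paths
    }


# Made-up sample data: one client probing 25 paths, one ordinary visitor.
sample = [
    f'203.0.113.7 - - [03/Oct/2013:10:00:{i:02d} +0000] '
    f'"GET /probe{i} HTTP/1.1" 404 0'
    for i in range(25)
] + [
    '198.51.100.4 - - [03/Oct/2013:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '198.51.100.4 - - [03/Oct/2013:10:01:05 +0000] "GET /favicon.ico HTTP/1.1" 404 0',
]

print(find_recon_candidates(sample))
```

The iteration is the point here: run it, look at who gets flagged, then adjust the threshold or refine the hypothesis (say, by also looking at time windows or user agents) and run it again.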

Finally, “there is a common illusion that hiring a data scientist solves all big data needs.” Agreed, that is an illusion! However, keep the opposite in mind as well: NOT hiring a data scientist probably solves NONE of your big data needs (after all, you miss 100% of the shots you don’t take).

Anton Chuvakin
Research VP and Distinguished Analyst
8 years with Gartner
19 years IT industry

Anton Chuvakin is a Research VP and Distinguished Analyst at Gartner's GTP Security and Risk Management group. Before Mr. Chuvakin joined Gartner, his job responsibilities included security product management, evangelist…
