One common thread among those who actually do use big data tools and related analytic approaches for security is their analytic mindset. Not tools. Not algorithms. Not hordes of data scientists. Not methods, and not even specific approaches – but a mindset.
How do we define this mindset and turn it into something teachable to other organizations?
Let’s start here:
- Are you into consuming security products or exploring data? Do you feel that you need a security appliance for everything?
- Do you say “give me the data” or “give me out-of-the-box content, canned rules, signatures”?
- Do you just want to be shown “what you need to know” or are you willing to figure out what you need to know from the data you have?
- Would you rather learn “what your data is trying to tell you” or “what latest stuff the vendors have on sale”?
At this point, if you just want “a box”, the path of big data analytics is not for you. An analytic mindset seems to determine the success of a big data initiative for security more than anything else. The organizations that succeed in using big data for security all subscribe to this view. They all state that for the foreseeable future, there will be no “boxed security big data analytics” products (except for some narrow and specific problems solved by specific tools).
Along the same lines, somebody asked me one of these days: “Do I need to toss out my SIEM and buy ‘a big data product’?” NO, SILLY!!! You need to try using your SIEM to actually analyze the data inside it. Once you have analyzed the data inside your SIEM to its maximum potential, then you may need to look beyond that into other tools and approaches. But start from data exploration, not from tool replacement!
Therefore, the best analytics “starter pack” is the one you can build on the data and tools you have. If you have an RDBMS full of logs, flows or context data – start there. Leverage the data you have collected to make better decisions; use traditional BI tools on that database to see what emerges (some of the current ‘big data for security’ champions started like that). In fact, if all you have is Excel and a bunch of exported reports – well, start exploring there!
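As a rough illustration of “start with the database you have”, here is a minimal sketch in Python. Everything specific in it is an assumption made for the example: the `auth_logs` table, its columns, the sample rows, and the use of an in-memory SQLite database as a stand-in for whatever RDBMS actually holds your logs. The question it asks (“which sources produce repeated failed logins, and against how many accounts?”) is just one of many you could start with.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for a log database.

    The auth_logs schema and the rows below are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE auth_logs (ts TEXT, username TEXT, src_ip TEXT, outcome TEXT)"
    )
    rows = [
        ("2013-01-01T10:00:00", "alice", "10.0.0.5", "success"),
        ("2013-01-01T10:01:00", "bob", "10.0.0.9", "failure"),
        ("2013-01-01T10:01:05", "bob", "10.0.0.9", "failure"),
        ("2013-01-01T10:01:09", "root", "10.0.0.9", "failure"),
        ("2013-01-01T10:02:00", "carol", "10.0.0.7", "failure"),
    ]
    conn.executemany("INSERT INTO auth_logs VALUES (?, ?, ?, ?)", rows)
    return conn


def failed_logins_by_source(conn, min_failures=2):
    """Return (src_ip, failures, distinct_users) for noisy sources."""
    query = """
        SELECT src_ip,
               COUNT(*) AS failures,
               COUNT(DISTINCT username) AS users
        FROM auth_logs
        WHERE outcome = 'failure'
        GROUP BY src_ip
        HAVING COUNT(*) >= ?
        ORDER BY failures DESC
    """
    return conn.execute(query, (min_failures,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for src_ip, failures, users in failed_logins_by_source(conn):
        print(src_ip, failures, users)
```

The point is not this particular query but the habit: ask a question, look at the answer, then refine the question.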
The evolution then continues like this: ask questions of the data you have -> get a useful answer -> become more data-driven -> gather more data -> ask more useful questions.
Organizations then start to naturally “think data first”: new threat pops up? Let’s go into our data and see what is up, then create new analytic approaches to detect and investigate it – rather than start whining “what tool do I buy next?” No amount of Hadoop will give you big data analytics without a mindset. As I found out, this mindset and data curiosity matter most; by the way, the importance of mindset is also well established for indicator hunting and anomaly detection, such as using network forensics and ETDR tools (also see Alert-driven vs Exploration-driven Security Analysis).
So, go and build your own data analytic discipline! Build an analytic-centric and data-centric mindset – rather than buy or download any particular big data technology. Start data-driven – not tool-driven (and, yes, Hadoop is a tool too – and one that is often hard to implement, operate and utilize, especially when your purpose or goals are unclear). You cannot solve a mindset problem by buying technology; you need a mindset for leveraging data differently.
The only path is to shift the thinking, learn to be data-centric and data-driven, and then solve problems that call for bigger data. Such a culture change has to happen for big data approaches to become pervasive across the industry. And yes, this includes a willingness to explore, follow leads, and occasionally arrive at dead ends and algorithms that don’t work.
In fact, most of my questions about particular algorithms, aimed at those few (REALLY few!) organizations that do advanced analytics on large-scale security data, resulted in no single list of “top useful algorithms.” Machine learning (ML), Bayesian methods, clustering, and various data mining and text mining methods were mentioned, but none were highlighted as “must use.” What was a must? Again, it was a mindset and a willingness to dip into a toolbox of algorithms to throw at the data…
Finally, some quick tips:
- Got a SIEM? Go beyond vendor reports: run your queries directly against the backend, then extract and visualize the results.
- Got some other data relevant to security? Try open source mining tools, write scripts to analyze and profile the data, and look at the data to see what it is trying to tell you…
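For the “write scripts to analyze and profile data” tip, a first pass can be as simple as counting values in an exported report and looking at the rare ones. The sketch below assumes a CSV export with hypothetical columns (`user`, `process`, `dest_port`) and made-up rows; the helper names `load_rows`, `profile_column` and `rare_values` are mine, not from any particular tool. Rare does not mean malicious; it only tells you where to look next.

```python
import csv
import io
from collections import Counter

# Made-up sample standing in for an exported report; in practice you
# would read a file exported from your SIEM or log store instead.
SAMPLE_EXPORT = """user,process,dest_port
alice,chrome.exe,443
bob,chrome.exe,443
carol,outlook.exe,443
alice,chrome.exe,80
dave,svch0st.exe,4444
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def profile_column(rows, column):
    """Count how often each value appears in the given column."""
    return Counter(row[column] for row in rows)


def rare_values(counts, max_count=1):
    """Return values seen at most max_count times, sorted for stable output."""
    return sorted(value for value, n in counts.items() if n <= max_count)


if __name__ == "__main__":
    rows = load_rows(SAMPLE_EXPORT)
    for column in ("process", "dest_port"):
        counts = profile_column(rows, column)
        print(column, "most common:", counts.most_common(2))
        print(column, "rare:", rare_values(counts))
```

From here, the natural next steps are the ones described above: feed the counts into whatever visualization you have, and let the oddities drive your next question.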
To summarize: conceptually, security is becoming a big data analytics problem; practically, it won’t become one for you if you keep investing in prevention and buying boxes.
There you have it! Now, GO EXPLORE YOUR DATA!!!
Related posts on the topic of big data for security:
- Big Data for Security Realities – Case 3: Elastic Search or Similar
- Big Data for Security Realities – Case 2 Variety Explosion
- Big Data for Security Realities: Case 1: Too Much Volume To Store aka “Big Data Collection”
- Big Data Analytics for Security: Having a Goal + Exploring
- More On Big Data Security Analytics Readiness
- Broadening Big Data Definition Leads to Security Idiotics!
- Next Research Project: From Big Data Analytics to … Patching
- 9 Reasons Why Building A Big Data Security Analytics Tool Is Like Building a Flying Car
- “Big Analytics” for Security: A Harbinger or An Outlier?
- All posts tagged big data
Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.