by Anton Chuvakin | August 29, 2018 | Comments Off on More on Security Data Lakes – And FAIL!
Naturally, all of you have read my famous “Why Your Security Data Lake Project Will FAIL!” [note: Anton’s ego wrote this line :-)]
Today I read a great Gartner note on data lake failures in general (“How to Avoid Data Lake Failures” [Gartner access required]). Thus, I wanted to share a few bits that, in my experience, are VERY relevant to security data lake efforts I’ve seen in recent years. So:
- “Proponents of data lakes often exaggerate their benefits by promoting them as enterprisewide solutions to all data and analytics problems.” – indeed, we’ve seen the exact same thing with security data lakes! Of course, then the reality hits: you build a huge pile of dirty data poo – and nothing else …
- “Data lakes are rarely started with a definite goal in mind, but rather with nebulous aspirations […]” – same is often seen with security data lakes.
- “Avoid confusing a data lake implementation with a data and analytics strategy. A data lake is just infrastructure […]” – this is pretty much what I said in the post.
- “The popular view is that a data lake will be the one destination for all the data in their enterprise and the optimal platform for all their analytics.” – the paper later explains that, generally speaking, this is very false, becauses it rests on 3 false assumptions. This is false even if scoped down to all security relevant data.
- The paper later describes several exciting FAIL scenarios, all of which I’ve seen with security data lakes. For example, “single version of the truth” as a failure scenario often means a single version of raw unusable data that nobody wants and nobody knows how to use.
- Another “failway” is “Data Lake Is My Data and Analytics Strategy” with its juicy “ego-driven perspective on data lakes: they see them as means by which to be viewed as thought leaders […]” that result in all the useless data, none of the insight situation.
- Yet another FAIL comes from “Infinite Data Lake” confusion. Imagine lots of useless data … now imagine a lot of useless data a year later. Two years. Five years. What is worse than unusable data? OLD unusable data that has even less context. NOW: useless. TWO YEARS LATER: that much more useless at huge hardware cost!
- Finally, they close with: “The goal of gathering all data in one location was never truly achieved in the data warehousing world. It’s unlikely to be achieved in the data lake world, either […]”
- Why Your Security Data Lake Project Will FAIL!
- Sad Hilarity of Predictive Analytics in Security?
- On Unknown Operational Effectiveness of Security Analytics Tooling
- Now That We Have All That Data What Do We Do, Revisited
- Killed by AI Much? A Rise of Non-deterministic Security!
- Security Analytics Lessons Learned — and Ignored!
- Why No Security Analytics Market? <- important read for VCs and investors! Works in 2018 too, mostly.
- More On Big Data Security Analytics Readiness
- 9 Reasons Why Building A Big Data Security Analytics Tool Is Like Building a Flying Car
- Big Analytics” for Security: A Harbinger or An Outlier?
Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.