Big Data for Security Realities – Case 4: Big But Narrowly Used Data

By Anton Chuvakin | December 11, 2013

Part of my research this quarter focuses on assessing the reality of using big data approaches for security and providing practical, GTP-style recommendations for enterprises. So, what else is real in this technology segment heavily overrun by waves of bull?

One more case that occasionally shows up is “Big But Narrowly Used Data.”

The scenario may go like this:

  1. The organization comes across a need to analyze a particular large data set (such as 50-500GB of web proxy logs) for a particular goal (say, matching accessed URLs against a blacklist) [back in the old, “pre-big data” days, I met somebody with a trillion log messages stashed in an old sock somewhere]
  2. An initial attempt to load the data set into a SIEM or a log management tool fails since there is no existing capacity for such a volume and no new capacity is approved.
  3. Someone at the organization tries “brute force”: they write a Perl script to do this on the flat files. Days of impatient waiting ensue 🙂
  4. Suddenly somebody thinks: Hadoop!
  5. The cluster is put together, the data is loaded, and the analysis queries are written.
  6. And this is the step where the magic may or may not happen: the organization may decide to use the same approach for other data-intensive security problems and therefore starts on the road of using big data for security…
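
To make the goal from step 1 concrete, here is a minimal sketch of the “brute force” pass from step 3, written in Python rather than Perl. It is only an illustration under stated assumptions: the log layout (whitespace-separated fields with an absolute URL in the third field), the blacklist format (one hostname per line), and the function names `load_blacklist` and `match_proxy_log` are all hypothetical and do not come from any real deployment.

```python
from urllib.parse import urlsplit


def load_blacklist(lines):
    """Build a set of lower-cased hostnames, skipping blank lines and # comments."""
    hosts = set()
    for raw in lines:
        entry = raw.strip().lower()
        if entry and not entry.startswith("#"):
            hosts.add(entry)
    return hosts


def match_proxy_log(log_lines, blacklist, url_field=2):
    """Yield (line_number, url) for every log line whose URL host is blacklisted.

    Assumption: fields are whitespace-separated and the URL is absolute
    (has a scheme such as http://), sitting at index `url_field`.
    """
    for lineno, line in enumerate(log_lines, start=1):
        fields = line.split()
        if len(fields) <= url_field:
            continue  # too short to contain a URL; skip it
        url = fields[url_field]
        host = urlsplit(url).hostname  # lower-cased by urlsplit, port stripped
        if host and host in blacklist:
            yield lineno, url
```

Because both functions accept any iterable of lines, an open file handle can be passed in directly, so the log is streamed rather than loaded into memory. A single pass like this is the part that takes days on hundreds of gigabytes; the same per-line check is what would be spread across the cluster in steps 4 and 5.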

There you have it!
