Merv Adrian

A member of the Gartner Blog Network

Merv Adrian
Research VP
4 years with Gartner
37 years in IT industry

Merv Adrian is an analyst following database and adjacent technologies as extreme data transforms assumptions about what to persist as well as when, where and how. He also watches the way the software/hardware boundary…

Data Security for Hadoop – Add-on Choices Proliferating

by Merv Adrian  |  February 23, 2014  |  5 Comments

In my post about the BYOH market last October, I noted that increasing numbers of existing players are connecting their offerings to Apache Hadoop, even as upstarts enter their markets with a singular focus. And last month, I pointed out that Nick Heudecker and I detected a surprising lack of concern about security in a recent Hadoop webinar. Clearly, these two topics have an important intersection – both Hadoop specialists (including distribution vendors) and existing security vendors will need to expand their efforts to drive awareness if they are to capture an opportunity that is clearly going begging today. Security for big data will be a key issue in 2014 and beyond.

Other analysts at Gartner have tracked many of these products, and in my own follow-up I’ve been catching up on the work of Joseph Feiman and Brian Lowans, among others. Their Magic Quadrant for Data Masking, published in December, offers a useful discussion of that capability (both static and dynamic) and of which existing players have already added Hadoop support. Axis Technology’s DMSuite, Dataguise (which partners with Compuware), IBM InfoSphere Optim Data Privacy and InfoSphere Guardium Data Activity Monitor, Informatica Dynamic Data Masking and Persistent Data Masking, and Voltage SecureData Enterprise are all mentioned in the MQ.

There are other offerings, of course – for example, Feiman and Lowans note that masking of big data is available for the Oracle Big Data Appliance with its installed Cloudera distribution, but add that it requires the use of Oracle consulting services or the services of Oracle’s numerous service partners. Similarly, there are several emerging Hadoop-focused firms I’ve mentioned elsewhere and will cover in an upcoming piece of Gartner research I’m doing with Neil MacDonald. With RSA coming up this week (unfortunately, I can’t attend), I expect to see more heat – and perhaps light as well – on the issue ahead.



Category: Apache, Big Data, Cloudera, Dataguise, Gartner, Hadoop, IBM, Magic Quadrant, Oracle, Security

5 responses so far ↓

  • 1 Data Security for Hadoop – Add-on Choices Proliferating | Merv Adrian's IT Market Strategy   February 23, 2014 at 7:19 pm

    […] –more– […]

  • 2 Data Security for Hadoop – Add-on Choices Proliferating | All that Cuteness   February 23, 2014 at 7:23 pm

    […] By Merv Adrian […]

  • 3 Data Security for Hadoop – Add-on Choices Proliferating | Future Focus Infotech   February 24, 2014 at 9:19 am

    […] Data Security for Hadoop – Add-on Choices Proliferating […]

  • 4 Ron Indeck   February 25, 2014 at 6:44 pm

    Thank you for your timely article.

    We agree that data governance for Hadoop is exceptionally important and an often overlooked requirement. When we review requirements for large-scale, real-time analytics projects, customers often express concern that the security required for good enterprise compliance and governance is missing within Hadoop. This includes customers exploring public, hybrid, and private cloud deployments.

    One approach gaining momentum is to obfuscate data before it lands in Hadoop. A workable solution for most practical use cases consists of securing data in flight – masking sensitive fields (whether PII/PHI/PCI or other precious data) and encrypting large blocks of critically sensitive data – before it enters the distributed cluster. This can be accomplished effectively using a stream processing engine. For these applications, this solution can perform NIST format-preserving masking/encryption at 10 million fields per second and bulk encryption at several GB/s. This enables users to access data in near real time for advanced analytics, business advantage, and operational effectiveness without risking private customer or confidential company data.
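
As a rough illustration of the idea in the comment above (masking sensitive fields in flight, before records reach the cluster), here is a minimal Python sketch. It is not the commenter's product and not a NIST format-preserving encryption mode; it swaps in a simple keyed, one-way substitution that only preserves the shape of a value (digits stay digits, letters stay letters, punctuation is untouched). The key, the field names, and the functions `mask_value` and `mask_stream` are hypothetical names introduced for this example.

```python
import hashlib
import hmac

# Hypothetical demo key and field list; a real deployment would pull
# these from a key manager and a data-classification policy.
SECRET_KEY = b"demo-key-not-for-production"
SENSITIVE_FIELDS = {"ssn", "card_number"}


def mask_value(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace digits and letters with keyed pseudo-random ones of the
    same kind, leaving separators in place. Deterministic for a given
    key and input, and not reversible (this is masking, not encryption)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    i = 0
    for ch in value:
        if ch.isdigit():
            out.append(str(digest[i % len(digest)] % 10))
            i += 1
        elif ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr(base + digest[i % len(digest)] % 26))
            i += 1
        else:
            out.append(ch)  # keep dashes, spaces, etc. so the format survives
    return "".join(out)


def mask_stream(records, sensitive_fields=SENSITIVE_FIELDS):
    """Yield copies of each record with sensitive fields masked.
    Working as a generator lets this sit in front of whatever writes
    to the cluster without buffering the whole stream."""
    for record in records:
        yield {
            name: mask_value(str(value)) if name in sensitive_fields else value
            for name, value in record.items()
        }


if __name__ == "__main__":
    incoming = [{"name": "Ann", "ssn": "123-45-6789"}]
    for masked in mask_stream(incoming):
        print(masked)
```

Because the substitution is deterministic, the same input always maps to the same masked value, so joins on masked keys still work; the trade-off is that the original cannot be recovered, which true format-preserving encryption would allow.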

  • 5 Merv Adrian   February 27, 2014 at 6:56 pm

    Thanks, Ron. There are many ways to approach the issue, and to be truly comprehensive, organizations should be thinking about almost all of them to ensure there are no weak links. Look forward to our next chat about what Velocidata is doing.