Gartner Blog Network

Data Security for Hadoop – Add-on Choices Proliferating

by Merv Adrian  |  February 23, 2014

In my post about the BYOH market last October, I noted that increasing numbers of existing players are connecting their offerings to Apache Hadoop, even as upstarts enter their markets with a singular focus. And last month, I pointed out that Nick Heudecker and I detected a surprising lack of concern about security in a recent Hadoop webinar. These two topics have an important intersection – both Hadoop specialists (including distribution vendors) and existing security vendors will need to expand their efforts to drive awareness if they are to capture an opportunity that is going begging today. Security for big data will be a key issue in 2014 and beyond.

Other analysts at Gartner have tracked many of these products, and in my own follow-up I’ve been catching up on the work of Joseph Feiman and Brian Lowans, among others. Their Magic Quadrant for Data Masking, published in December, offers a useful discussion of that capability (both static and dynamic) and of which existing players have already added Hadoop support. Axis Technology’s DMSuite, Dataguise (which partners with Compuware), IBM InfoSphere Optim Data Privacy and InfoSphere Guardium Data Activity Monitor, Informatica Dynamic Data Masking and Persistent Data Masking, and Voltage SecureData Enterprise are all mentioned in the MQ.

There are other offerings, of course – for example, Feiman and Lowans note that masking of big data is available for the Oracle Big Data Appliance with its installed Cloudera distribution, but add that it requires the use of Oracle consulting services, or the services of Oracle’s numerous service partners. Similarly, there are several emerging Hadoop-focused firms I’ve mentioned elsewhere and will cover in an upcoming piece of Gartner research I’m doing with Neil MacDonald. With RSA coming up this week (unfortunately, I can’t attend), I expect to see more heat – and perhaps light as well – on the issue ahead.


Merv Adrian
Research VP
9 years with Gartner
40 years in IT industry

Merv Adrian is an analyst following database and adjacent technologies as extreme data transforms assumptions about what to persist as well as when, where and how. He also watches the way the software/hardware boundary…

Thoughts on Data Security for Hadoop – Add-on Choices Proliferating

  1. Ron Indeck says:

    Thank you for your timely article.

    We agree that data governance for Hadoop is exceptionally important and an often overlooked requirement. As we review customer requirements for large-scale, real-time analytics projects, they often express the concern that the security mandatory for good enterprise compliance and governance is missing within Hadoop. This includes customers exploring public, hybrid, and private cloud deployments.

    One approach gaining momentum is to obfuscate data before it lands in Hadoop. A workable solution for most practical use cases consists of securing data in flight – masking sensitive fields (whether PII/PHI/PCI or other precious data) and encrypting large blocks of critically sensitive data – before it enters the distributed cluster. This can be accomplished effectively using a stream processing engine. For these applications, such a solution can perform NIST format-preserving masking/encryption at 10 million fields per second and bulk encryption at several GB/s. This enables users to access data in near real-time for advanced analytics, business advantage, and operational effectiveness without risking private customer or confidential company data.

    • Merv Adrian says:

      Thanks, Ron. There are many ways to approach the issue, and to be truly comprehensive, organizations should be thinking about almost all of them to ensure there are no weak links. I look forward to our next chat about what VelociData is doing.
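
As a rough illustration of the approach Ron describes, masking sensitive fields before records reach the cluster, the Python sketch below replaces the digits in selected fields with keyed pseudo-random digits while preserving each value's layout. It is a simplified, one-way stand-in and not the NIST format-preserving encryption he mentions; the key, the field names, and the helper functions mask_value and mask_records are all hypothetical.

```python
import hashlib
import hmac
import string

# Hypothetical key and field names, for illustration only.
SECRET_KEY = b"demo-key-not-for-production"
SENSITIVE_FIELDS = {"ssn", "card_number"}


def mask_value(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace each ASCII digit with a keyed pseudo-random digit.

    Separators and letters are left in place, so the layout survives.
    This is deterministic, one-way masking built on HMAC-SHA256; it is
    NOT NIST format-preserving encryption and cannot be reversed.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    used = 0
    for ch in value:
        if ch in string.digits:
            # Map one digest byte to a digit (slightly biased; fine for a sketch).
            masked.append(str(digest[used % len(digest)] % 10))
            used += 1
        else:
            masked.append(ch)
    return "".join(masked)


def mask_records(records):
    """Yield a copy of each record with its sensitive fields masked."""
    for record in records:
        yield {
            name: mask_value(val) if name in SENSITIVE_FIELDS else val
            for name, val in record.items()
        }


if __name__ == "__main__":
    incoming = [{"name": "Ann", "ssn": "123-45-6789"}]
    for row in mask_records(incoming):
        print(row)  # the ssn keeps its NNN-NN-NNNN shape with different digits
```

Because the masking is deterministic for a given key, the same input always yields the same masked value, so masked fields can still be used for joins and grouping once the data is in Hadoop.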
