Gartner Blog Network


Symposium Notes – Day Four Returns to Data Security, and to Hadoop

by Merv Adrian  |  October 26, 2016

Thursday, the final day, reinforced a theme for the week: data security is heating up, and organizations are not ready. It came up in half of today’s final 10 meetings.

“Is my data more secure, or less, in the cloud?”

“Does using open source software for data management compromise how well I can protect it?”

“I’m a public utility – can I put meter data in the cloud safely? What about if it is used to drive actions at the edge?”

“I’m using drones for mapping and the data is in the cloud – am I exposed?”

The final conversation of the day revolved around competing roles: who has authority for securing this new data, and to whom do they report? Does “authority” equal “responsibility,” “accountability,” or both? Organizations that have not really thought about this need to work through the overlaps, conflicts, ambitions and procedures thoroughly. Key principle for me: you can’t outsource responsibility.

Additional Hadoop questions came up as well.

  • An investor asked about “braintrust” departures from a leading Hadoop distribution vendor. This one was not so tough – it was only one or two people, and new opportunities and vesting schedules often have more to do with such changes than the status of the existing firm.
  • Several clients are wrestling with doing multiple things on the same on-premises infrastructure: “I want to use a dozen of the nodes for test and dev, some for Spark jobs and some for HBase work (which is variable and needs to scale up and down at different times). It turns out that is really hard for us to manage.”
  • One client invoked organizational questions that blended with the technical: accommodating different groups’ requirements, and balancing resources across them, without building a cluster for each – “can’t I build a shared service that provides them all with what they need depending on their needs?” Not so easily, yet, is the answer.
  • An early adopter with a sizable cluster is trying to decide how to deal with scale that is literally straining physical capacity – it may be that alone that drives her to the cloud.
  • For another, ramping ingest up and then down (it’s not continuous, and it arrives from multiple sources at different, unpredictable times) is proving maddeningly detailed and fragile.
  • An excellent economic question about the value of cloud-based scaling opened this way: “I just paid my legacy DBMS vendor over $3M to do the exact same thing I’m already doing, but faster. If I’d done it in the cloud, it likely would have been less costly (if I could scale compute when I need to – but they can’t do that yet). How long will I have to wait?”

These are the questions of organizations trying to get past those first pilots, and they perhaps help explain why, according to Gartner research, production deployments for big data have stalled.

For me, this Symposium was the call to a significant research agenda in 2017. No event does that better, although our Data and Analytics Summits, coming in the spring, come very close.

Category: hadoop  hbase  spark  big-data  data-integration  data-lake  gartner  security  trends-predictions  

Tags: hadoop  hbase  spark  cloud  data-security  gartner  

Merv Adrian
Research VP
5 years with Gartner
38 years in IT industry

Merv Adrian is an analyst following database and adjacent technologies as extreme data transforms assumptions about what to persist as well as when, where and how. He also watches the way the software/hardware boundary…



Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.