Gartner Blog Network

December 2017 Tracker – Where’s Hadoop?

by Merv Adrian  |  December 29, 2017

The leading 2017 story of Hadoop distributions is that nobody seems to want to be accused of being in the business of providing them. Some former champions are expanding their shiny new positioning: Cloudera is selling Enterprise Data Hubs and Analytic DBs; Hortonworks offers DataPlanes and Next-Gen Data Platforms; MapR touts the Converged Data Platform. In the cloud world, Amazon’s EMR is at least designed to “run and scale Apache Hadoop, Spark, HBase, Presto, Hive, and other Big Data Frameworks” while on Google’s Cloud Platform page the word Hadoop appears once inside the description of Cloud Dataproc.

I’ve talked in Gartner research about the “disaggregation of the stack” – meaning the use of some of the components but not all, combined with other Apache and non-Apache pieces, by organizations that want to compose their own stack rather than buying someone else’s. (Gartner clients can find that in our Hype Cycle for Data Management.) The former distributors are doing the same: adding products based on other Apache projects or their own pieces to generate additional revenue streams, partnering with third parties to create go-to-market offerings targeted at specific use cases, and in some cases getting out of the business entirely. Since my March blog post, IBM has become the latest to get out of the distro business, dropping the IBM Open Platform and inking a deal with Hortonworks that bodes well for the latter’s strategy built around Atlas (which IBM will also sell). Of course the providers still sell Hadoop distributions, but as in any maturing market, they need to broaden their story and focus on business outcomes to meet mainstream buyers where they live. It’s normal, and it’s appropriate – and hopefully remunerative, since they are still losing money.

The tracker continues to be relatively useful, I hope, for those getting their first projects together – it tells you which versions of Apache projects are in the Hadoop distributions named. Each vendor has bits the others do not support, and therein lies the competitive differentiation they seek. In some early posts next year, I hope to discuss some of the many other projects seeing investment and buyer interest. Have a happy New Year, and keep watching.

Edited December 29 for AWS release

[Image: December 2017 tracker showing which versions of Apache projects are in the named Hadoop distributions]

Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.