Gartner Blog Network

Hadoop Project Commercial Support Tracker July 2016

by Merv Adrian  |  July 30, 2016  |  2 Comments

There are now 15 projects supported by all 5 distributors I track, and several have had new releases since April. Kafka is the newest addition, and I believe the remaining 4-supporter offerings, Mahout and Hue, will remain unsupported by IBM, which has its own alternatives.

So: what is “broadly supported” Hadoop in July 2016? The Apache Hadoop web site still names Hadoop Common, Hadoop Distributed File System (HDFS™), Hadoop YARN and Hadoop MapReduce as the core components, and gives them a common release number. I leave Common out and call that 3 projects. (Apache still lists Cassandra as a “related” project; I do not include it here, since none of the distributions include it.)

The rest of the “projects supported by all 5 distributors” group: Avro, Flume, HBase, Hive, Kafka, Oozie, Parquet, Pig, Solr, Spark, Sqoop and ZooKeeper – adds to 15 projects now supported by all. (Note that there is a Sqoop2 project underway which is not compatible with the existing version. It’s not included here – MapR does, however, support it.) Mahout and Hue are the next group – “supported by 4.” That gets us to 17 projects, unchanged from last time.

There are 15 projects supported by two or three distributors, and I’ll discuss them in my next post. Following that is the “one or none” group – another 30 projects I try to follow as well as I can. This gives us a total of 62 projects in all – a nightmare if you’d actually like to compose a stack for yourself, as I discuss for Gartner clients in Hadoop Project Proliferation Challenges Selection and Support.
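For readers who want to check the arithmetic, the group counts above can be tallied in a few lines of Python. This is only an illustrative sketch: the project names and counts come from this post, while the variable and function names are made up for the example, and the two smaller groups are entered as bare counts because their members are not listed here.

```python
# Tally of the project groups named in this post (July 2016 tracker).
# All identifiers are hypothetical; only names and counts come from the text.

# Core Apache Hadoop components, with Hadoop Common deliberately left out.
CORE = ["HDFS", "YARN", "MapReduce"]

# The rest of the "supported by all 5 distributors" group.
ALL_FIVE_OTHERS = [
    "Avro", "Flume", "HBase", "Hive", "Kafka", "Oozie",
    "Parquet", "Pig", "Solr", "Spark", "Sqoop", "ZooKeeper",
]

# Projects supported by 4 of the 5 distributors.
FOUR_SUPPORTERS = ["Mahout", "Hue"]

# Groups whose members are not named in this post, so only counts are used.
TWO_OR_THREE_COUNT = 15
ONE_OR_NONE_COUNT = 30


def tally():
    """Return the running totals for each support tier."""
    all_five = len(CORE) + len(ALL_FIVE_OTHERS)
    four_or_more = all_five + len(FOUR_SUPPORTERS)
    total = four_or_more + TWO_OR_THREE_COUNT + ONE_OR_NONE_COUNT
    return {"all_five": all_five, "four_or_more": four_or_more, "total": total}


print(tally())
```

Running it reproduces the figures quoted in the text: 15 projects supported by all five distributors, 17 supported by at least four, and 62 in total.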

As always, the chart below is based on conversations with and/or web documentation from Amazon, Cloudera, Hortonworks, IBM, and MapR. Distributors’ public documentation of supported distribution contents remains variable and sometimes lags actual releases; see Amazon’s page, Hortonworks’ page, IBM’s page and MapR’s page for their details.

[Chart: Hadoop project commercial support by distributor, July 2016]

Merv Adrian
Research VP
9 years with Gartner
40 years in IT industry

Merv Adrian is an analyst following database and adjacent technologies as extreme data transforms assumptions about what to persist as well as when, where and how. He also watches the way the software/hardware boundary…

Thoughts on Hadoop Project Commercial Support Tracker July 2016

  1. Carl says:

    Very handy reference. Bookmarked!

  2. Merv Adrian says:

    Thanks, Carl. It started out that way for me, just to keep track myself, but since I started putting it in the blog many people have told me the same.

Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.