Stack expansion has ground to a halt. The last time an Apache project was added to the list of those most supported by leading Hadoop distribution vendors was July 2016, when Kafka joined the other 14 then commonly included. Since then, no broad support for new projects has emerged.
As of March 2017, Google Cloud Dataproc has been added to the tracker, and the number of projects supported by 5 or 6 of the (now) 6 distributors has dropped back to 14. Google supports 6.
Another addition this time is a row at the top of the tracker counting the identifiable, directly used projects that each vendor lists. Some, like Apache Calcite, are used primarily by other components rather than by users directly, so they are not listed.
Coming soon: what differentiates the players’ packages is what is not in common. Which pieces are those?
Updated twice 3/17
Cloudera Kafka version corrected, Amazon EMR HBase version corrected (note also that Amazon’s includes S3 support), Hortonworks Hive and Kafka versions updated.