
Hadoop Tracker – March 2017

By Merv Adrian | March 16, 2017

open source, MapR, IBM BigInsights, IBM, Hortonworks, Google, Gartner, Cloudera, Apache Zookeeper, Apache Pig, Apache Parquet, Apache Oozie, Apache MapReduce, Apache Kafka, Apache Hive, Apache HDFS, Apache HBase, Apache Hadoop, Apache Flume, Apache Avro, Apache Ambari, Apache Accumulo, Apache, Amazon Web Services, Amazon Elastic MapReduce, Data and Analytics Strategies

Stack expansion has ground to a halt. The last time an Apache project was added to the list of those most supported by leading Hadoop distribution vendors was July 2016, when Kafka joined the other 14 then commonly included. Since then, no broad support for new projects has emerged.

As of March 2017, Google Cloud Dataproc has been added to the tracker, and the number of projects supported by 5 or 6 of the (now) 6 distributors has dropped back to 14. Google supports 6.

Another addition this time is a row at the top of the tracker counting the identifiable, directly used projects that each vendor lists. Some, like Apache Calcite, are used primarily by other components rather than directly by users, so they are not listed.

Coming soon: what differentiates the players’ packages is what is not in common. Which pieces are those?

Updated twice 3/17
Cloudera Kafka version corrected, Amazon EMR HBase version corrected (note also that Amazon’s includes S3 support), Hortonworks Hive and Kafka versions updated. 

[Screenshot: Hadoop Tracker, as updated March 17, 2017]

