There are now 20 commonly supported projects: Avro, Flume and Solr join the group supported by all 5 distributors, and there are other changes as well.
For this version of the tracker (last updated in December), I’ve made one sizable change: Pivotal is no longer counted as a “leading distributor,” which reduces the number of distributors to five. Pivotal now relies on Hortonworks’ distro (as does Microsoft) for its commercial offering. We first identified 6 broadly supported (4 or more distributors) “Hadoop” projects in 2012, 15 in June 2014, and 17 in December 2015. The threshold is now 3 or more supporters: with one fewer distro, the line moves down. Is there an end in sight? Hardly; for one thing, I expect IBM to add a couple of projects soon.
As always, the chart here is based on conversations with and/or web documentation from Amazon, Cloudera, Hortonworks, IBM, and MapR. Distributors’ public documentation of distribution contents remains variable; see Hortonworks’ page, IBM’s page and MapR’s page for their details.
So: what is “broadly supported” Hadoop in April 2016? The Apache Hadoop web site still names Hadoop Common, Hadoop Distributed File System (HDFS™), Hadoop YARN and Hadoop MapReduce as the core components, and gives them a common release number. I leave Common out and call that 3 projects.
We have 3 additions to the “projects supported by all 5 distributors” group this time: Avro, Flume, and Solr join HBase, Hive, Oozie, Parquet, Pig, Spark, Sqoop and ZooKeeper. Those 11, plus the 3 core projects, make a total of 14 projects now supported by all.
Kafka and Mahout join Hue in the next group – now “supported by 4.” That gets us to 17 projects.
DataFu, Impala and Cascading have 3 supporters. And now we’re up to 20 projects.
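For readers who want to check the tally, here is a minimal Python sketch of the running count. The project names and groupings come from the paragraphs above; the variable names and the `running_totals` helper are my own illustration, and I am assuming the 3 core projects count as supported by all 5 distributors.

```python
# Project groups as listed in the post; names of variables are illustrative only.
CORE = ["HDFS", "YARN", "MapReduce"]  # Hadoop Common is left out, as in the post
SUPPORTED_BY_ALL_5 = [
    "Avro", "Flume", "Solr",  # new to this group in this update
    "HBase", "Hive", "Oozie", "Parquet", "Pig", "Spark", "Sqoop", "ZooKeeper",
]
SUPPORTED_BY_4 = ["Hue", "Kafka", "Mahout"]
SUPPORTED_BY_3 = ["DataFu", "Impala", "Cascading"]


def running_totals(groups):
    """Return the cumulative project count after each group is added."""
    totals = []
    count = 0
    for group in groups:
        count += len(group)
        totals.append(count)
    return totals


totals = running_totals([CORE, SUPPORTED_BY_ALL_5, SUPPORTED_BY_4, SUPPORTED_BY_3])
print(totals)  # [3, 14, 17, 20]
```

The four figures match the milestones in the text: 3 core projects, 14 supported by all, 17 with 4 or more supporters, and 20 at the new threshold of 3.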
Here’s this quarter’s chart.
Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.