Gartner Blog Network


January 2018 Hadoop Tracker

by Merv Adrian  |  January 3, 2018

Last month’s update was obsolete before it was published. This often happens because of multiple moving parts and my extended gestation period. I needed to correct entries for both AWS and Hortonworks. The new Tracker is correct as far as I know as of January 2, 2018. Enjoy.

Shifts in the “common core” projects mean it’s time for me to recast it soon. For example, Hortonworks’ go-forward packaging strategy will have an impact. Hortonworks plans to deprecate Flume, and move Kafka to a different product package, when they ship their 3.0 release. Amazon does not directly support several of the projects today (including – again – Flume), for which AWS EMR provides its own alternative offerings. Google ships Sqoop2, not the “other” one. Cloudera’s Sentry is part of a different security stack from Hortonworks’. And so on. This is normal product maturation as vendors differentiate their offerings from one another. The “disaggregation” of what was the “common Hadoop stack” continues, and many of the clients I talk to are adding other Apache and non-Apache pieces into their stacks as uses dictate and/or vendor offerings contain them.

Some projects are still more “common” than others – arguably, I should add Apache Tez in, because 4 of the 5 support it now. Zeppelin has 3 supporters – it’s borderline. And so on. And some projects are clearly losing momentum – Mahout comes to mind (and yes, please send me outraged comments, if you have some, about my saying so!). While I’ve tried to keep the tracker compact, these ongoing changes are rendering it a bit less useful. I’ll make some changes soon, and discuss them in more detail in another post.

[Image: Hadoop Tracker screenshot, captured January 2, 2018]


Merv Adrian
Research VP
5 years with Gartner
38 years in IT industry

Merv Adrian is an analyst following database and adjacent technologies as extreme data transforms assumptions about what to persist as well as when, where and how. He also watches the way the software/hardware boundary…



