by Merv Adrian | January 3, 2018 | January 2018 Hadoop Tracker
Last month’s update was obsolete before it was published. This often happens because of the many moving parts and my extended gestation period. I needed to correct entries for both AWS and Hortonworks. The new Tracker is correct, as far as I know, as of January 2, 2018. Enjoy.
Shifts in the “common core” projects mean it’s time for me to recast the Tracker soon. For example, Hortonworks’ go-forward packaging strategy will have an impact. Hortonworks plans to deprecate Flume, and move Kafka to a different product package, when they ship their 3.0 release. Amazon does not today directly support several of the projects (including – again – Flume), for which AWS EMR provides its own alternative offerings. Google ships Sqoop2, not the “other” one. Cloudera’s Sentry is part of a different security stack from Hortonworks’. And so on. This is normal product maturation as vendors differentiate their offerings from one another. The “disaggregation” of what was the “common Hadoop stack” continues, and many of the clients I talk to are adding other Apache and non-Apache pieces to their stacks as their uses dictate or as vendor offerings include them.
Some projects are still more “common” than others – arguably, I should add Apache Tez, because 4 of the 5 support it now. Zeppelin has 3 supporters – it’s borderline. And so on. And some projects are clearly losing momentum – Mahout comes to mind (and yes, please send me outraged comments, if you have some, about my saying so!). While I’ve tried to keep the Tracker compact, these ongoing changes are rendering it a bit less useful. I’ll make some changes soon, and discuss them in more detail in another post.