Gartner Blog Network


Hadoop Is A Recursive Acronym

by Merv Adrian  |  October 13, 2014  |  3 Comments

Hopefully, that title got your attention. A recursive acronym – the term first appeared in the book Gödel, Escher, Bach: An Eternal Golden Braid and is likely more familiar to tech folks who know GNU – is self-referential (as in “GNU’s Not Unix”). So how did I conclude that Hadoop, whose name origin we know, fits the definition? Easy – like everyone else, I’m redefining Hadoop to suit my own purposes.

Let’s start with the obvious one. Of course, Doug Cutting named Hadoop after his child’s toy elephant, seen here.

Photo: Merv Adrian


And in its early days, as I discussed in my post about the changing composition of distributions a few months back, the story was simpler. Hadoop was HDFS, MapReduce and some utilities. As those utilities were formalized, became projects in their own right and gained support from commercial distributors, the list grew: Pig, Hive, HBase, and ZooKeeper were Hadoop too. And a few months ago, as I noted, Accumulo, Avro, Cascading, Flume, Mahout, Oozie, Spark, Sqoop, and YARN had joined the list.

YARN is the one that really matters here, not just because it means the list of components will change, but because in its wake the list of components will change Hadoop’s meaning. YARN enables Hadoop to be more than a brute-force, batch-oriented blunt instrument for analytics and ETL jobs. It can be an interactive analytic tool, an event processor, a transactional system, and a governed, secure system for complex, mixed workloads. At Strata this week, we’ll talk about its integration with Red Hat’s middleware, its cautious alliance with Spark as a MapReduce replacement, its alliance with data wrangling tools from startups and Teradata, its connection, via Sentry, to security stacks… and more.

So yes, many of us are redefining Hadoop as we add new pieces – new use cases, new projects that change its very nature. Here is my answer to “What is Hadoop?”

Hadoop
And
Diverse
Other
Operating
Platforms

OK – it’s a bit cute. But hopefully, it got your attention. Hadoop’s journey is just beginning, and there is much more change ahead.


Merv Adrian
Research VP
5 years with Gartner
38 years in IT industry

Merv Adrian is an analyst following database and adjacent technologies as extreme data transforms assumptions about what to persist as well as when, where and how. He also watches the way the software/hardware boundary…


