In the Hadoop community there is a great deal of talk of late about its positioning as an Enterprise Data Hub. My description of this is “aspirational marketing;” it addresses the ambition its advocates have for how Hadoop will be used, when it realizes the vision of capabilities currently in early development. There’s nothing wrong with this, but it does need to be kept in perspective. It’s a long way off.
Start here: A Gartner Research Circle survey revealed that in 2013, big data projects went into production in less than 8% of enterprises – and Hadoop was by no means the only technology included in their count. In 2014, many more enterprises will go into production with their first Hadoop project or two, perhaps even pushing deployed Hadoop shops into low double digit percentages.
In those same shops, there are thousands of significant database instances, and tens of thousands of applications – and those are conservative numbers. So the first few Hadoop applications will represent a toehold in their information infrastructure. It will be a significant beachhead, and it will grow as long as the community of vendors and open source committers deliver on the exciting promise of added functionality we see described in the budding Hadoop 2.0 era, adding to its early successes in some analytics and data integration workloads.
So “Enterprise Data Hub?” Not yet. At best in 2014, Hadoop will begin to build a role as part of an Enterprise Data Spoke in some shops. Aspirations are good, but perspective helps. Don’t confuse vision with strategy. The enterprise data warehouse has a long life ahead of it; the synergistic addition of Hadoop to its evolution into the logical data warehouse is just beginning, and Hadoop’s role in operational workloads and event processing has yet to launch.