I have the privilege of working for the world’s leading information technology research and advisory company, covering information management with a strong focus for the past few years on an emerging software stack called Hadoop. In the early part of 2015, that particular technology is moving from early adopter status to early majority in its marketplace adoption. The discussions and published work around it have been exciting and controversial, so in this post (and a couple to follow) I describe three interlocking research perspectives on Hadoop: procurement (counting real money actually spent); plans (surveys of intentions to invest); and positioning (subjective interpretations of what the first two mean).
Procurement Perspective: Hadoop is a (Very) Small Market Today
Gartner collects data about spending on technology, and recently published 2014 DBMS vendor revenues for license and maintenance in a recurring report for our clients: Market Share: All Software Markets, Worldwide, 2014. In that research, we describe $32,864M attributable to DBMS software vendors. Examining that data provides a useful perspective on where Hadoop is today. One of the vendors – Oracle – measures its DBMS revenue in tens of billions of dollars, while four others – Microsoft, IBM, SAP and Teradata – do so an order of magnitude lower, in billions. Collectively, these five vendors represent 92.1% of actual spending in 2014 on DBMS.
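To make that concentration concrete, here is a minimal Python sketch of the arithmetic. The two inputs are the figures quoted above; the variable names are my own, and the derived dollar amounts are simple multiplication, not numbers taken from the report.

```python
# Inputs quoted in the text above.
total_dbms_revenue_musd = 32_864   # 2014 DBMS vendor revenue, in $M
top_five_share = 0.921             # combined share of the five largest vendors

# Derived values (plain arithmetic, not figures from the report).
top_five_revenue_musd = total_dbms_revenue_musd * top_five_share
everyone_else_musd = total_dbms_revenue_musd - top_five_revenue_musd

print(f"Top five vendors:  ~${top_five_revenue_musd / 1000:.1f}B")
print(f"All other vendors: ~${everyone_else_musd / 1000:.1f}B")
```

In other words, roughly $30.3B goes to the five largest vendors, leaving about $2.6B for every other DBMS vendor combined.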
Spending is real and tangible – procurement is measured by spending. DBMS is only one part of the information management software market, which includes related disciplines (and separately counted revenue streams) like data integration and data quality, as well as hardware spending on storage and others. My work participating in that research helps me gain perspective. Hadoop vendor revenue exists two orders of magnitude down on this stack – the three leading independent Hadoop distribution players (Cloudera, Hortonworks and MapR) today measure their revenue in tens of millions. None of them would make the top ten list – yet. To do so, they will have to generate something approaching $200M annually, joining firms like Software AG, Fujitsu, Progress Software and CA Technologies. At current growth rates, they are a couple of years away from that milestone.
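For readers who want to test the "couple of years" estimate against their own assumptions, here is a rough compound-growth sketch in Python. The starting revenues and growth rates below are hypothetical placeholders chosen only for illustration; they are not reported figures for any vendor. Only the $200M target comes from the text above.

```python
def years_to_reach(current_musd: float, target_musd: float, annual_growth: float) -> int:
    """Count whole years of compound growth until revenue reaches the target.

    annual_growth is a fraction, e.g. 1.0 means revenue doubles each year.
    """
    if annual_growth <= 0 and current_musd < target_musd:
        raise ValueError("growth must be positive to reach a higher target")
    years = 0
    revenue = current_musd
    while revenue < target_musd:
        revenue *= 1 + annual_growth
        years += 1
    return years


TARGET_MUSD = 200  # approximate top-ten threshold mentioned in the text

# Hypothetical scenarios: (starting revenue in $M, annual growth rate).
scenarios = [(50, 1.0), (50, 0.6), (80, 0.6)]

for start, growth in scenarios:
    n = years_to_reach(start, TARGET_MUSD, growth)
    print(f"${start}M growing {growth:.0%} per year reaches ${TARGET_MUSD}M in {n} year(s)")
```

Under these illustrative inputs the answer lands at two or three years, which is why the estimate is sensitive to both the starting point and how long high growth rates can be sustained.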
Customer count is a frequent metric in these discussions, and it is also real and tangible in its effect on procurement. The megavendors on this list have hundreds of thousands of companies paying for their product. Further down the list, we again move through several orders of magnitude, with the Hadoop distributors today describing themselves in terms of hundreds of customers (for example, the only public company, Hortonworks, is now reporting over 400). A similar model of growth applies here, and it suggests a tiny penetration to date for Hadoop, representing massive potential upside.
The number of deployed systems or projects is also a useful guide, and here things become a little fuzzier from a measurement point of view. DBMSs are well established as general purpose platforms, and each licensed instance may be used by a customer for multiple systems. Today, most Hadoop adopters are putting their first few systems – or their first – into production. Revenue for the DBMS megavendors comes both from maintenance and support on their installed base already running projects, and from new purchases to support new ones. Maintenance and support is an annuity to the vendors as long as the product is not replaced, and over time it becomes an increasing percentage of their revenue stream. The Hadoop market will begin to exhibit a similar deployment pattern in coming years, as Hadoop continues to expand its potential use cases. It has already moved from batch ETL into interactive analytics, and there is much more to come now that YARN is enabling many more uses. Again, substantial upside.
To sum up: a procurement perspective on Hadoop is that it is a tiny subset of the $33B DBMS portion of the information management market. It’s healthy and growing, and has an enormous amount of upside adoption potential. It may show associated growth in revenue – though this is not yet clear. Commercial open source software revenue may not scale as linearly with deployment as commercial closed source software does. But that’s a topic for another post. In my next piece, I’ll cover the second perspective: plans. What do likely purchasers tell us about Hadoop, and what do those plans suggest about the next few years of procurement?