Publisher O’Reilly Media recently released its 2014 Data Science Salary Survey.
Whether or not you accept the specific details of the survey, it's not too surprising that Hadoop/Hive/Python/R programmers are getting paid the most; surely the basic rules of supply and demand apply.
However, what interests me is the underlying dynamics that may be influencing that supply and demand. I'm going to suggest that it may be a feature of the ratio of product cost to labour cost within the overall Total Cost of Ownership (TCO), and that this ratio depends on the maturity of the tools.
During the Bleeding-Edge phase, the “Products”, while having a low purchase cost (even “free”), are still relatively immature, sometimes not even proper products yet. (In the Data Science space right now, describing the available technologies as “Tools” might even be stretching it at times; I wouldn’t class a programming language as a tool…) So to compensate, you need skilled developers to get even relatively simple stuff done, and there are too few of them. Salaries go up.
Moving into the Leading-Edge phase, as things get turned into true products and applications, more money gets spent on the products (licensing costs increase with sophistication as complexity gets baked in). More people learn the skills that are in demand, because there’s a market for them, so the skills become a commodity. At the same time, productising things means that deployment of solutions starts to become de-skilled. Salaries come down.
By the time you’re into the Mainstream phase, product prices start to even out with competition in the market, and skill levels also even out as the next cycle starts and the whizz-kids move on to whatever bleeding-edge cool stuff is happening next. The result is an equilibrium in the product/salary mix of the TCO for the purchasing community.
To understand the impacts of early-adoption vs late-to-the-party deployment, watch the corresponding shifts in overall solution expenditure and the product/salary ratio, not just the raw salaries.
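To make that product/salary ratio concrete, here is a rough sketch in Python. The class name `PhaseSpend` and every figure in it are invented purely for illustration (they are not taken from the O’Reilly survey or any other data); the only point is to show how the product share of TCO could shift across the three phases even when the raw totals look similar.

```python
from dataclasses import dataclass


@dataclass
class PhaseSpend:
    """Hypothetical spend for one maturity phase (arbitrary cost units)."""
    phase: str
    product_cost: float  # licences, support, and so on
    labour_cost: float   # salaries for the people building and running it

    @property
    def tco(self) -> float:
        # Simplifying assumption: TCO is just product plus labour.
        return self.product_cost + self.labour_cost

    @property
    def product_share(self) -> float:
        # Fraction of TCO that goes on product rather than people.
        return self.product_cost / self.tco


# Made-up numbers, chosen only to mirror the argument in the text:
# cheap tools and expensive people early on, then a more even mix.
phases = [
    PhaseSpend("bleeding-edge", product_cost=50, labour_cost=450),
    PhaseSpend("leading-edge", product_cost=250, labour_cost=250),
    PhaseSpend("mainstream", product_cost=200, labour_cost=200),
]

for p in phases:
    print(f"{p.phase:<14} TCO={p.tco:>5.0f}  product share={p.product_share:.0%}")
```

With real spending figures in place of these placeholders, the thing to track would be how `product_share` moves from phase to phase, rather than the salary line on its own.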