In 2009, CRM icon Tom Siebel was attacked by a charging elephant during an African safari. Ominously, this coincided with a change of epochs signified by another elephant, Hadoop. It was not obvious back in 2009 that the era of CRM had come to an end and the era of data had begun. That same year, Cloudera announced the availability of the Cloudera Distribution of Hadoop, and MapReduce and HDFS became separate subprojects of Apache Hadoop. This was also the year when people started talking about beautiful data.
The era of data is about the process of data commoditization, where data is becoming an independently valuable asset that is freely available on the market. A “commodity” is defined as:
- Something useful that can be turned to commercial or other advantage
- An article of trade or commerce
- An advantage or benefit
Look familiar? That’s what we want data to become. And we are getting there, not very fast but steadily. Information patterns derived from data are already changing the status quo; they disrupt industries and affect lives. Sometimes data is useful, yet not turned to commercial advantage. But data is certainly becoming an article of trade or commerce: note that the number of new Web APIs giving public access to data started growing explosively around the beginning of the era of data.
I am not the first to point to the commoditization of data. Bob Grossman, who epitomizes a data scientist to me, gave a detailed account of commoditization in his outstanding book The Structure of Digital Computing: From Mainframes to Big Data. In particular, the commoditization of time took most of the 17th century. We take our watches, clocks and phone timers for granted. Put that in perspective: in the future, someone will take access to all kinds of data for granted in the same way. The last chapter of Grossman’s book is entitled “The Era of Data.”
Open data is a strong manifestation of this new era. The first government open-data websites, data.gov and data.gov.uk, were launched in 2009. Government mandates and open data policies from multiple countries and public entities continue to contribute to the process of data commoditization. Openness has the benefit of increasing the size of the market. The greater the size of the market and the demand for a resource, the greater the competitive pressure on price and, hence, the greater the commoditization of the resource.
When data becomes free or inexpensive (as a result of commoditization), the opportunity exists to unite people over data sets to make new discoveries and build new business models. Many companies choose Hadoop because it is cheap data storage. This entry point is the first step on the journey to the data operating system, a term that I heard three times during the past five days, notably from Doug Cutting, who brought the world both Hadoop the elephant and the data operating system. This year’s Hadoop Summit starts today. It brings together 3,000 people from 1,000 organizations.
The last part of the “commodity” definition is “an advantage or benefit.” Gartner analysts Mark Beyer and Donald Feinberg predicted several years ago:
By 2014, organizations which have deployed analytics to support new complex data types and large volumes of data in analytics will outperform their market peers by more than 20% in revenue, margins, penetration and retention.
According to my observations, it’s true. Is this true for you? If not, be patient: an elephant’s pregnancy lasts almost two years.
P.S. Tom Siebel survived the elephant attack. He now runs a big data company, C3 Energy.
Follow Svetlana on Twitter @Sve_Sic