Gartner Blog Network

The New World After Big Data Changes It

by Nick Heudecker  |  July 24, 2014  |  7 Comments

Guest Post from Mark Beyer

We all understand that big data will “become normal” and expected sometime near 2018. The disruptive new processing style and its resulting infrastructure now have to mature. The new vendors that have emerged are proclaiming their victory over tradition, while at the same time scurrying to build out the robust administrative, productivity, deployment and optimization framework that already exists in traditional data management practices. I use the word “scurrying” quite on purpose. There are some heavy traditional boots stomping around in the data management kitchen, and some mice will be ducking and running to hide in the corners, where they will starve. A few brave mice will run to the center of the kitchen, steal some of the data management cheese and avoid the spring-loaded trap waiting for them.

With this new phase of big data adoption and technology evolution, two questions loom.

First, when will big data technology and techniques become part of normal data management? We have done big data many times before. Client/server and relational databases did it. Remember, flat files were bloated, processing code was overly complex and difficult to manage, and the centralized mainframe computing system was costly. Voila, we fixed it. And it took almost twenty years to mature. Remember the first relational databases mounted on client/server infrastructure? Ugh. Then it happened again. Too many applications created their own copies of data; how could we reconcile this mess? So we built these things called data warehouses (but not before we tried Executive Information Systems). And the hundreds, if not thousands, of users of giant data warehouses holding billions of rows began demanding more and more from them. It took more than twenty years for everyone to figure out the data warehouse (remember, Sabre was 1976, and Frito-Lay and Coke had their data warehouses LONG before Kimball and Inmon popularized the terminology). XML was going to fix data transfer rates. We are approaching twenty years there too.

Almost everyone equates Hadoop with big data, so let’s trot out that timeline too. Hadoop was open-sourced in 2005, so we are now ten years into the maturity cycle. Right on schedule, about three years ago, the hype machine cranked up, in the sixth year of emergence. Here we are in year ten, and everyone is now demanding more robust development, deployment management, and optimization consistency. We like to think that IT is cranking out new technology faster and faster, but it isn’t. It’s twenty years or bust. When examining the current information management market, who will succeed faster? The traditional, already mature and robust environment that needs to add new forms of processing and information types into its existing management approach; or the new processing and asset management system that has to build twenty years of maturity in the next three years, or get caught in the light in the middle of the kitchen?

Second, how will big data technology and techniques mature and take their rightful role in the information infrastructure and processing world? What is that rightful role? I argue that in the traditional analytics world, getting the requirements correct was the key. In the new world, letting the analysts determine the requirements through usage is the key. Why not take advantage of all this hyper-fast hardware and networking? Let’s face facts: the only reason ANYONE captures information is to share it at some point; otherwise, why capture it? So the users are the analysts and operations teams who need to share what was done about their part of the business process with someone or something else, or to see what was done somewhere else by someone else. Users fit into four big categories. Casual users want clean data they don’t have to think too much about, because someone else already thought about what it should look like. Analysts believe they can manipulate data, but they really only manipulate the data they have, within a business process model they are familiar with. Data miners understand data, sourcing, processing logic and more, and there are very few of them in any organization (although many analysts think they are miners). And data scientists are, well, different from miners, because they can geek-speak about mathematics, business processes and data simultaneously while creating graph analytics in their heads.

This gets to the role that big data technologies will play in the new world of information infrastructure and management. It will be the job of this technology to render and evaluate new candidate models of analysis. The miners and scientists will play in this space and have a field day. They will use their tools and the data to develop all viable uses of data, and then the scientists will become bored or otherwise engaged. So the miner who helped develop all of these wonderful candidates will think, “Wow, we should use these,” and will want to show the analysts their pretty analytic candidates. And IT will say: wait, you are putting a very sharp object into a toddler’s nimble fingers. So IT will develop semantic tiers to present the many candidates and track which are the highest-ranking contenders for optimization. The analysts will marvel at all the shiny new models and then start picking the primary contenders that best represent likely or interesting scenarios. The analysts will use different candidates until they develop those contenders, and eventually casual users will wander into the contender world and ask, “Can I simply have that ONE model there? I like that model; it is the best compromise for all of us.” And IT will put that into the data warehouse and data marts for all to see.
Think: 20 years or bust (and we are in year ten). Think: candidates, contenders and compromises (and supporting all three from now on).

This is the new world after big data changes it.

For supporting information on this topic, Gartner clients can access: The Data Lake Fallacy: All Water and Little Substance


Category: data-and-analytics-strategies  

Nick Heudecker
Research Vice President
5 years at Gartner
19 years IT Industry

Nick Heudecker is an Analyst in Gartner's Research and Advisory Data Management group. Read Full Bio

Thoughts on The New World After Big Data Changes It

  1. […] Source: The New World After Big Data Changes It […]

  2. Dan Graham says:

    Good article Mark.

    Some questions on the time lines:
    Cloudera CDH1 first shipped in March 2009, yet you said we are 10 years into Hadoop. It seems to me Hadoop/big data is just over 5 years old, especially at Apache. Google’s paper kicked off Hadoop, but to me first shipment and customer installs are the real beginning. Before that, it was just theory, and Google wasn’t sharing code. Where are you starting this timeline?

    A similar question on data warehouses. You said “We are approaching twenty years there too.” Devlin’s data warehouse research paper is from 1988. Teradata began shipping marts in 1984 and warehouses in 1990. EIS systems predate that. Where are you starting the clock on DWs to get 20 years? It seems like marts and warehouses are much more mature than 20 years.

    Loved the article. Common sense is not so common.

  3. Mark,

    In tech circles we’re always pressing to progress faster and faster, most times without any real idea of what that means. There is a certain satisfaction when you look up from the hurly-burly and find yourself just where you expected to be. The race is still on, and the pressure always continues to build.

    I am a keen fan of your idea of the analytics defining the requirements; it seems spot on in terms of looking for maturity milestones. Manual manipulation, and IT commanding and controlling through circuitous processes, seem like clinging to the past with a greater concern for extraneous factors. Virtualization brought autonomic styles to computation cycles; hopefully in-memory brings autonomic styles to analytics and serves as a milestone for identifying analytic technology maturity :-)

  4. […] The New World After Big Data Changes It | Nick Heudecker’s Gartner Blog […]

  5. ashish says:

    Great article, Mark.

    The volume of data created every day is really exploding. Soon we will need many more effective ways to store it. These volumes of data become useful when they are analyzed with proper business software and appropriate business rules. It can be magical how these analytics predict consumer behavior.


Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.