My presentation on big data for the upcoming BI Summit in Barcelona is obsolete. In it, I use the Gartner Hype Cycle curve to show that big data is at the peak of inflated expectations. But, as happens with quickly developing technologies, I am already behind and big data has moved ahead.
The last several weeks show that big data is falling into the trough of disillusionment. I realized it earlier today, when I was describing a recent Elephant Riders meetup to my colleagues at Gartner. MapR, HortonWorks and Cloudera were debating the state of Hadoop. And I heard from the very core of the Hadoop movement that MapReduce has always been Hadoop’s bottleneck and that Hadoop is “primitive and old-fashioned.” This is the video of the event. If you watch it, you will notice more points that signal the beginning of disillusionment (and get a lot of useful information too). Congratulations, big data technology is maturing fast!
Meanwhile, my clients who are most advanced with Hadoop are also getting disillusioned. They do not realize that they are ahead of others, and they think that someone else is succeeding while they are struggling. These organizations have fascinating ideas, but they are disappointed by how difficult it is to figure out reliable solutions. Their disappointment applies to the more advanced cases of sentiment analysis, which go beyond traditional vendor offerings. Difficulties also abound when organizations work on new ideas that depend on factors traditionally outside their industry competence, e.g. linking a variety of unstructured data sources. Several days ago, a financial industry client told me that framing the right question to express a game-changing idea is extremely challenging: first, selecting a question from multiple candidates; second, breaking it down into many sub-questions; and, third, answering even one of them reliably. It is hard.
Formulating the right question is always hard, but with big data it is an order of magnitude harder, because you are blazing the trail (not grazing on the green field). At the upcoming BI Summit in Barcelona, I will facilitate a user round table on exactly this: From “Satisficing” to Satisfying Business Requirements. Validating answers is also a tough job, because big data analytics deals with uncertainty: you do not deduce a number and say that the meaning of life is 42; you get your hypothesis confirmed with a certain degree of confidence. And it is up to you to decide what level of confidence is satisfying and what is “satisficing.” (A “satisficing” solution is the first solution that appears good enough.)
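To make the point about confidence levels concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the counts, the baseline rate and the thresholds are hypothetical, and the normal approximation to a one-sided test is just one simple way to put a number on "degree of confidence."

```python
from math import sqrt
from statistics import NormalDist


def confidence_rate_exceeds(successes: int, trials: int, baseline: float) -> float:
    """Return 1 minus the one-sided p-value (normal approximation) for the
    hypothesis that the true success rate is above `baseline`.

    Hypothetical helper for illustration only.
    """
    observed = successes / trials
    # Standard error of the observed rate if the baseline were the true rate.
    std_err = sqrt(baseline * (1 - baseline) / trials)
    z = (observed - baseline) / std_err
    return NormalDist().cdf(z)


def verdict(confidence: float, required: float) -> str:
    """The analyst, not the math, picks `required`."""
    return "accept" if confidence >= required else "keep investigating"


# Hypothetical data: 520 positive outcomes out of 1000, against a 50% baseline.
conf = confidence_rate_exceeds(520, 1000, 0.5)
print(f"confidence: {conf:.3f}")
print("strict threshold (0.95): ", verdict(conf, 0.95))
print("relaxed threshold (0.85):", verdict(conf, 0.85))
```

The same evidence passes the relaxed threshold and fails the strict one. Nothing in the data says which threshold is the right one; that choice is exactly the line between satisfying and "satisficing."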
Back to the trough of disillusionment. Or, rather, forward to the trough. To minimize the depth of the fall, companies must be at a high enough (satisficing) level of analytical and enterprise information management maturity, combined with organizational support for innovation. Oops, I promised myself to be a reporter, not an analyst, in my blog.
The only consistent success reported by my clients is with log analysis using Splunk. Why? Because Splunk is a (nice) tool. And the plateau of productivity will be reached when tools and product suites saturate the market. Meanwhile, according to the Gartner Hype Cycle, the next stop for big data is negative press. Does this blog post count as such?
Follow Svetlana on Twitter @Sve_Sic