Rob Addy

A member of the Gartner Blog Network

Rob Addy
Research Director
3.4 years at Gartner
17 years IT industry

Rob Addy is a research director in Gartner's Technology & Service Provider Research division, focusing on software and hardware support services across EMEA. Mr. Addy also covers the provision of desktop support services in an outsourcing context within the region ...

Prediction Provides Questions; Not Answers

by Rob Addy  |  February 6, 2012  |  4 Comments

December 2012 marks the end of a time period in the Mesoamerican Long Count calendar. Some believe this is because the world will end. Unix time, at least when stored as a signed 32-bit count of seconds, runs out on Tuesday, 19th January 2038. So assuming we are all still here next February, should we believe that the world will end in 2038? Did the POSIX committee know something that we don’t? Only time will tell…
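For anyone who wants to check the date, here is a minimal sketch. It assumes the usual definition of Unix time as seconds since 1 January 1970 UTC, held in a signed 32-bit integer. The variable names are purely illustrative.

```python
from datetime import datetime, timedelta, timezone

# Largest value a signed 32-bit counter of seconds can hold.
MAX_INT32 = 2**31 - 1

# Unix time counts seconds from the epoch, 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

last_moment = EPOCH + timedelta(seconds=MAX_INT32)

print(last_moment.isoformat())  # 2038-01-19T03:14:07+00:00
print(last_moment.weekday())    # 1 (Monday is 0, so this is a Tuesday)
```

Systems that keep time in a 64-bit integer are not affected by this particular deadline.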

Whilst here at Gartner Towers we may lack the popular following of Nostradamus, we do try to anticipate how the industries and markets that we cover will change over time. Our current prophecies for Product Support can be found in “Predicts 2012: Product Support Market Will Weather the Cloud-Based Storm and Emerge Driving Value”.

Prediction can be very useful. Although often it isn’t. It can also be highly distracting. But providing it is based upon an appropriate evidence base and a statistically relevant analytical model constructed to take account of likely failure modes, inter-dependencies and historical performance data then it can even, dare one say it, be useful.

Predictive Support services are slowly beginning to come to market. The ability to predict and prevent system failures and problems will become paramount in the future as analytics excellence becomes the battleground for support providers. The relative accuracy of analytical models and their ability to narrow the predicted window of failure to something usable will differentiate support offerings. Predicting system failures 3 seconds in advance is practically useless. Predicting system failures 30 seconds in advance is marginally better. A predictive warning of 3 minutes plus opens up a whole heap of non-egg-boiling-related possibilities. Predicting that an issue will occur between 2pm and 4pm next Wednesday afternoon is incredibly useful.
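To make the point about usable warning windows concrete, here is a rough sketch that maps a predicted lead time to the responses it leaves room for. The lead times echo the examples above, but the action names and the minimum time assigned to each are hypothetical placeholders, not measurements from any real support operation.

```python
from datetime import timedelta

# Hypothetical minimum lead times for each response (illustrative values only).
ACTIONS = [
    (timedelta(seconds=3), "start emergency logging"),
    (timedelta(seconds=30), "snapshot state and alert on-call staff"),
    (timedelta(minutes=3), "restart processes or fail over"),
    (timedelta(days=1), "schedule planned maintenance with the customer"),
]


def feasible_actions(lead_time):
    """Return the responses that fit inside the predicted warning window."""
    return [name for minimum, name in ACTIONS if lead_time >= minimum]


print(feasible_actions(timedelta(seconds=45)))
# ['start emergency logging', 'snapshot state and alert on-call staff']
```

The longer and more tightly bounded the warning, the more of the list becomes available. What belongs on the list, and how long each item really takes, will differ by technology and provider.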

The following graphic shows some of the many potential ingredients of the predictive support analytical pie…

Note: Some “ingredients” are only available from specialist suppliers and consequently not all analytical pies will taste the same. Omitting some of the ingredients may or may not affect the culinary integrity of the pie and its ability to satisfy those with a hunger for prevention-based services :-)

Analytical models will incorporate a wide variety of data feeds. The hunger and perceived need for more and more data upon which to perform statistical analysis will lead to high levels of over-monitoring and over-collection in the short term. Data requirements will then gradually be scaled back as providers learn what they actually need to track in order to predict issues with the accuracy that they actually need. Organizations that are overly focused on developing the perfect analytical model, with 100% accurate predictions at the component level, will be overtaken by providers willing to play the odds and offer commercial terms based around less granular models that deliver predictions accurate enough to initiate appropriate actions to avoid or mitigate service-impacting events.
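One simple way to picture “playing the odds” is an expected-cost comparison: act when the predicted probability of failure multiplied by the cost of the outage exceeds the cost of acting. The sketch below is one possible decision rule and not a description of how any provider actually prices its services. The function name and the figures are made up, and it assumes the preventive action fully avoids the outage.

```python
def should_act(failure_probability, outage_cost, action_cost):
    """Act when the expected loss avoided is greater than the cost of acting.

    Assumes (for illustration) that the action completely prevents the outage.
    """
    return failure_probability * outage_cost > action_cost


# An imperfect model is still worth acting on when the stakes are high enough.
print(should_act(0.60, outage_cost=10_000, action_cost=500))  # True
print(should_act(0.01, outage_cost=10_000, action_cost=500))  # False
```

Under a rule like this, a model only needs to be accurate enough to put each prediction on the right side of the threshold. It does not need to be perfect.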

First-generation predictive models won’t necessarily prevent incidents. This is particularly true in the software support arena, where it is currently impractical to swap out a defective piece of code during run-time. However, predictive analytics still has a massive role to play in software support. One of the biggest problems facing providers when supporting complex software environments is the lack of evidence surrounding any particular failure or crash. When it all hangs, the data that you need to help troubleshoot the issue and prevent it happening again is typically lost. Prediction will enable the automatic initiation of low-level logging immediately prior to system failures. This will capture valuable data that will speed the diagnosis and resolution phase as well as providing a basis upon which to develop preventive actions.
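As a rough illustration of prediction-triggered logging, the sketch below raises a logger to DEBUG level when a risk score crosses a threshold and drops it back afterwards. It uses Python's standard `logging` module. The logger name, the `on_prediction` function, the risk score and the 0.8 threshold are all assumptions made for the example, and a real system would need a predictive model to supply that score.

```python
import logging

# Hypothetical application logger; normally only warnings and errors are kept.
logger = logging.getLogger("app")
logger.setLevel(logging.WARNING)

RISK_THRESHOLD = 0.8  # assumed cut-off, to be tuned per system


def on_prediction(risk_score, log=logger):
    """Switch to low-level (DEBUG) logging while a failure looks imminent."""
    if risk_score >= RISK_THRESHOLD:
        log.setLevel(logging.DEBUG)
    else:
        log.setLevel(logging.WARNING)
    return log.level


on_prediction(0.93)                       # failure predicted: capture everything
logger.debug("lock table state: %s", {})  # now passes the logger's level check
on_prediction(0.10)                       # risk has passed: back to normal
```

Keeping verbose logging off until it is needed avoids the storage and performance cost of collecting everything all of the time.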

But prediction isn’t just about avoiding system outages. It has many, many more uses than this. Some of these uses relate to the customer experience; others will help improve the operational performance of the support provider and enable it to make better commercial decisions. “Emerging Technology Analysis: Predictive Support Services” describes 9 use cases for predictive analytics within a support services context in detail.

The real question about prediction is not whether you can achieve it. You can. The real question is what you would do with those predictions once you could make them. The mathematicians, statisticians and analytical modellers will deal with the technicalities of creating meaningful and accurate predictions. Business leaders must then decide what it is that they intend to do with them thereafter!

Prediction is just another tool. And we should always remember that a fool with a tool is still a fool. But if we use the tool wisely then perhaps, just maybe, the future will be ours…

TRKFAM!

Category: Support Processes, Support Strategy, Technologies Underpinning Support

4 responses so far ↓

  • 1 Tony   February 13, 2012 at 2:03 am

    “Predicting system failures 3 seconds in advance is practically useless”

    I don’t agree. The ability to predict disaster 3 seconds ahead and ensure conditions leading up to the apocalypse are logged and safely stored is invaluable, both to correct the current issue and, hopefully, to resolve the underlying cause.

  • 2 Rob Addy   February 13, 2012 at 9:08 am

    Hi Tony

    Provided the 3 second warning gives one time to initiate a log thread, to capture all of the pre-requisite configuration and environmental data (including repeating flip flopping data points or error traps which may or may not have a cycle time that fits into the available window) and to write it to the log files, then yes, I agree that it may indeed be very useful. But these are some pretty big assumptions given the complexity of modern business systems, and consequently I remain somewhat sceptical of the value of such a condensed timeline.

    If one wants to go beyond pre-crash data collection and actually prevent the issue by killing and restarting process threads or by releasing database locks etc., then the 3 second window is highly likely to be insufficient…

    As with most things, there isn’t a black and white answer… 3 seconds might be good enough but then again it might not. The necessary prediction window will vary depending on the technology type, the likely remedial or contingent action required and the logistical constraints on implementing said action. The further out the prediction and the more tightly bounded the window the better… :-)

    Many thanks for your comment Tony!

    Rob

  • 3 Wipro Council for Industry Research   February 22, 2012 at 3:47 am

    Using analytics to study past performance is no longer going to show results – what is needed is to use predictive analytics for formulating strategies; find out the impact of changes before they happen! For example, what a retailer actually “does” with a piece of analysis like “7% of customers were responsible for 43% of the sales” – matters more than the actual statistic itself. We have done a study on application of predictive analytics in retail, you can access the findings at “http://www.wipro.com/Documents/ris_wipro_wp_0811_F.pdf”

  • 4 Rob Addy   February 22, 2012 at 8:33 am

    Many thanks for the infomercial ;-) I was about to consign the comment to the “spam bucket” but seeing as it reinforces my point I figured I’d let it stand.

    I agree that a rear view mirror only approach to analytics is outdated and sub-optimal.

    I also agree that analytics without action is pointless.

    But where I am at odds with you here is that your paper is still talking in generalities around customer segments and broad brush plans of action. From my perspective this is “proactive” in terms of the Gartner Product Support Maturity Scale but it is not “predictive”. Unless you make it personal (i.e. for a specific technology instance / implementation of a specific customer) and pertinent (i.e. real time) then for me it’s not “predictive”. I’m not saying that it doesn’t have value – after all, proactive is still “better” than “reactive” but it’s unlikely to avert the issue for all potential victims due to the asynchronous time lag issues and the scatter gun effect of generalization…

    :-)
