In the previous post, Prediction Models: Traditional versus Machine Learning, we looked at 3 kinds of prediction models and clarified the difference between traditional and machine learning models for prediction. In this post we’ll see that machine learning prediction models excel in conditions in which other prediction models suffer.
Key Characteristics of Prediction Models
Let’s look at how each type of prediction model (Traditional Statistical, Traditional Rules-Based, and Machine Learning) compares on a few key characteristics. Specifically, we’ll look at the following:
- What’s the method for determining the model’s optimal parameter values?
- Does the method for determining the model’s optimal parameter values need to be programmed by humans?
- Can the method for determining the model’s optimal parameter values handle data sets with a large number of rows and/or columns?
- Can the method for determining the model’s optimal parameter values
The table below captures the differences between the prediction models.
Determining Parameter Values
Recall from the previous post that the optimal parameters are the values that enable a model to make specific predictions. When the parameters of a model are not set, the model is an incomplete formula: its form is fixed, but there isn’t enough information to calculate a number from it. Here’s an example:
a * 16 + b * 12 = ?
We don’t know the answer to this equation until we know the specific values of the parameters a and b.
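To make this concrete, here is a minimal Python sketch of the same formula. The function name `predict` and the values chosen for a and b are hypothetical, picked only for illustration. The function cannot return a number until both parameters are supplied.

```python
def predict(a, b, x1=16, x2=12):
    """Evaluate the formula a * x1 + b * x2 for the given parameter values."""
    return a * x1 + b * x2


# Illustrative (made-up) parameter values; any other pair of numbers
# would complete the formula just as well.
result = predict(a=0.5, b=2)
print(result)
```

Calling `predict()` with no arguments raises a `TypeError`, which mirrors the point above: without values for a and b, there is nothing to compute.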
While the traditional statistical models require a lot of creativity (and statistical sophistication) to determine the parameters, the most common algorithm that machine learning models use for this purpose is called gradient descent. This algorithm is conceptually simple and I’ll explain it in a future post.
The gradient descent algorithm proceeds iteratively and ultimately discovers the optimal parameter values. But don’t confuse an algorithm that is programmed to discover with an algorithm that “writes itself”. Sometimes you hear that machine learning is all about computers writing their own code; as row 2 of the table above points out, this claim is fantasy.
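As a rough preview, the sketch below applies gradient descent to the formula a * x1 + b * x2 from earlier. Everything in it is an assumption made for illustration: the function name `gradient_descent`, the tiny made-up data set, the choice of mean squared error as the quantity to minimize, the learning rate, and the number of steps. Note that every line is written by a human; the only thing the algorithm discovers is the pair of numbers a and b.

```python
def gradient_descent(data, learning_rate=0.001, steps=5000):
    """Fit a and b in the model a * x1 + b * x2 by minimizing mean squared error."""
    a, b = 0.0, 0.0  # arbitrary starting guess for the parameters
    n = len(data)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for x1, x2, y in data:
            error = (a * x1 + b * x2) - y  # prediction minus observed value
            # Partial derivatives of the mean squared error w.r.t. a and b
            grad_a += 2 * error * x1 / n
            grad_b += 2 * error * x2 / n
        # Step each parameter a little in the direction that reduces the error
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Made-up rows of (x1, x2, y), generated from a = 0.5 and b = 2
data = [(16, 12, 32), (4, 10, 22), (8, 2, 8), (1, 5, 10.5)]
a, b = gradient_descent(data)
print(round(a, 3), round(b, 3))
```

On this small, well-behaved example the loop settles very close to the values that generated the data. In general, how well and how quickly gradient descent converges depends on the model, the data, and the learning rate.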
Where Machine Learning Models Excel
The last 3 rows of the table above outline the conditions in which machine learning models thrive while traditional models struggle. The machine learning prediction approach is particularly suited to data sets that:
- Have a large number of columns (each data point has a large number of attributes)
- Have a combination of categorical, numerical, and textual (or image, audio, video) data
It pays to try machine learning prediction models when you face these conditions, especially if other methods haven’t been able to make reliable predictions and there is a lot of business value to be gained by beating the existing prediction benchmark.
Bonus points if the marginal business utility of beating the benchmark prediction accuracy is also high, meaning that even small improvements over the existing benchmark are valuable.