Machine learning models are constructed differently from traditional quantitative models.
Two Types of Traditional Prediction Models
In the first type of traditional prediction model, the input data set along with statistical assumptions and calculations determine the prediction algorithm. The input data set is analyzed (or “fitted to the data”) using statistical techniques. The prediction algorithm that is the one best suited to describing the data as determined by the statistical analysis.
The second type of traditional prediction model uses an explicit set of rules (e.g., if X then Y) to transform the inputs into a prediction. Instead of the prediction algorithm being “discovered” through statistical calculations, these rules are usually ones that are known by experts in the prediction domain (e.g., the medical knowledge physicians have in diagnosing/predicting a disease).
Machine Learning Models
In contrast to the traditional quant prediction models, machine learning prediction models are developed in two steps.
In the first step, a machine learning model is trained. In this training step the input data, the historical results associated with these inputs, and a training algorithm are used to iteratively arrive at the prediction algorithm.
At the end of the first step, the model is “trained” and is now ready to make predictions.
In the second step, the prediction step, the trained machine learning model uses the prediction algorithm arrived at in the training step to transform new inputs into predictions.
You may have noticed that I’ve used “model”, and “algorithm”. Is there a difference? This gets us to some terminology that’s useful to have in mind when you read about machine learning.
A model is a mathematical expression that transforms inputs into outputs. The model by itself cannot be used to calculate a result. To do so requires fixing the model’s parameter values. In traditional approaches, the parameter values are fixed based on statistical calculations. In machine learning the parameter values are fixed in the process of training the model. Without its parameter values specified, a model is a structure, a shell, a high-level directive for transforming inputs into outputs. With the parameter values specified, the model can be used to generate specific predictions.
An algorithm is a recipe — a clearly defined step by step by step process — for turning a set of inputs into an output.
A prediction algorithm is the specific mathematical expression that results once the parameter values of the model are fixed. In machine learning these parameter values are fixed during the iterative training process which itself proceeds by using an algorithm. The most common of these training algorithms is called gradient descent.