This is a short introduction to a research note we published recently on Causal AI, which is accessible here: Innovation Insight: Causal AI.
Correlation is not causation
“Correlation is not causation” is often repeated, but rarely given the weight it deserves in AI. Correlations show us variables moving together in the data, but these relationships are not always causal.
We can only say that A causes B when an intervention that changes A would also change B as a result (whilst keeping everything else constant). For example, forcing a rooster to crow won’t make the sun rise, even if the two events are correlated.
In other words, correlations are the data we see, whereas causal relationships are the underlying cause-and-effect relationships that generate this data (see image below). Crucially, the data we typically work with exists in a complex web of correlations that obscure the causal relationships we care about.
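To make this concrete, here is a minimal, purely illustrative simulation of the rooster example (all numbers hypothetical, no real data): a hidden common cause — dawn approaching — makes crowing and sunrise move together in observational data, while forcing the rooster to crow leaves sunrise untouched:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hidden common cause: is dawn imminent right now?
near_dawn = rng.random(n) < 0.1

# Observational world: roosters tend to crow when dawn is near.
crow = np.where(near_dawn, rng.random(n) < 0.9, rng.random(n) < 0.05)
sunrise_soon = near_dawn  # the sun rises because of Earth's rotation

obs_corr = np.corrcoef(crow, sunrise_soon)[0, 1]

# Interventional world: we *force* roosters to crow at random times.
forced_crow = rng.random(n) < 0.5
int_corr = np.corrcoef(forced_crow, sunrise_soon)[0, 1]

print(f"observational correlation: {obs_corr:.2f}")   # strong
print(f"interventional correlation: {int_corr:.2f}")  # near zero
```

The correlation is real and even useful for prediction, but intervening on the “cause” does nothing, because crowing and sunrise only share a common driver.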
Despite their notable success, statistical models, including those in advanced deep learning (DL) systems, use surface-level correlations to make predictions. The current DL paradigm doesn’t drive models to uncover underlying cause-and-effect relationships but simply to maximize predictive accuracy.
Now, it is worth asking: what is the problem with using correlations for prediction? After all, to predict, we just need enough predictive power in the data, regardless of whether it comes from causal relationships or mere statistical correlations. For instance, hearing a rooster crow is useful for predicting sunrises.
The core problem lies with the brittleness of the predictions. For correlation-based predictions to remain valid, the process that generated the data needs to remain the same (e.g., the roosters need to keep crowing before sunrise).
There are two fundamental challenges with this correlation-based approach:
Problem #1: We want to intervene in the world
Prediction is rarely the end goal. We often want to intervene in the world to achieve a specific outcome. Anytime we ask a question of the form “How much can we change Y by doing X?”, we are asking a causal question about a potential intervention. An example would be: “What would happen to customer churn if we increased a loyalty incentive?”
And the problem with correlation-based predictive models, such as those built with Deep Learning, is that our actions are likely to change the data-generation process, and therefore the statistical correlations we see in the data, rendering correlation-based predictions useless for estimating the effects of interventions.
For instance, when we use a churn model (prediction) to decide whether to give a customer a loyalty incentive (intervention), the incentive affects the very process that generated the prediction (we hope the incentive makes the customer stay). In this case, causality really matters: we can’t simply use correlations to answer questions about what would happen if we took an action; we need to run controlled experiments or use causal techniques to estimate the effects.
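To see why the naive observational comparison misleads here, consider this small simulated sketch (all variable names and numbers are hypothetical): when incentives are targeted at at-risk customers, the incentive looks harmful in the raw data, while a randomized experiment recovers its true effect:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical setup: customer "risk" drives both churn and, in the
# observational data, who receives a loyalty incentive.
risk = rng.random(n)

# Observational world: incentives target risky customers.
incentive = rng.random(n) < risk
churn = rng.random(n) < (0.2 + 0.6 * risk - 0.15 * incentive)  # true effect: -15 pts

naive = churn[incentive].mean() - churn[~incentive].mean()

# Randomized experiment: a coin flip assigns the incentive, breaking
# its link with risk, so the comparison recovers the causal effect.
incentive_rct = rng.random(n) < 0.5
churn_rct = rng.random(n) < (0.2 + 0.6 * risk - 0.15 * incentive_rct)
rct = churn_rct[incentive_rct].mean() - churn_rct[~incentive_rct].mean()

print(f"naive observational estimate: {naive:+.2f}")  # misleadingly positive
print(f"randomized estimate:          {rct:+.2f}")    # close to -0.15
```

The naive estimate is positive (incentivized customers churn more) only because the incentive was given to customers who were already likely to leave.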
Problem #2: The data generation process changes frequently
Even without our intervention, the context in which AI is deployed changes all the time, rendering previously useful correlations useless for prediction in the new environment. A recent example is how many house-pricing AI predictive models worked well during normal economic conditions, but then deteriorated as the real-estate market changed.
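A toy sketch of this brittleness (hypothetical numbers, not a real pricing model): a simple linear price-per-square-foot model fit under one market regime degrades sharply when the regime shifts:

```python
import numpy as np

rng = np.random.default_rng(3)

def market(n, price_per_sqft):
    """Simulate house sales under a given (hypothetical) market regime."""
    sqft = rng.uniform(500, 3000, n)
    price = price_per_sqft * sqft + rng.normal(0, 10_000, n)
    return sqft, price

# Fit a simple linear model in "normal" conditions...
sqft_tr, price_tr = market(5_000, price_per_sqft=200)
slope, intercept = np.polyfit(sqft_tr, price_tr, 1)

# ...then score it after the market shifts.
sqft_ok, price_ok = market(5_000, price_per_sqft=200)
mae_same = np.abs(slope * sqft_ok + intercept - price_ok).mean()

sqft_te, price_te = market(5_000, price_per_sqft=120)
mae_shift = np.abs(slope * sqft_te + intercept - price_te).mean()

print(f"MAE, same market:    {mae_same:,.0f}")
print(f"MAE, shifted market: {mae_shift:,.0f}")  # much larger
```

The fitted correlation was perfectly good until the data-generating process it captured stopped holding.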
For AI to be implemented widely, it needs to be more robust and trustworthy than current Deep Learning models are. Causal relationships are typically more foundational and change more slowly than statistical correlations. At the extreme, we can build causal models with variables describing how the Earth rotates and how it is tilted relative to the Sun, and use them to make robust predictions about when sunrise will happen in different locations (without listening to any roosters!). Crucially, causal models don’t need to be this sophisticated to be useful – humans intuitively use simple causal models all the time (e.g., a ball falls if we drop it).
For these reasons, we argue in our Innovation Insight: Causal AI note that AI needs to go beyond correlation-based Deep Learning predictive models and towards more robust AI systems that can prescribe actions more effectively (see image below).
We call this space Causal AI, and it includes a wide variety of techniques, like causal graphs, that help uncover and apply causal relationships to improve decision making. We invite you to explore this note and available research on this upcoming AI trend.
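For a flavor of what such techniques look like, here is a minimal sketch of backdoor adjustment on an assumed three-variable causal graph (all variables and numbers hypothetical): when the confounder is observed, conditioning on it recovers a causal effect from observational data alone, without running an experiment:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Assumed causal graph (illustrative): risk -> incentive, risk -> churn,
# incentive -> churn. "risk" is an observed confounder.
risk = rng.random(n) < 0.5                            # binary confounder
incentive = rng.random(n) < np.where(risk, 0.8, 0.2)  # targets risky customers
churn = rng.random(n) < (0.2 + 0.4 * risk - 0.15 * incentive)

# Backdoor adjustment: average the within-stratum contrasts,
# weighted by P(risk).
effect = 0.0
for r in (False, True):
    stratum = risk == r
    treated = stratum & incentive
    control = stratum & ~incentive
    contrast = churn[treated].mean() - churn[control].mean()
    effect += contrast * stratum.mean()

print(f"adjusted effect of incentive on churn: {effect:+.3f}")  # near -0.15
```

This only works because the assumed graph tells us which variable to adjust for — which is exactly the kind of knowledge causal-graph techniques make explicit.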
The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.