Human Interviewer, “On my desk is this photo of the night sky. Can you tell me how the universe was created?”
Renee Robot, “Of course, with a 20 megapixel camera.”
“If the Solution to the problem isn’t in the data, AI is useless.”
In mid-2020 I ran across an article headline reading “AI can’t predict a child’s future success, no matter how much data we give it: A Swing and a Miss.” The headline itself didn’t surprise me. What grabbed me was that anyone would think it could. I was intrigued enough to take a look.
The article reported on an early 2020 effort in which a small team of Princeton social scientists led a mass collaborative study that challenged 160 research teams to predict the life trajectories of a set of children from a large data set. Many of the teams applied machine learning techniques to make their predictions.
This Princeton-led collaborative study was conducted against a very robust six-wave longitudinal data set collected over 15 years as part of the Fragile Families and Child Wellbeing Study. That study tracked thousands of families in large cities who had a child born around the year 2000. Its purpose was to observe families formed by unmarried parents and study the lives of the children in these families.
In the Princeton-led challenge, the researchers were given the data from waves 1-5 (birth to age 9) as their background data. This data consisted of 12,942 variables about each of 4,242 families. The challenge administrators then chose six of the 1,617 variables from wave 6, which was withheld from the 160 research teams, as the targets for prediction. These six included child grade point average (GPA), child grit, and household material hardship, among others. The 160 teams then set out to predict these six variables, which served as the “success” criteria.
None of the predictions were accurate across all six success outcome variables. However, some observations, such as the GPA of specific children, were accurately predicted by all teams, whereas other observations were poorly predicted by all teams. Despite the variety of data processing and machine learning approaches, the predictions were similar across research teams.
Can you believe AI failed this measly little prediction task of accurately predicting six life outcomes for thousands of children? Hopefully, my sarcasm is coming through loud and clear. As I suspected, this study has all the hallmarks of the misapplication of artificial intelligence, including poorly designed outcomes, the wrong data, and variable overload. I will use this example and others to illustrate the spectrum from what AI does well to what it can’t do at all.
AI Suitability is Primarily Determined by Four Factors
With the hype and ubiquity of AI, it seems that it can tackle just about any problem. Maybe theoretically, but in reality, AI is ill-suited for many challenges. AI suitability runs on a spectrum from tight fit to pretty much impossible, and essentially four factors determine where any problem lies on that spectrum. Yes, there are many other factors to consider in successfully taking an AI initiative from start to finish. But in examining the core nature of AI, I see four primary factors that determine the “level of difficulty” of any particular problem (business, scientific, medical, sociological, etc.). Executives need to be able to quickly assess business problems on this spectrum and avoid applying AI to efforts on the impossible half of the spectrum. These four factors are:
- Known result
- Data availability
- Success probabilities
- Consequences of failure
Remember that here we are exploring the “applicability of AI” to business problems. This is just one part of an overall business case analysis and successful application. In other words, if the business problem is not worth solving it should never get to this AI applicability exploration. Also, great applicability means little in the face of poor AI execution.
One of the big benefits of the model below is its universality. It applies to literally any AI challenge. If you are a pharma giant trying to determine where to fit AI into your drug discovery process, it works. If you are a retailer trying to apply AI to figure out how to market better to customer segments, it works. If you are an application software technology provider trying to incorporate AI into your product, guess what, it works.
Figure 1 is a basic model to illustrate fundamental AI applicability. It rates the four factors as high, medium or low. This is adequate to assess AI applicability to some business problems. Others may be more nuanced and require a deeper dive into the four factors. Business executives who are so inclined can have their teams increase the complexity of the model commensurate with the difficulty of the analysis. They can decompose the four factors into sub-parts that are relevant to the company’s industry and mission or to the particular business challenge at hand. In addition, they can replace the high, medium and low ratings with a numerical scale for each sub-part. And they can weight the sub-parts to capture relative importance. There are many opportunities to turn the model into a more robust aid for AI decision making.
Figure 1: AI Applicability Model
| Rating | Known Result | Data Quality | Success Probabilities | Consequences |
|---|---|---|---|---|
| High | Very clear problem statement with well understood outcomes. | Have high quality data that contains answers to the problem. | Acceptable rates of false negatives and false positives are known and expected results fall within those boundaries. | The consequences of failure are minimal. |
| Medium | Clear problem statement with loosely defined outcomes. | Have data but the quality relative to the problem statement is unknown or lacking. | Acceptable rates of false negatives and false positives are known but expected results are unknown. | The consequences of failure are moderate and can be mitigated. |
| Low | Problem statement and outcome are loosely defined. | Don’t have the data and must acquire it. | Acceptable rates of false negatives and false positives are unknown. | The consequences of failure are dramatic. |
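For teams that want to try the numerical, weighted variant of the model, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the model itself: the 1-3 mapping for low, medium and high, the sub-part names, the weights, and the hypothetical project being rated. As in Figure 1, a higher score always means a better fit for AI, so a "high" on consequences means the consequences of failure are minimal.

```python
from dataclasses import dataclass

# Assumed numeric stand-ins for the ratings in Figure 1 (not from the article).
RATING_SCORE = {"low": 1, "medium": 2, "high": 3}


@dataclass
class SubPart:
    name: str            # hypothetical sub-part label chosen by the team
    rating: str          # "low", "medium" or "high"
    weight: float = 1.0  # relative importance within its factor


def factor_score(sub_parts):
    """Weighted average of the sub-part ratings, on the assumed 1-3 scale."""
    total_weight = sum(p.weight for p in sub_parts)
    if total_weight <= 0:
        raise ValueError("sub-part weights must sum to a positive number")
    weighted = sum(RATING_SCORE[p.rating] * p.weight for p in sub_parts)
    return weighted / total_weight


def applicability(factors):
    """Return the score of each factor and the name of the weakest factor."""
    scores = {name: factor_score(parts) for name, parts in factors.items()}
    weakest = min(scores, key=scores.get)
    return scores, weakest


# A hypothetical project, rated by a hypothetical team.
example = {
    "known_result": [
        SubPart("problem statement", "high", weight=2.0),
        SubPart("outcome definition", "medium", weight=1.0),
    ],
    "data_quality": [
        SubPart("data on hand", "medium", weight=1.0),
        SubPart("data contains the answer", "low", weight=3.0),
    ],
    "success_probabilities": [SubPart("error tolerance known", "high")],
    "consequences": [SubPart("cost of failure", "medium")],
}

scores, weakest = applicability(example)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("weakest factor:", weakest)
```

The sketch reports the weakest factor instead of a single blended total on purpose. One low factor, such as data that does not contain the answer, can sink a project even when the other three look strong, and an average would hide that.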
Let’s examine several example AI use cases to illustrate this model.
High Value Use Cases are Defined Clearly with Good Data and Minimal Negative Consequences
It makes perfect sense that the high value use cases would be the most successful ones we see today. Two of them are advertising and customer support.
Facebook runs thousands of algorithms against one of the world’s largest sociological data sets (Facebook activity data) to target advertising to members of its social media platform. Delivering relevant ads to its members is a very clear problem statement with a single well defined outcome. Facebook has a tremendous amount of activity data to train and refine its algorithms. Though it strives for high accuracy, the acceptable margin for error is relatively high. And the negative consequences of serving up an ignored ad are minimal. Although I’m using Facebook as an example, this use case is nearly pervasive in the world of ecommerce marketing.
Another prevalent and growing use case for AI is in customer support. Virtual call center agents are increasingly serving customers for basic needs like checking account balances, making a payment, opening a new account, renewing a subscription, filing an insurance claim, setting up appointments, basic tech support, etc. In these cases the problem is very well known and getting to the outcome is often driven by a defined process. The data requirement is minimal and often centers on existing structured customer or operational data. The AI challenge lies in understanding the customer’s request through a conversational user interface powered by natural language processing (NLP).
The state of NLP technology drives quality and error rates. Generally, the consequences are a frustrated customer. These consequences are mitigated by transferring the customer to a human representative as soon as the customer requests it or the system recognizes that the AI is not working for the customer. So this is between medium and high on the spectrum of AI’s potential to deliver value.
Medium Value AI Applications Have Broad Problem Statements with Ambiguous Outcomes
Organizations are beginning to use AI for HR recruiting support. The thought of AI substituting for a recruiter is viscerally discomfiting for many. However, there are bright spots. Companies are not going to hire an army of recruiters. So using AI can actually enable companies to review more candidates and provide opportunity exposure to more people. Using AI for early resume screening helps reduce the unmanageable number of resume submissions. From there, the human approach takes over.
This use case mostly falls within the medium value range for AI. The problem statement is well understood and is usually captured in a job description. The desired outcome is generally understood but the specifics are unknown. The data is available in the form of resumes, but it must be sourced externally and its quality varies significantly. Accuracy is difficult to gauge, but the acceptable margin of error is relatively high, especially when human screening is also part of the process. The negative consequences revolve around ineffective hiring, employment law infractions (illegal bias), and potential public scrutiny of hiring practices. These risks are relatively minor and can be mitigated.
Low AI Suitability Attempts to Solve High Risk Problems with Unsuitable Data and Poorly Defined Outcomes
There are examples of successful applications of AI in healthcare and the list will continue to grow. But so far, diagnosing and treating cancer is not one of them. Probably one of the best-known examples of an ineffective application of AI is IBM Watson for Oncology. Doctors rejected the system, claiming it was giving bad advice that was potentially harmful to patients. This example is the epitome of a low success potential and high risk scenario. The problem set is very broadly defined and the outcomes are numerous and very complex. High quality real patient data was difficult to obtain. It was so difficult to manage that the Watson for Oncology team shifted to using hypothetical data. The acceptable rate of false positives and false negatives is very low, and the actual rate is almost impossible to predict. Lastly, since the results impacted people’s health and could be a matter of life or death, the negative consequences of failure were very high.
The Fragile Families and Child Wellbeing effort we explored earlier is a good example of a low value application of AI. Remember that the teams were trying to predict a child’s future success based on acquired data that was not intended to address that problem statement. The problem statement of predicting child success is loosely defined, and the teams didn’t have a clear understanding of the desired output. This lack of clarity made it difficult to know if the answer was in the acquired data. Determining accuracy expectations was also very difficult. However, the negative consequences were minimal, so this application is not the lowest of the low.
Apply a Basic Litmus Test to Rapidly Reject Poorly Suited AI Projects
Business leaders can apply a basic tool like this as a litmus test. It can save them from pursuing efforts that will stunt the organization’s AI progress. It also starts to illuminate the particular accelerators and inhibitors to success. Business leaders can think of this screening as they would a set of people working in a back room on a potentially sensitive project. It is critical to ask the right questions to understand the risks.
Key business leadership questions to ask when exploring AI project potential:
- What does the right answer look like?
- Are we sure that the data we have will lead to the answer?
- How many variables are we looking at and what are the chances we will hit the right answer?
- What are the risks to our business if we get the answer wrong and are the rewards worth the risks?
At this point, hopefully I’ve provided some basic knowledge business leaders can use to ask the right questions, assess the answers and make better decisions on what AI projects to consider and which to shelve.
I will dive deeper into the four factors in subsequent posts. I’m always interested in hearing your thoughts, so feel free to share them.
The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.