A Gartner ‘Predict’, published in the Wall Street Journal this weekend (Fake It to Make It: Companies Beef Up AI Models With Synthetic Data), says: “By 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated.” The Predict was shared by our very own Erick Brethenoux, who leads our AI research.
This is a very important Predict because synthetic data has many uses. The article in this case focuses on fraud, and I have already blogged on the use of synthetic data in healthcare; see Re-engineering the Decision – Our Storyline for Data and Analytics. Synthetic data could well touch every industry.
Synthetic data has a bright future if you think about it. The new set of normals (there may not be a single new normal for some time) that organizations will experience going forward, including growth, risk, opportunity and stress, all at the same time, triggers a need to re-think how executives and everyone else make decisions. This is why our data and analytics storyline is focused on “re-engineering the decision”. For example, synthetic data can help you cope when decision making breaks down when:
- Estimation or forecast models based on historical data no longer work
- Assumptions based on past experience fail
- Algorithms cannot reliably model all possible events due to gaps in real-world data sets
With judicious use, synthetic data can help augment efforts to use new data sources such as small and wide data; see Top Trends in Data and Analytics for 2021: From Big to Small and Wide Data.
But have you even heard of synthetic data? If you use or are familiar with digital twins, you are in a related space. Digital twins are very structured synthetic (i.e. digital or data) substitutes or “doubles” of physical things used in a digital model. More broadly, synthetic data helps expand and fill out data sets that have gaps or issues. In January of this year, some of our team published a piece of research on the topic: Maverick* Research: Forget About Your Real Data — Synthetic Data Is the Future of AI. Maverick research is not meant to be a defensible piece of advice built on years and years of experience and analysis; it is a reach into the future to provoke reaction and new thinking. Many Maverick ideas die naturally; some come true. It could well be that synthetic data is coming to a decision of yours very soon!