There is a very important point to understand about the vendors using ML for threat detection.
Usually ML is used to identify known behavior, but with variable parameters. What does that mean? It means that many times we know what bad looks like, but not how exactly it looks like.
For example, we know that data exfiltration attempts will usually exploit certain protocols, such as DNS. But data exfiltration via DNS can be done in multiple ways. So, what we do to detect it is to use ML to learn the normal behavior, according to certain parameters. Things like amount of data on each query, frequency of queries, etc. Anomalies on these parameters may point to exfiltration attempts.
On that case ML helps us find something we already know about, but the definition is fuzzy enough that prevents us from using simple rules to detect it. This is an example of unsupervised ML used to detect relevant anomalies for threat detection. There are also many examples of using supervised ML to learn the fuzzy characteristics of bad behavior. But as you can see, a human had to understand the threat, how it operates, and then define the ML models that can detect the activity.
If you are about to scream “DEEP LEARNING!”, stop. You still need to know what data to look at with deep learning, and if you are using it to learn what bad looks like, you still need to tell it what is bad. We ended up at the same place.
Although ML based detection is a different detection method, the process is still very similar to how signatures are developed.
What haven’t been done yet is AI that can find threats not defined by a human. Most vendors use misleading language to lead people to think they can do it, but that doesn’t exist. Considering this reality, my favorite question to these vendors is usually “what do you do to ensure new threats are properly identified and new models developed to identify them?”. Isn’t that interesting that people buy “AI” but keep relying on the human skills from the vendor to keep it useful?
If you are a user of these technologies, you’ll usually need to know what the vendor does to keep what the tools looks for aligned to new threats. For the mature shops, you also need to know if the tool allows you to do that yourself, if you want/need.
That’s a good way to start the conversation with a “Cybersecurity AI” vendor; see how fast they fall into the trap of “we can find unknown unknowns”.