We may sound pedantic when pointing we should be talking about Machine Learning, and not AI, for security threat detection use cases. But there is a strong reason why: to deflate the hype around it. Let me quickly mention a real world situation where the indiscriminate use of those terms caused confusion and frustration:
One of our clients was complaining about the “real Machine Learning” capabilities of a UEBA solution. According to them, “it was just rule based”. What do you mean by rule based? Well, for them, having to tell the tool that it needs to detect behavior deviations on the authentication events for each individual user, based on the location (source IP) and on the time of the event, is not really ML, but a rule based detection. I would say it’s both.
Yes, it is really a rule, as you have to define what type of anomaly (to the data field – or ‘feature’ – level) it should be looking for. So, you need to know enough about the malicious activity you are looking for, so you can specify the type of behavior anomaly it will present.
But on this “rule”, how do you define what “an anomaly” is? That’s where the Machine Learning goes. The tools will have to automatically profile each individual user authentication behavior, focusing on those data fields specified from the authentication events. You just can’t do it with, let’s say, a “standard SIEM rule”. There is real Machine Learning being used there.
But what about AI – Artificial Intelligence? ML is a small subset of a field of knowledge known as AI. But the problem is that AI has much more than just ML. And that’s what that client was expecting when they complained about the “rules”. We still need people to figure out those rules and write the ML models to implement them. There’s no machine capable of doing that – yet.
There have been some attempts based on “deep learning” (another piece of the AI domain), but nothing concrete exists. You can always point ML systems to all data collected from your environment so it can point to anomalies, but you’ll soon find out there are far more anomalies that are not related to security incidents than you are lead to believe by some pixie dust vendors. Broad network based anomaly detection has been around for years, but it hasn’t been able to deliver efficient threat detection without a lot of human work to figure out which anomalies are worth investigating.
Some UEBA vendors have decent ML capabilities, but they are not good on defining good rules/models/use cases to apply it. So, you may end up with good ML technology, but with mediocre threat detection capabilities, if you don’t have good people writing the detection content. For those going through the “build you own” path, this is even more challenging, as you need the magical combination of people who understand threats and what type of anomalies they would create and people who understand ML to write the content to find them.
Isn’t that just like SIEM? Indeed, it is. People bought SIEM in the past expecting to avoid the IDS signature development problem. Now they are repeating the same mistake buying UEBA to avoid the SIEM rules development problem. Do you think it’s going to work this time?