As I alluded to here, my long-held impression is that no true anomaly-based network IDS (NIDS) has ever been successful commercially and/or operationally. There were some bits of success, to be sure (“OMG WE CAN DETECT PORTSCANS!!!”), but taken together they (IMHO) don’t quite add up to SUCCESS for the approach.
In light of this opinion, here is a fun question: do you think the current generation of machine learning (ML)- and “AI”-based (why is AI in quotes?) systems will work better? Note that I am aiming at a really, really low bar: will they work better than – per the above statement – not at all? But my definition of “work” includes “work in today’s messy and evolving real-life networks.”
This is actually a harder question than it seems. Naturally, ML and “AI” aficionados (who, as I am hearing, are generally saner than the blockchain types … those are more akin to clowns, really) would claim that of course “now with ML, things are totally different”, “because cyber AI” and “next next next generation deep learning just works.”
On the other hand, some of the rumors we are hearing mention that in noisy, flat, poorly managed networks anomaly detection devolves to … no, really! … to signatures and fixed activity thresholds where humans write rules about what is bad and/or not good.
Before we delve into this, let’s think about the meaning of the term ANOMALY. In the past, “anomaly-based” was about silly TCP stack protocol anomalies and other “broken packets.” Today it seems that the term “anomaly” applies to mathematical anomalies in longer-term activity patterns – and not merely packets like in the 1990s.
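To make “mathematical anomalies in longer-term activity patterns” a bit more concrete, here is a minimal sketch, assuming a simple per-host z-score baseline. The function name `zscore_anomalies`, the sample hosts and the threshold of 3.0 are all made up for illustration; this is not how any particular product does it.

```python
from statistics import mean, stdev

def zscore_anomalies(history, today, threshold=3.0):
    """Flag hosts whose value today deviates strongly from their own baseline.

    history: dict of host -> list of past daily values (e.g. MB sent out)
    today:   dict of host -> today's value
    threshold: hypothetical cutoff in standard deviations
    """
    flagged = {}
    for host, values in history.items():
        if len(values) < 2 or host not in today:
            continue  # not enough baseline to say anything
        mu = mean(values)
        sigma = stdev(values)
        if sigma == 0:
            continue  # perfectly flat baseline; z-score is undefined
        z = (today[host] - mu) / sigma
        if abs(z) >= threshold:
            flagged[host] = round(z, 2)
    return flagged

# Illustrative data only: one quiet host, one noisy host.
history = {
    "10.0.0.5": [100, 110, 95, 105, 90, 100, 100],
    "10.0.0.9": [2000, 50, 900, 10, 3000, 400, 1500],
}
today = {"10.0.0.5": 400, "10.0.0.9": 2500}
flagged = zscore_anomalies(history, today)
print(flagged)
```

Note what happens with the noisy host: its baseline is so erratic that a large transfer does not stand out at all, which is roughly the “messy real-life networks” problem in miniature.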
So, will it work? This cannot really be answered without asking “work to detect what?”
Let’s go through a few examples we are hearing about:
- C2/C&C connection from malware to an UNKNOWN [for known, signatures and TI work well, no need to ML it] piece of attacker infrastructure – this was reported to work by some people, and it is not a stretch to imagine that anomaly detection can work here, at least some of the time
- Connection to some malicious domain [UNKNOWN to be bad at detection time, see above] – DGA domain detection is now baby’s first ML, so it does work [with some “false positives”, but then again, this is a separate question]
- Internal recon such as a port scan – it works, but then again, this is probably the only thing where the old systems also worked [but with false alarms too]
- Stolen data exfiltration by an attacker – we’ve heard some noises that it may work, but then again – we’ve heard the same about DLP. IMHO, the jury is still out on this one… Let’s say I think anomaly detection may detect some exfiltration some of the time with some volume of “false positives” and other “non-actionables”
- Lateral movement by the attacker – the same as above, IMHO, the jury is still out on this one and how effective it can be in real life. I’d say we’ve heard examples where it worked, and some where it was too noisy to be useful or failed outright.
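For contrast, the port scan case from the list above is the kind of thing that needs no ML at all. Below is a rough sketch of a fixed-threshold rule, assuming flow records shaped as `(src_ip, dst_ip, dst_port)` tuples; the `find_scanners` name and the `max_ports=20` cutoff are hypothetical, picked by a human, which is exactly the “signatures and fixed activity thresholds” mode mentioned earlier.

```python
from collections import defaultdict

def find_scanners(flows, max_ports=20):
    """Return sources that touched more than max_ports distinct ports on one target.

    flows: iterable of (src_ip, dst_ip, dst_port) tuples (assumed record shape)
    max_ports: hypothetical, human-chosen threshold
    """
    ports_seen = defaultdict(set)
    for src, dst, port in flows:
        ports_seen[(src, dst)].add(port)
    return sorted({src for (src, _dst), ports in ports_seen.items()
                   if len(ports) > max_ports})

# Illustrative data only: one host sweeping 100 ports, one host doing normal HTTPS.
flows = [("10.0.0.7", "10.0.0.1", p) for p in range(1, 101)]
flows += [("10.0.0.8", "10.0.0.1", 443), ("10.0.0.8", "10.0.0.2", 443)]
scanners = find_scanners(flows)
print(scanners)
```

A vulnerability scanner or a chatty monitoring box would trip this rule too, hence the false alarms.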
Apart from that, I’ve seen some naïve attempts to use supervised ML to train systems to learn good/bad traffic in general. IMHO, this is a total lost cause. It worked brilliantly for binaries (pioneered by “Vendor C”, for example), but IMHO this is 100% hopeless for general network traffic.
Finally, if the above detection benefits do not materialize for you, we are back in the “dead packet storage” land (albeit with metadata, not packets).