One of the famous insults that security vendors use against competitors nowadays is “RULE – BASED.” In essence, if you want to insult your peers who, in your estimation, don’t spout “AI” and “ML” often enough, just call them “rule-based” 🙂
Sure, OK, we all can laugh at claims of “cyber AI” (and we do, often), but what is the reality layer under this? I suspect there is a spectrum that may be worth thinking about…
First, here is a Snort rule (source):
alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"PUA-ADWARE Lucky Leap Adware outbound connection"; flow:to_server,established; content:"/gdi?alpha="; fast_pattern:only; http_uri; content:"|0D 0A|Cache-Control: no-store,no-cache|0D 0A|Pragma: no-cache|0D 0A|Connection: Keep-Alive|0D 0A 0D 0A|"; content:!"Accept"; http_header; content:!"User-Agent:"; http_header; metadata:impact_flag red, policy balanced-ips drop, policy security-ips drop, ruleset community, service http; reference:url,www.virustotal.com/en/file/43c6fb02baf800b3ab3d8f35167c37dced8ef3244691e70499a7a9243068c016/analysis/1395425759/; classtype:trojan-activity; sid:30261; rev:7;)
Nobody sane will deny that this is “rule-based” threat detection; this is a NIDS signature. Same logic applies to tools that do threat intelligence (TI) matching to logs and traffic – even though TI is not exactly signatures.
The defining characteristics of a signature are (or so I think; we people with big egos often forget to add “I think” to our positions):
- Focuses on “known bad”
- Describes specific badness
- Names the exact type of badness
- Latches onto precise characteristics of badness behavior and/or nature (note the “behavior and/or nature” part!)
- (anything else I missed?)
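To make the “precise characteristics” point concrete, here is a rough Python stand-in for what the Snort signature above checks. This is only a sketch: it does not reproduce real Snort semantics (no flow state, no fast_pattern, no distinction between raw payload and the parsed header buffer), and the function and constant names are my own inventions for illustration.

```python
# Simplified stand-in for the Snort signature above; NOT real Snort semantics.
# All names here are hypothetical and exist only for this illustration.
REQUIRED_URI_FRAGMENT = "/gdi?alpha="
REQUIRED_HEADER_BLOCK = (
    "\r\nCache-Control: no-store,no-cache"
    "\r\nPragma: no-cache"
    "\r\nConnection: Keep-Alive\r\n\r\n"
)
FORBIDDEN_HEADER_FRAGMENTS = ("Accept", "User-Agent:")


def matches_lucky_leap_signature(uri: str, raw_headers: str) -> bool:
    """Return True only if every precise trait of the known-bad request is present."""
    if REQUIRED_URI_FRAGMENT not in uri:
        return False
    if REQUIRED_HEADER_BLOCK not in raw_headers:
        return False
    # The adware's request is distinctive partly for what it omits.
    return not any(frag in raw_headers for frag in FORBIDDEN_HEADER_FRAGMENTS)
```

Every condition is tied to one specific, named piece of badness. Change the URI path or add a User-Agent header and the match is gone, which is exactly what makes this a signature.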
Now, how about this example:
IF Application_Protocol = FTP AND Destination_IP_Class=external AND Data_Transfer_Volume > 10MB THEN <ALERT>
Is this a rule? I’d say this is a rule, but probably not a signature. I think the essential characteristics are:
- Focuses on expected badness, but perhaps not on exact “known bad”
- Latches onto broad characteristics of badness behavior and/or nature
Latching onto the precise nature of badness is gone.
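For comparison, the FTP rule above might look like this in Python. This is a minimal sketch: the `Flow` record and its field names are hypothetical, and I am reading “10MB” as 10 MiB, which is an assumption.

```python
from dataclasses import dataclass

TEN_MB = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


@dataclass
class Flow:
    """Hypothetical flow record; field names mirror the pseudocode rule."""
    application_protocol: str
    destination_ip_class: str  # "internal" or "external"
    bytes_transferred: int


def large_external_ftp(flow: Flow) -> bool:
    """Alert on any big outbound FTP transfer, whatever caused it."""
    return (
        flow.application_protocol == "FTP"
        and flow.destination_ip_class == "external"
        and flow.bytes_transferred > TEN_MB
    )
```

Note what is missing compared to a signature: no malware family name, no specific byte pattern, just a broad shape of suspicious behavior.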
OK, how about this?
IF Application_Protocol = FTP AND User_Group=admins AND Data_Transfer_Volume > 2*(Average_User_Peers) THEN <ALERT>
Still a rule, eh? This last example references a metric, Average_User_Peers, that is presumably based on a running average (that is what we used to call it; now they just call it machine learning…). To me, the above is a rule, a pattern, or perhaps a rule-with-a-caveat. It is clear that we are entering fuzzy territory here. Purists start to cringe. Cyber AI appears.
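A sketch of how that peer-baseline rule might be evaluated follows. I am assuming Average_User_Peers is a plain mean over the peers' transfer volumes; the function name, parameters, and the choice to stay silent when there is no peer data are all mine.

```python
from statistics import mean


def exceeds_peer_baseline(protocol, user_group, user_bytes, peer_bytes, multiplier=2.0):
    """Alert when an admin's FTP volume exceeds `multiplier` times the peer average.

    `peer_bytes` is a hypothetical list of transfer volumes for the user's peers.
    """
    if protocol != "FTP" or user_group != "admins" or not peer_bytes:
        return False  # no baseline available means no alert in this sketch
    return user_bytes > multiplier * mean(peer_bytes)
```

The IF/THEN structure is unchanged; only the threshold moved from a constant to something computed from data. That small shift is where the rule-versus-ML argument starts.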
What about a robot-written rule? Say some unsupervised ML logic reveals that FTP data transfers larger than double the average among user peers are 77.3% likely to be malicious. We are well into fuzzy territory here! Purists freak out. Cyber AI frowns at you. Is an algorithm-written rule a rule? Now we enter the very philosophical core of the fuzzy territory…
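Here is one hedged way to picture a “robot-written rule”. To keep the sketch short, I assume a history of past transfers already tagged as malicious or not, which is a simplification of the unsupervised case described above. The function names, candidate multipliers, and the 0.75 cutoff are all made up for illustration.

```python
def derive_rule(history, candidate_multipliers=(1.5, 2.0, 3.0), min_precision=0.75):
    """Pick the lowest multiplier whose alerts were malicious often enough.

    `history` is a hypothetical list of (ratio_to_peer_average, was_malicious).
    Returns (multiplier, precision), or None if no candidate qualifies.
    """
    for m in sorted(candidate_multipliers):
        flagged = [bad for ratio, bad in history if ratio > m]
        if flagged:
            precision = sum(flagged) / len(flagged)
            if precision >= min_precision:
                return m, precision
    return None


def render_rule(multiplier):
    """Emit the learned threshold in the same IF/THEN form a human would write."""
    return (
        "IF Application_Protocol = FTP AND Data_Transfer_Volume > "
        f"{multiplier}*(Average_User_Peers) THEN <ALERT>"
    )
```

The output is indistinguishable from a hand-written rule. Only its author is different, which is why the question of whether it counts as a rule gets philosophical so quickly.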
Finally, what about a supervised ML classifier trained on a vast corpus of badness (naturally, all “known bad”, by definition) and goodness? Few would claim this is a rule, but admittedly it is still related to “known bad” in some way, no? Cyber AI smiles at you.
I think the essential characteristics here would be:
- Focuses on badness similar in some mathematically measurable way to “known bad”; this is “derived from known bad”, rather than “known bad”
- Latches onto characteristics of badness behavior and/or nature visible to an algorithm, but perhaps not to a human.
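To illustrate “similar in some mathematically measurable way,” here is a toy nearest-centroid classifier in plain Python. It is a deliberately tiny stand-in for a real supervised model, not what any vendor ships; the feature vectors and labels are hypothetical.

```python
from math import dist
from statistics import mean


def train_centroids(samples, labels):
    """Average the feature vectors of each class into one centroid per label."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = [mean(column) for column in zip(*rows)]
    return centroids


def classify(centroids, sample):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], sample))
```

A sample never seen before still gets called “bad” if it sits closer to the bad centroid. There is no IF/THEN a human could read off, yet the verdict is entirely derived from known bad.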
As we ponder further, another way to look at this is perhaps:
| Threat type | Method that works | Method that doesn’t |
|---|---|---|
| Known known | Signatures, supervised ML | N/A |
| Known unknown | Rules, supervised ML | Signatures |
| Unknown unknown | Praying 🙂 | Rules and signatures |
Dragos’ excellent treatise on four detection types (“The Four Types of Threat Detection”) elegantly differentiates between “indicators” (called signatures here in this post) and “Threat Behaviors.”
The latter may ultimately be RULES as well, and there is nothing offensive about that. Recalling Bianco’s Pyramid of Pain, a rule may apply at any level, from bad IPs and file hashes (very much signatures) to TTPs and attacker tradecraft. Sometimes, [the] rule-based [approach] rules!
- next time you get into a bar fight over “is this signatures or behavior?”, say YES and walk away
- there is nothing wrong with being rule-based, in many cases
- rules work at many levels of abstraction, and they are more resilient / less fragile at higher levels
- (anything else I missed?)