SIEM and Badness Detection

By Anton Chuvakin | July 24, 2014


A long time ago, in a galaxy far far away … at the very dawn of my security career I attended a presentation by somebody who is now a notable incident response expert. Well … who am I kidding? He was a notable IR expert back in 2000, way…way before IR was cool and way before the word “APT” entered common usage. In any case, I don’t recall much from the presentation apart from one point he made: he has never seen a significant intrusion detected by an intrusion detection system (IDS) [another example of the same kind can be found here]. That line has been burned into my brain since that day…

We routinely talk about the prevention/detection/response mantra [which some people, for some strange reason, hear as prevention/prevention/prevention, as if the room is noisy or something … but I digress], but industry research often reminds us that we really suck at detection [BTW, I find calls for “more prevention” to solve this problem to be sheer idiocy].

Still, “deploying a SIEM – as with any detection technology – will result in things being detected. After things are detected then someone will need to respond to it to investigate it.” (source) This post takes a structured look at SIEM detection methods and approaches. By the way, this post is explicitly about THREAT DETECTION, which implies near-real-time observation, as opposed to THREAT DISCOVERY, which involves digging out traces of threats that persist in your environment. Threat discovery is a very fun topic, and we can talk about it again later.

First, I have to repeat something I think I mentioned a few times over the years: SIEM is not an old-style HIPS that matches vendor-provided character sequences to logs. Well, you can use it as such, for sure. But SIEM’s ability to normalize, enrich with context (users, assets, vulnerabilities, etc.), correlate across log sources, apply algorithms to streams and “pools” of data, and visualize the data for exploration makes it a different technology – and one with a much more difficult mission than a 1997 HIPS.
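
To make "normalize" and "enrich with context" a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Event` schema, the sshd-style message pattern, and the `USER_ROLES` / `ASSET_CRITICALITY` lookup tables are hypothetical and do not reflect any particular SIEM product.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical context tables; a real SIEM would load these from
# directory, asset and vulnerability data sources.
USER_ROLES = {"alice": "dba", "bob": "helpdesk"}
ASSET_CRITICALITY = {"10.0.0.5": "high", "10.0.0.9": "low"}

# Hypothetical sshd-style message pattern, for illustration only.
SSH_FAILED = re.compile(r"Failed password for (?P<user>\S+) from (?P<src_ip>\S+)")


@dataclass
class Event:
    source: str   # which log source produced the event
    action: str   # normalized action name
    user: str
    src_ip: str
    dst_ip: str
    context: dict = field(default_factory=dict)


def normalize(raw: str, dst_ip: str) -> Optional[Event]:
    """Map one raw log line to the common schema, or None if it does not parse."""
    m = SSH_FAILED.search(raw)
    if m is None:
        return None
    return Event("sshd", "login_failure", m["user"], m["src_ip"], dst_ip)


def enrich(event: Event) -> Event:
    """Attach user and asset context so later rules can reason about it."""
    event.context["user_role"] = USER_ROLES.get(event.user, "unknown")
    event.context["asset_criticality"] = ASSET_CRITICALITY.get(event.dst_ip, "unknown")
    return event


raw = "Jul 24 10:01:02 db1 sshd[411]: Failed password for alice from 203.0.113.7 port 4242 ssh2"
event = enrich(normalize(raw, dst_ip="10.0.0.5"))  # this sample line is known to parse
print(event)
```

The point of the sketch is that a rule written against `action`, `user_role` and `asset_criticality` no longer cares which device or log format the event came from, which is what separates this from string matching on raw logs.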

Here is my quick summary of SIEM detection methods in use today, with select pros/cons of each [NOT a comprehensive list – a longer table may show up in a future paper of mine].

Method: Human analyst event stream review
Description: An analyst observes a filtered stream of events in the console
Pros:
  • None :–)
Cons:
  • Does not scale
  • Skilled analyst required

Method: Simple log matching rules
Description: “HIPS mode”: if I see string X123 in logs, alert
Pros:
  • Simple
  • Specific
  • Light on SIEM resources
Cons:
  • Need to know what to match
  • Useless for advanced, multi-stage attacks

Method: Vendor-provided cross-device correlation rules
Description: Vendor-provided / default / OOB correlation rules
Pros:
  • Cross-device correlation
  • No need to write rules
Cons:
  • Relevance to customer use cases may be lacking
  • Need to tune the rules

Method: Matching events to threat intelligence feed
Description: Match incoming events to collected threat intel data such as “bad” IPs, domains, etc.
Pros:
  • Useful detection with minimal tuning
  • Low FPs [given quality TI]
Cons:
  • Requires high-quality TI data
  • Timing: TI data needs to be loaded before the event

Method: Log to context matching via rules
Description: Match incoming events to context such as user role (user with role X should never do Y, etc.)
Pros:
  • Easy policy alerts
  • Site-specific content
Cons:
  • Need a clear policy
  • Context data needs to be loaded and be current in SIEM

Method: Custom-written stateful correlation rules
Description: The ultimate in SIEM detection for years, custom correlation rules enable many scenarios and use cases
Pros:
  • Targeted to what the organization needs
  • Refine and adapt over time
Cons:
  • Rules need to be written and refined by a SIEM content expert
  • Errors in rule logic often not obvious

Method: Real-time event scoring
Description: Algorithms to assess event attributes (source, type, time, other metadata) to highlight events of interest
Pros:
  • Easy way to highlight potentially interesting events
Cons:
  • Prioritization may not match your priorities
  • “Potentially” interesting

Method: Statistical algorithms on stream data
Description: Statistics such as average, standard deviation, skew and kurtosis [yes, really!]
Pros:
  • Useful complement to rules
  • Can be used with rules to look beyond single events
Cons:
  • Choosing meaningful stats is often harder than writing rules
  • FPs are common

Method: Baseline comparisons
Description: Compare event streams to historical baselines and metrics; related to stat methods, but uses stored historical baselines
Pros:
  • Useful complement to rules
  • Can be used with rules to look beyond single events
Cons:
  • Fails to detect when baseline includes badness, or attack traffic is not anomalous
  • FPs are common
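
As an illustration of what "stateful" means for a custom correlation rule, here is a minimal Python sketch of one classic scenario: a successful login that follows a burst of failures for the same user and source. The rule itself, the `WINDOW_SECONDS` and `FAILURE_THRESHOLD` values, and the event field names (`ts`, `user`, `src_ip`, `action`) are all assumptions chosen for the example, not taken from any product's rule language.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300    # assumed correlation window
FAILURE_THRESHOLD = 5   # assumed failure count that makes a success suspicious


class BruteForceThenSuccess:
    """Alert when a login succeeds shortly after repeated failures."""

    def __init__(self):
        # Rule state: recent failure timestamps per (user, source IP).
        self.failures = defaultdict(deque)

    def process(self, event):
        """Feed one normalized event; return an alert dict or None."""
        key = (event["user"], event["src_ip"])
        recent = self.failures[key]
        # Expire failures that have fallen out of the window.
        while recent and event["ts"] - recent[0] > WINDOW_SECONDS:
            recent.popleft()
        if event["action"] == "login_failure":
            recent.append(event["ts"])
            return None
        if event["action"] == "login_success":
            count = len(recent)
            recent.clear()
            if count >= FAILURE_THRESHOLD:
                return {
                    "rule": "brute_force_then_success",
                    "user": event["user"],
                    "src_ip": event["src_ip"],
                    "failures": count,
                }
        return None


rule = BruteForceThenSuccess()
events = [
    {"ts": t, "user": "alice", "src_ip": "203.0.113.7", "action": "login_failure"}
    for t in range(0, 50, 10)
]
events.append({"ts": 60, "user": "alice", "src_ip": "203.0.113.7", "action": "login_success"})
alerts = [a for a in map(rule.process, events) if a]
print(alerts)
```

Even in a toy like this, the "errors in rule logic often not obvious" con shows up quickly: the choice of key, the window, and when state gets cleared all change what the rule catches, and none of those mistakes produce an error message.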

Note that this is not about the data sources, but about the methods themselves – they can apply to many/all data source combinations. Also, context data (users, assets, applications, data, vulnerabilities, etc.) is useful for enriching many detection methods as well as improving their accuracy. Next I suspect I need to talk about the data sources enabling various types of detection…
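
To show how context feeds the simpler methods, here is a minimal Python sketch that combines threat intel matching with log-to-context matching ("user with role X should never do Y"). The `BAD_IPS` set, the `FORBIDDEN` policy pairs, and the event field names are hypothetical placeholders; in practice they would come from a TI feed and a written policy.

```python
from typing import List

# Hypothetical threat intel data; must be loaded before the events arrive.
BAD_IPS = {"203.0.113.7", "198.51.100.23"}

# Hypothetical policy: (user role, action) pairs that should never occur.
FORBIDDEN = {
    ("helpdesk", "db_admin_login"),
    ("contractor", "payroll_export"),
}


def detect(event: dict) -> List[str]:
    """Return the names of the simple detections that fire for one event."""
    alerts = []
    # Threat intel matching: compare an event attribute to known-bad values.
    if event.get("src_ip") in BAD_IPS:
        alerts.append("ti_match")
    # Context matching: the event alone is benign; the user's role makes it bad.
    if (event.get("user_role"), event.get("action")) in FORBIDDEN:
        alerts.append("policy_violation")
    return alerts


print(detect({"src_ip": "203.0.113.7", "user_role": "dba", "action": "db_admin_login"}))
print(detect({"src_ip": "10.0.0.8", "user_role": "helpdesk", "action": "db_admin_login"}))
print(detect({"src_ip": "10.0.0.8", "user_role": "dba", "action": "db_admin_login"}))
```

Both checks are only as good as the data behind them: an empty or stale `BAD_IPS` set, or a missing `user_role`, makes the rule silently pass everything.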

As with other functionality, there is probably a maturity curve here somewhere (here!). Who will know how to create statistical models if he never created basic SIEM rules?
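
For a sense of what the statistical end of that curve involves at its very simplest, here is a minimal Python sketch of a baseline comparison using a z-score. The metric (hourly failed logins), the sample history, and the threshold of 3 standard deviations are assumptions picked for illustration, not recommendations.

```python
from statistics import mean, stdev


def is_anomalous(history, current, threshold=3.0):
    """Flag `current` if it is more than `threshold` standard deviations
    away from the mean of the stored baseline `history`."""
    if len(history) < 2:
        return False  # not enough data to compute a standard deviation
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu  # flat baseline: any change is a deviation
    return abs(current - mu) / sigma > threshold


# Hypothetical baseline: failed logins per hour over previous periods.
hourly_failed_logins = [12, 9, 11, 10, 13, 8, 12, 11]

print(is_anomalous(hourly_failed_logins, 11))
print(is_anomalous(hourly_failed_logins, 60))
```

Note how the cons from the list of methods apply directly: if the stored history already contains attack traffic, the mean and standard deviation absorb it, and an attacker who stays within the normal range is never flagged.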

P.S. All of these methods, separately and together, will fail once in a while. You have two choices then:

  1. Wait for the threat to manifest visibly – then go to security incident response.
  2. Go and dig for threats; do threat discovery.
