Can I Detect Advanced Threats With Just Flows/IPFIX?

By Anton Chuvakin | July 21, 2016 | 6 Comments

Tags: threat intelligence, security, network forensics, monitoring

Source IP. Destination IP. Source port. Destination port. Network protocol. Connection time. A bit more context data.

Is this enough to detect “an advanced threat”? Before you jump to conclusions, let’s have a productive discussion here. Some context is required to make it just such a discussion.

Here is where it started:

First, this is NOT a discussion on whether netflow-type data is useful for “cyber” (it is!); it is also NOT a discussion on whether it is the best way to detect said threats (it isn’t, having full network and full endpoint data is). However, it is a discussion on whether having only shallow header data and no payload (no application layer / L7, no HTTP, no DNS details, no raw PCAP, etc) – and NO endpoint data gives us a decent shot – a shot worth taking, essentially. Is netflow a shot worth taking?

Second, what is advanced? We should discuss it, ideally, but I’d rather not. Think Duqu 2.0, Flame, Stuxnet, better Eastern European cybercrime wares, and of course the original APTs (example). So, I am setting the bar a bit higher than just “crap that your silly AV misses.”

So, the arguments IN FAVOR:

  • Flows can give internal network visibility (think lateral movement, internal recon, staging, etc) that is often impossible to get with logs and hard to get with full traffic capture (need too many capture points)
  • Flow information is (?) in fact sufficient in some cases (“what do you mean 123GB of DNS traffic left our network tonight?”)
  • Flow information processed by some magical algorithm (ML or otherwise) gains information density that raw flow data lacks, and this can lead to great insights and detections
  • If you happen to possess a surprisingly high level of awareness of what is normal on your network (such as on OT, ICS / SCADA, etc networks), flow is all you need
  • If you accept a high rate of false positives, an occasional flow-based insight may prove valuable and difficult to obtain by other means (or: only obtainable from more onerous methods such as full traffic capture)
  • Flows matched with good threat intel will lead to useful detections (an easy counter: real advanced threats do not show up in TI feeds)
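
To make the “123GB of DNS traffic” case concrete, here is a minimal Python sketch of a volume check over flow records. The `Flow` record layout, the function names and the threshold are all assumptions made for illustration; they are not taken from any particular flow collector or export format, and a real deployment would need its own baseline to choose a threshold.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """Hypothetical flow record: header fields only, no payload."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int    # IANA protocol number: 6 = TCP, 17 = UDP
    bytes_sent: int  # bytes from source to destination


DNS_PORT = 53


def dns_bytes_by_source(flows: Iterable[Flow]) -> dict:
    """Sum the bytes each source sent to port 53 over TCP or UDP."""
    totals = defaultdict(int)
    for flow in flows:
        if flow.dst_port == DNS_PORT and flow.protocol in (6, 17):
            totals[flow.src_ip] += flow.bytes_sent
    return dict(totals)


def flag_heavy_dns_senders(flows: Iterable[Flow], threshold_bytes: int) -> list:
    """Return the sources whose DNS byte total exceeds the threshold."""
    totals = dns_bytes_by_source(flows)
    return sorted(ip for ip, total in totals.items() if total > threshold_bytes)
```

Calling `flag_heavy_dns_senders(flows, threshold_bytes)` over a night of flow records returns the hosts worth a second look. Note that this only says a host sent a lot of data to port 53; it says nothing about what was in the queries, which is exactly the limitation argued below.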

The arguments AGAINST:

  • Flows never produce enough certainty to give a credible conviction (bad / not bad) for any activity; many “this is certainly bad” activities end up being legit (yes, our IT today is that weird)
  • To detect with flows, you need to have a decent idea of what is normal and this is often (always?) impossible on messy real-world networks – and it changes too rapidly
  • Not having layer 7 stuff (user-agent, HTTP response code, mail subject, FTP username, etc) is a death knell for detecting modern application layer attacks
  • The rate of false detections will be too high, no matter what algorithm is used to process the flow streams; you need L7 – a real advanced threat will hide from L3/L4 (=flow) detection among the legit traffic
  • Flows are great to validate what you detected by other means (“ah, so they connected to X after their browser exploit worked”), but not as primary/initial means of threat detection
  • Lack of user awareness really kills it; real advanced attacks use legit credentials and flows are useless for that.
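
The false-detection argument is at heart base-rate arithmetic, and a short Python sketch shows why it bites. Every number in the usage note is a made-up assumption chosen only to illustrate the effect (they are not measurements of any real detector or network), and `alert_precision` is a hypothetical helper name.

```python
def alert_precision(total_flows, malicious_fraction,
                    true_positive_rate, false_positive_rate):
    """Return (true_alerts, false_alerts, precision) for a flow detector.

    precision is the share of alerts that point at malicious flows.
    """
    malicious = total_flows * malicious_fraction
    benign = total_flows - malicious
    true_alerts = malicious * true_positive_rate
    false_alerts = benign * false_positive_rate
    total_alerts = true_alerts + false_alerts
    precision = true_alerts / total_alerts if total_alerts else 0.0
    return true_alerts, false_alerts, precision
```

Assume 100 million flows a day, one malicious flow per million, a detector that catches 90% of malicious flows and wrongly flags 0.1% of benign ones: `alert_precision(100_000_000, 1e-6, 0.9, 0.001)` gives about 90 true alerts buried under roughly 100,000 false ones, so fewer than one alert in a thousand is real. Better algorithms move these rates, but with malicious flows this rare even a small false positive rate dominates.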

There you have it – got anything useful to add? Use the comments or the Twitter thread …

P.S. This is a random fun discussion, not aimed at any particular vendor, etc

The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.

6 Comments

  • Hello,

    At Redsocks we exclusively use IPFIX and we outperform competitors. So I guess our answer would be a resounding “YES!”.

    The analogy here is that we can detect particles like the Higgs boson by the side effects of its interaction with easier-to-detect particles and systems (i.e. electrons and magnetism).

    Now, if you think hard about it, you will see that all applications (and all malware) not only have specific destinations, but also specific patterns, packet sizes and rhythms of communication.

    On top of all that, before a single data packet is sent to a remote destination, a lot of housekeeping has to be done (DNS resolving, route determination, SYN/ACK, etc.).

    The really good thing is that all of the above also works for end-to-end encrypted data (and a lot of it even works if the threat is trying to obfuscate itself)

    What I find funny in this article is that you are still talking about old style detection. I am not going to disclose what proper current technology is online, but come by and I will gladly enlighten you.

    TBH. From what I read, you are about 2 years behind….

    With kind regards,
    Adrianus Warmenhoven

    P.S. I help in developing the system, together with Dr. Rick Hofstede, who got his promotion on precisely this subject: https://scholar.google.com/citations?user=_GyDVoMAAAAJ&hl=en

  • Jason Smith says:

    Is this enough to detect “an advanced threat”?
    – Probably not consistently on its own, but I don’t really think many orgs are deploying it on its own as a means of being the primary source of threat detection. The power of flows is in the ability to gain insight and pivot off of them, to use them as a catalyst that opens up other paths that you can go down with other available data types (when more context is needed).

    Is netflow a shot worth taking?
    – Definitely! Aside from a bit of fairly low-end hardware, it is free to set up and can be utilized by both secops and netops for an unsampled understanding of the network. People seem to only consider vendors when thinking about flows, despite tools like SiLK, Argus, and nfdump having existed for quite a long time. These tools sometimes have a moderately steep learning curve for the average analyst, but other tools like FlowBAT (free) for SiLK and NfSen (free as well) for nfdump exist as front ends that both greatly reduce the learning curve and provide many additional features for accessibility and analysis.

    The moral of the story is that if you don’t have anything set up already, setting up flows can be a big win even if it is only from a situational awareness point of view, and if you already have other data types, flows are a bigger win in that they will give you directions to the not-so-obvious.

  • Jin Qian says:

    I think netflow data will help, but as mentioned already, it’s not enough to help detect malware traffic in many cases.

    To get an idea of how far it is from being really effective, we only need to think about how it can be used to differentiate malware traffic from normal traffic. For many types of malware that stay low and slow (instead of generating 123GB of DNS traffic overnight 🙂 ), netflow data abstracts away many vital clues that can be used to tell bad traffic from good. Given that hackers are increasingly sophisticated, we will need to gather more information (not less, as in a netflow record).

    By the way, I wonder if there are some netflow related algorithms described in plain English.

  • Marc says:

    You can put L7 info in an IPFIX tuple, making it far more powerful than you state.

  • jg3 says:

    The benefit of IPFIX (or “Flexible NetFlow” in Cisco parlance) over NetFlow is that you can add any of the additional data points you find valuable: L7 application, DNS hostnames, UserAgent, certificate info, username, and so on. Several tools do this. Some make use of this data better than others.

    One of the huge NetOps gains flow provides is the ability to see the path of a flow as it traverses the network. An advantage commercial tools have over freeware is the ability to de-duplicate the flow data while retaining the hop-by-hop info. Beyond that, the differences are around the out of the box algorithms and the ability to create and apply your own.

    Full disclosure, the company I work for sells a product in this space. Maybe the best, maybe not; but I have used flow for security monitoring for 12+ years. It is not a panacea, but far better than it was just 4 years ago.

  • Thanks for a lot of great comments and the discussion. Sorry I was not able to answer every comment, as I am under a deadline. However, I really, really appreciate the discussion!