Here is a tricky problem to solve: how do we compare technical threat intelligence (TI) feeds?
First, a quick definition is in order. While I follow Gartner’s overall definition of Threat Intelligence, here I want to limit the discussion to technical (sometimes called “tactical” or “operational”) TI such as feeds of IPs, DNS names, URLs, MD5s, etc. [and, yes, I am well aware that purists consider such feeds to be “threat data” and not “threat intelligence”, but please don’t kick me for this here].
In any case, how can we tell whether a $200,000 list of “bad” IP addresses is better than a $0 list of “bad” IP addresses?
Let’s think about it together:
|Proposed measure|Ease of getting it|Usefulness for the TI user|
|---|---|---|
|Number of entries|Easy – just count ’em|Debatable – is more better? Or noisier?|
|Certainty of entry badness|Moderate – providers may not know due to algorithms, blind aggregation, lack of context, etc.|Important, but not sufficient; even proven badness does not convey relevance to your environment|
|Type of entry badness|Varied – some feeds only cover certain types (like C&C or exfiltration)|Useful; needed to consume the data effectively|
|Additional context data, extended schema fields, etc.|Easy – look at the data to see what context is provided|Useful, but only as an auxiliary measure|
|Update frequency|Easy – ask the vendor or check the data|Useful as long as you can consume the updates at the same speed|
|Frequency of matches with your IT environment|Hard – requires operational usage of TI data|Yes, this is what makes the feed relevant and “actionable”|
|Frequency of matches in your environment NOT connected to an ongoing investigation|Hard – requires operational usage and an active IR team|The most useful; this is actionable in the best possible way (but impossible to know in advance)|
|Frequency of “false positives”|Hard – requires operational usage|Useful in combination with the above|
|Popularity|Medium – need to ask the provider or peers about who else uses the data|Somewhat useful, but requires detailed interviews on usage with peers for maximum value|
|Durability – continued use for detection over time|Hard – requires operational usage for a long time|Yes, this is what makes the feed not just actionable, but reliably actionable|
|Preprocessing by the provider|Medium – need to ask the provider and trust the answer|Sort of – presumably feed provider processing makes it “better”, but how?|
|Exclusivity|Hard – no trustworthy way of getting it|Yes, but needs to be combined with some relevance metrics – exclusivity is worth little if the provider’s unique threats don’t matter to you|
Keep in mind that the above is very, very, very raw and needs lots of refinement if not rework.
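To make a few of the measures from the table concrete, here is a rough Python sketch covering number of entries, a crude stand-in for exclusivity, match frequency and “false positives”. Everything in it is an assumption made for illustration: the `feed_stats` function, the `FeedStats` fields, and the idea that you already have a set of indicators observed in your environment and a set of matches your analysts confirmed as benign. Note that “exclusive” here only means “not present in the other feeds you happen to hold”, which is much weaker than true exclusivity.

```python
from dataclasses import dataclass


@dataclass
class FeedStats:
    entries: int          # number of distinct indicators in the feed
    exclusive: int        # indicators not found in any other feed you hold
    matches: int          # indicators also observed in your environment
    false_positives: int  # matches later confirmed benign by analysts


def feed_stats(feed, other_feeds, observed, confirmed_benign):
    """Compute simple comparison measures for one feed (hypothetical helper)."""
    feed = set(feed)
    others = set().union(*other_feeds)  # empty set if there are no other feeds
    matched = feed & set(observed)
    return FeedStats(
        entries=len(feed),
        exclusive=len(feed - others),
        matches=len(matched),
        false_positives=len(matched & set(confirmed_benign)),
    )


# Illustrative data only (documentation IP ranges, not real indicators).
paid = ["192.0.2.1", "192.0.2.2", "198.51.100.7", "203.0.113.9"]
free = ["192.0.2.1", "203.0.113.50"]
observed = {"192.0.2.2", "203.0.113.9", "10.0.0.5"}
benign = {"203.0.113.9"}

stats = feed_stats(paid, [free], observed, benign)
```

Only the first two fields can be filled in before you buy; the last two need the feed running against your own logs.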
To conclude, the most useful TI feed comparison metrics [in this list] require operational usage and thus cannot be easily utilized for feed purchase/integration decisions. In other words, I haven’t solved the problem yet.
Effectively, we need a metric that serves as a proxy for “how much will this TI feed reduce my time to detect?” (detect faster) and “how much will this TI feed enable me to detect what I’d otherwise miss?” (detect better).
Finally, this conundrum has led some organizations to say “We’ll just collect *ALL* possible feeds and build a local intel clearing operation.” This approach treats all TI feeds as “raw threat data” and then focuses on creating locally relevant threat intel out of the pile. That certainly works well, if you have the resources to do it.
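If you did have incident history to work with, the two questions could be approximated roughly as follows. This is only a sketch under stated assumptions: the `detection_gain` function and its input format are hypothetical, and each incident is assumed to record when your existing controls detected it (or `None` if they never did) and when a feed indicator first matched (or `None` if none did). It measures the past, so it is not the advance proxy we actually want.

```python
from datetime import datetime
from statistics import mean


def detection_gain(incidents):
    """Return (newly_detected, mean_hours_saved) for one feed.

    incidents: list of (baseline_detect, feed_detect) pairs, where each
    element is a datetime or None (hypothetical record format).
    """
    newly_detected = 0  # "detect better": caught only thanks to the feed
    hours_saved = []    # "detect faster": feed matched before the baseline
    for baseline, feed in incidents:
        if feed is None:
            continue  # the feed did not help with this incident
        if baseline is None:
            newly_detected += 1
        elif feed < baseline:
            hours_saved.append((baseline - feed).total_seconds() / 3600)
    return newly_detected, (mean(hours_saved) if hours_saved else 0.0)


# Illustrative incident history, invented for the example.
incidents = [
    (datetime(2024, 1, 10, 12), datetime(2024, 1, 10, 6)),   # feed 6 hours earlier
    (None, datetime(2024, 1, 12, 9)),                        # missed without the feed
    (datetime(2024, 1, 15, 8), None),                        # feed had nothing
    (datetime(2024, 1, 20, 18), datetime(2024, 1, 20, 16)),  # feed 2 hours earlier
]

newly_detected, mean_hours_saved = detection_gain(incidents)
```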
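As a toy illustration of that “clearing operation” idea, the sketch below merges several feeds, de-duplicates indicators, records which feeds reported each one, and pushes locally observed indicators to the top. The `merge_feeds` function, the record fields and the ranking rule are all assumptions made for the example, not a description of how any organization actually does it.

```python
from collections import defaultdict


def merge_feeds(feeds, observed):
    """Merge feeds into one ranked list (hypothetical helper).

    feeds: dict mapping feed name -> iterable of indicator strings.
    observed: set of indicators seen locally, assumed already lowercased.
    """
    sources = defaultdict(set)
    for name, indicators in feeds.items():
        for indicator in indicators:
            # Normalize so trivially different spellings de-duplicate.
            sources[indicator.strip().lower()].add(name)

    merged = [
        {"indicator": ind, "sources": sorted(names), "seen_locally": ind in observed}
        for ind, names in sources.items()
    ]
    # Locally seen first, then indicators reported by more feeds, then by name.
    merged.sort(key=lambda r: (not r["seen_locally"], -len(r["sources"]), r["indicator"]))
    return merged


# Illustrative data only.
feeds = {
    "paid": ["evil.example", "bad.example"],
    "free": ["Bad.Example ", "other.example"],
}
merged = merge_feeds(feeds, observed={"other.example"})
```

Ranking by local sightings first reflects the point in the table that matches in your own environment matter more than raw feed size.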
Ideas? Thoughts? Experiences?
Huge thanks to Lenny Zeltser for super-helpful comments on this emerging framework!