Anton Chuvakin

A member of the Gartner Blog Network

Research VP
2+ years with Gartner
14 years IT industry

Anton Chuvakin is a research VP at Gartner's GTP Security and Risk Management group. Before Mr. Chuvakin joined Gartner, his job responsibilities included security product management, evangelist…

On MSSP SLAs

by Anton Chuvakin  |  October 23, 2014  |  Submit a Comment

Is 15 minutes a mere instant or an eternity? Is getting an alert 15 minutes after it was first generated fast enough? And the opposite question: is 15 minutes of MSSP-side alert triage enough to make sure that the alert is relevant, high-priority and high-fidelity? Indeed, spending too little time leads to poor-quality alerts, but spending too much time polishing alerts means the attacker may achieve their goals before the alert arrives and is acted upon.

So, yes, I did speak with one MSSP client who said that “15 minutes is too late for us” and another who said that “an MSSP cannot do a good job qualifying an alert in a mere 15 minutes” (both quotes fictional, but both “inspired by a real story”).

The answer to this – frankly not overly puzzling – question is again security operations maturity. On one end of the spectrum we have folks who just “don’t do detection” and rely on luck, law enforcement and unrelated third parties for detection (see this for reference). On the other, we have those with ever-vigilant analysts, solid threat intel and hunting activities for discovering the attackers’ traces before the alerts even come in.

As we learned before, the security chasm is very wide in this area.

Therefore, a meaningful MSSP SLA discussion cannot happen without the context of your state of security operations.

For example, if you …

  1. … have no operation to speak of and plan to hire an intern to delete alerts? You can accept any alert SLA [SAVE MONEY!!! GET YOUR ALERTS BY SNAIL MAIL! CARRIER PIGEON OK TOO! :-)], whether it is end of day or even end of week. If you have no plan to ever act on a signal, a discussion of the timing of action is pointless.
  2. … can act on alerts when really needed, and will probably scramble a response if something significant happens? Look for a few hours or similar timing, and limit alerts to truly critical, “incident-ready” ones.
  3. … have a defined security monitoring/response function that is equipped to handle alerts fast? Aim for up to an hour for significant alerts, with the rest delivered by end of day.
  4. … possess a cutting-edge security response operation? Push your MSSP partner to 15 minutes or less – for the best chance to stop the attacker in his tracks. Set up a process to review and process alerts as they come in, and refine the rules on the fly. Respond, rinse, repeat, WIN! (A rough sketch mapping these tiers to SLA targets follows this list.)
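To make the four tiers above concrete, here is a minimal sketch in Python; the tier names and SLA strings are my own hypothetical labels, not a Gartner framework or any MSSP’s contract terms, so treat it purely as an illustration.

```python
# Hypothetical mapping of security operations maturity to an MSSP alert SLA.
# Tier names and timings are illustrative only -- tune them to your operation.
MATURITY_TO_ALERT_SLA = {
    "no_operation":       "any SLA at all (nobody will act on the alerts anyway)",
    "ad_hoc_response":    "a few hours, critical 'incident-ready' alerts only",
    "defined_monitoring": "up to 1 hour for significant alerts, end of day for the rest",
    "cutting_edge":       "15 minutes or less, with continuous rule tuning",
}

def suggested_alert_sla(maturity: str) -> str:
    """Return a suggested MSSP alert delivery SLA for a given maturity tier."""
    return MATURITY_TO_ALERT_SLA.get(maturity, "unknown tier -- assess your maturity first")

print(suggested_alert_sla("defined_monitoring"))
```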

The key message is: you don’t want to pay for speed that you won’t be able [or don’t plan] to benefit from. If security alerts are just going to sit in inboxes for hours, you don’t need them delivered in minutes.

Now, what about the SLAs for various management services, such as changing NIDS rules and managing firewalls? SLAs play a role here as well, and – you guessed it – what you need here also depends on the maturity of your change management processes… Some people complain that an MSSP is too slow with updates to their security devices, while others know that their MSSP does it faster than they ever could.


Acting on MSSP Alerts

by Anton Chuvakin  |  October 16, 2014  |  4 Comments

Have you seen the funnel lately?

[Image: funnel diagram (source: https://flic.kr/p/fxKbT)]

In any case, while you are contemplating the funnel, also ponder this:

What do you get from your MSSP: ALERTS or INCIDENTS? [If you are getting LOGS from them, please reconsider paying any money for their services.]

What’s the difference? Security incidents call for an immediate incident response (by definition), while alerts need to be reviewed via an alert triage process in order to decide whether they indicate an incident, a minor “trouble” to be resolved immediately, a false alarm, or a reason to change the alerting rules so that it is never seen again. Here is an example triage process:

[Image: example alert triage process (source: http://bit.ly/1wCI0dt)]
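As a toy illustration only (this is not the process from the figure, and the category checks are made up), the decision described above boils down to picking one of four outcomes for every alert:

```python
from enum import Enum

class TriageOutcome(Enum):
    DECLARE_INCIDENT = "activate incident response"
    MINOR_TROUBLE = "resolve immediately, no incident declared"
    FALSE_ALARM = "close the alert"
    TUNE_RULES = "change the alerting rules so this is never seen again"

def triage(alert: dict) -> TriageOutcome:
    # Illustrative decision order only; real triage relies on people and far more context.
    if alert.get("confirmed_malicious") and alert.get("business_impact"):
        return TriageOutcome.DECLARE_INCIDENT
    if alert.get("confirmed_malicious"):
        return TriageOutcome.MINOR_TROUBLE
    if alert.get("recurring_benign_pattern"):
        return TriageOutcome.TUNE_RULES
    return TriageOutcome.FALSE_ALARM

print(triage({"confirmed_malicious": True, "business_impact": False}))
```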

Now, personally, I have an issue with a situation where an MSSP is tasked with declaring an incident for you. As you can learn from our incident response research, declaring a security incident is a big decision made by several stakeholders (see examples of incident definitions here). If your MSSP partner has a consistent history of sending you only the alerts that always lead to incident declaration (!) and IR process activation – that is marvelous. However, I am willing to bet that such a “perfect” track record is achieved at a heavy cost of false negatives, i.e. not being informed about many potential problems.

So, it is most likely that you get ALERTS. Now, a bonus question: whose job is it to triage the alerts to decide on the appropriate action?

[Image: thinking (source: https://flic.kr/p/6wdLat)]

Think harder – whose job is it to triage the alerts that your MSSP sends you?

After you have figured out that it is indeed the job of the MSSP customer, how do you think they should go about it? Some of the data needed to triage the alert may be in the alert itself (such as a destination IP address), while some may be in other systems or data repositories. Some of said systems may be available to your MSSP for access (example: your Active Directory) and some are very unlikely to be (example: your HR automation platform). So, a good MSSP will actually triage the alerts coming from their technology platform to the best of their ability – they do have the analysts and some of the data, after all. So, think of MSSP alerts as “half-triaged” alerts that require further triage.

For example (a toy enrichment sketch follows these examples):

  • NIPS alerts + firewall log data showing all sessions between the IP pair + logs from an attack target + business role of a target (all of these may be available to the MSSP) = high-fidelity alert that arrives from the MSSP; it can probably be acted upon without much analysis
  • NIPS alerts + firewall log data (these are often available to the MSSP) = “half-triaged” alerts that often need additional work by the customer
  • NIPS alerts (occasionally these are the only data available) = study this Wikipedia entry on GIGO.
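To illustrate the point, here is a toy sketch of that enrichment logic; the field names and the “scoring” are hypothetical and grossly simplified compared to what an MSSP platform or a customer SOC would actually do.

```python
def alert_fidelity(alert: dict) -> str:
    """Made-up scoring: the more context attached to a NIPS alert,
    the closer it gets to being actionable without further analysis."""
    context_sources = ["firewall_sessions", "target_host_logs", "target_business_role"]
    attached = sum(1 for source in context_sources if alert.get(source))
    if attached == len(context_sources):
        return "high fidelity -- can probably be acted upon without much analysis"
    if attached >= 1:
        return "half-triaged -- further triage by the customer required"
    return "raw NIPS alert -- garbage in, garbage out risk"

nips_alert = {"signature": "SQLi attempt", "firewall_sessions": True}
print(alert_fidelity(nips_alert))
```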

A revelation: MSSPs are in the business of … eh… business. So, MSSP analysts are expected to deliver on the promise of cost-effectiveness. Therefore, the quality of their triage will depend on the effectiveness of their technology platform, the available data (that customers provide to them!), the skills of a particular analyst and – yes! – the expected effort/time to be spent on each alert (BTW, fast may mean effective, but it may also mean sloppy. Slow may mean the same…).

Another revelation: MSSP success with alert triage will heavily depend on the data available to their tools and analysts. As a funny aside, would you go into this business: I will send you my NIDS alerts only (and provide no other data about my IT, business, etc.) and then offer to pay you $1,000,000 if you only send me the alerts that I really care about and that justify immediate action in my environment. Huh? No takers?

So, how would an MSSP customer triage those alerts? They need (surprise!):

  • People, i.e. security analysts who are willing and able to triage alerts
  • Data, i.e. logs, flows, system context, business context, etc. that can be used to understand the impact of the alerts.

The result may look like this:

[Image: MSSP alert flow]

Mind you, some of the systems that house the data useful for alert triage (and IR!) are the same systems you could use for monitoring – but you outsourced the monitoring to the MSSP. Eh… that can be a bit of a problem :-) That is why many MSSP clients prefer to keep their own local log storage inside a cheap log management tool – not great for monitoring, but handy for IR.

Shockingly, I have personally heard about cases where MSSP clients were ignoring 100% of their MSSP alerts, had them sent to an unattended mailbox or hired an intern to delete them on a weekly basis (yup!). This may mean that their MSSP was no good, or that they didn’t give them enough data to do their job well… As a result, your choice is:

  • you can give more data to an MSSP and [if they are any good!] you will get better alerts that require less work on your part, or
  • you can give them the bare minimum and then complain about poor relevance of alerts (in this case, you will get no sympathy from me, BTW)

And finally, an extra credit question: if your MSSP offers incident response services that cost extra, will you call them when you have an incident that said MSSP failed to detect?! Ponder this one…


MSSP Client Responsibilities – What Are They?

by Anton Chuvakin  |  October 9, 2014  |  1 Comment

Let me tell you a secret: an MSSP is not a box that you throw your money into, and security comes out screaming! Sadly, many would say that the only reason they went with a Managed Security Service partner is to avoid doing any security on their own. However, if you decided to go with an MSSP and not with an in-house capability (such as an internally staffed SOC with a SIEM tool at the center) …


… YOU STILL HAVE RESPONSIBILITIES!

This post is an attempt to outline my thinking about such responsibilities and create a structured approach to analyzing them. Intuitively, there are some things that an enterprise MUST do to allow the MSSP to help them (e.g. deploy their sensors, give them credentials for device management, etc.). Beyond that, there are additional responsibilities that allow the MSSP to help the client better.

In any case, think of this table NOT as a comprehensive list, but as a framework to organize examples:

To enable service delivery (MUST):

  • During on-boarding / before service: deploy sensors, share network diagrams and access credentials, provide contacts, etc.
  • During MSSP service consumption: notify on asset and network changes, access changes, contact info, etc.

To enable maximum value from the MSSP (SHOULD):

  • During on-boarding / before service: refine and share a security policy, have IR plans, provide detailed asset and context info, etc.
  • During MSSP service consumption: respond to alerts (!), remediate systems, declare incidents and run IR, jointly tune the alerts, communicate changing security priorities, etc.
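For illustration only, the same matrix could also be kept in a machine-readable form that both parties review and extend over time; the structure and wording below are a hypothetical sketch, not any MSSP’s actual contract schema.

```python
# Hypothetical machine-readable version of the shared responsibility matrix above.
responsibility_matrix = {
    "enable service delivery (MUST)": {
        "on-boarding": ["deploy sensors", "share network diagrams and access credentials",
                        "provide contacts"],
        "ongoing": ["notify on asset and network changes", "notify on access changes",
                    "keep contact info current"],
    },
    "maximum value from the MSSP (SHOULD)": {
        "on-boarding": ["refine and share a security policy", "have IR plans",
                        "provide detailed asset and context info"],
        "ongoing": ["respond to alerts", "remediate systems", "declare incidents and run IR",
                    "jointly tune the alerts", "communicate changing security priorities"],
    },
}

for goal, phases in responsibility_matrix.items():
    for phase, duties in phases.items():
        print(f"{goal} / {phase}: {', '.join(duties)}")
```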

An expanded version of this type of visual should become your shared responsibility matrix, which will actually enable you to get the most from your MSSP relationship. BTW, one MSSP succinctly states in their policies: “The Customer is responsible for all remediation activities.” What about compliance, you may ask? An excellent question – to be handled in the next post :-)

P.S. Of course, there will be people who will insist that “if you want it done well, do it yourself” (that may be true, but it does not mean this route is always the most cost-effective). On the other hand, there will be people who will say “… but security is not our core competence” (eh.. as if locking the doors is)


Critical Vulnerability Kills Again!!!

by Anton Chuvakin  |  October 6, 2014  |  2 Comments

A killer vulnerability KILLS AGAIN!!! Another “branded vulnerability” – Shellshock – is heeeeere! Run for the hills, escape the planet, switch to a “secure OS” (Windows 3.1 fits the bill), stop the cyber, etc, etc, etc.

<insert all the obligatory World War I references to shell shock and jokes about being bashed by bash> :-)

However, this post is not about Shellshock with a “perfect 10.0” in CVSS Base – at least not directly.

Sure, if you have not patched yet – stop reading this now and deploy a patch to bash. A few notes on that:

  • Focus the remediation on the Internet-visible servers first (some of our clients set a reasonable 1-hour patching timeline for this one – as in “patch all exposed systems within 1 hour of patch release.” Eat this, folks who take 90 days to patch!).
  • Scan your servers for the vulnerability to know how exposed you are (if at all), and do not limit the scanning to the Internet-visible sites, since having this issue on internal servers makes the attacker’s job easier (a trivial local check sketch follows this list).
  • Note that an authenticated scan will show that you are vulnerable on all Unix/Linux servers, but will NOT show where you are exploitable, while an unauthenticated scan will not show all the exploitation avenues (a great case study for the limits of modern VA technology).
  • Some people have temporarily changed shells (tcsh is still alive!), thus breaking many scripts, and deployed NIPS and WAF rules tactically. Do all that, sure.
  • Others have used this as an opportunity to remove the – frankly, idiotic! – shell scripts from public /cgi-bin directories and do other tightening of their infrastructure.
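As an aside, checking a particular box is trivial: the sketch below wraps the widely circulated environment-variable test for CVE-2014-6271 in Python. It only tests the bash on the machine where it runs, assumes bash is on the PATH, and is an illustration rather than a replacement for proper VA scanning.

```python
import os
import subprocess

# Widely circulated test for CVE-2014-6271: a vulnerable bash executes the
# command that trails the crafted function definition when importing the
# environment variable; a patched bash does not.
env = {**os.environ, "x": "() { :;}; echo VULNERABLE"}
result = subprocess.run(["bash", "-c", "echo test complete"],
                        env=env, capture_output=True, text=True)

if "VULNERABLE" in result.stdout:
    print("This bash appears vulnerable to Shellshock -- patch it now.")
else:
    print("The injected command was not executed by this bash.")
```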

All in all, I think Shellshock is not even in the ballpark of Heartbleed (others disagree): with that baby, pretty much the entire SSL-using Internet was vulnerable and exploitable. With Shellshock, we have a relatively small population of remotely exploitable systems (early evidence pointed at exploitable sites numbering in the thousands, not millions). Sure, the impact (easy remote access by an attacker) is worse, but far fewer sites are exploitable.

But did I say this post is not about Shellshock? Ah, thanks for paying attention! It is not…

When I started being involved with infosec (which feels like a moment or an eternity ago, depending on the situation) by helping out with some Linux boxes at a small ISP, a wise mentor told me: Anton, don’t be stupid, don’t make your security solely dependent on not having any exploitable holes. Back then, IIS had dozens of exploitable remotes, while Apache was considered “secure” – and that is what we used. Still, the infrastructure was set up in such a way that a remote exploit in Apache that gives you a shell as “nobody,” combined with one of many locals that escalates you to “root,” meant “GAME OVER.” The attacker would have been able to destroy the entire business in about 20 minutes, for good [there were no offline backups – this is 1999 we are talking about]. So, that led to some major rethinking…

In any case, WHY IS THIS NEWS TO SOME PEOPLE NOW IN 2014?!!

So, as a reminder for most and as news for some: do not make your security architecture solely reliant on patching. Big vulnerabilities will happen and so will zero-days, so make sure that your entire security architecture does not crumble if there is one critical vulnerability: do defense in depth, layers, “least privilege,” controls not reliant on updates, monitoring, deception, etc. The fact that the attackers have a 10.0 remote should NOT mean that you automatically lose everything!


Security Planning Guide for 2015

by Anton Chuvakin  |  October 3, 2014  |  Comments Off

Our team has just released our annual security planning guide: “2015 Planning Guide for Security and Risk Management.” Every GTP customer should go and read it!

Its abstract states: “In planning for security and risk management projects for 2015, organizations must scale and adjust their risk management and security practices to satisfy current and future IT and business needs.”

Here are a few fun quotes:

  • “Risk management programs often haven’t scaled, which increased reliance on traditional security patterns — so which should change first? In other words, if security and compliance are indeed falling farther behind, with compliance in particular remaining deeply entrenched in tradition, how can we even begin to adopt new security patterns?”
  • “Use threat assessment and attack models as part of risk assessment and mitigation to determine which controls should be considered. The attack model helps identify what set of controls is necessary to cover various attack stages, channels and target assets. ”
  • “Architect for microperimeterization where the network security boundary shrinks to the host level or smaller. Because perimeters will have to become more dynamic, security will need to be split among the moving parts and pieces.” (perimeter is NOT dead, it is just different…)
  • “Loss of control and visibility will continue in the Nexus of Forces, with mobility and cloud leading the way. But with compliance still often equating security to having control, this leads to challenges in adoption of these now not-so-new technologies.”
  • “Logging and monitoring of privileged activity is also key when the lines are blurred between compute, storage, network and security administration. At a minimum, monitoring must enable reporting and post hoc investigations of events; this paves the way for adding real-time analytics, alerting and enforcement later on.”

Much of the stuff in our doc is, of course, not new, but has been highlighted as important by recent events. Also, some things – while not truly new – may be new to some organizations that are just waking up to the needs of information security (or “cyber,” if you have to call it that).


My Top 7 Popular Gartner Blog Posts for September

by Anton Chuvakin  |  October 1, 2014  |  Comments Off

The most popular blog posts from my Gartner blog during the past month are:

  1. SIEM Magic Quadrant 2014 Is Out! (announcements)
  2. Detailed SIEM Use Case Example (SIEM research)
  3. Popular SIEM Starter Use Cases (SIEM research)
  4. Challenges with MSSPs? (MSSP research)
  5. Named: Endpoint Threat Detection & Response (ETDR research)
  6. On Comparing Threat Intelligence Feeds (threat intelligence research)
  7. How To Work With An MSSP Effectively? (MSSP research)

Enjoy!


Find Security That Outsources Badly!

by Anton Chuvakin  |  September 27, 2014  |  6 Comments

In this post, I wanted to touch on a sensitive topic: what security capabilities outsource badly? Keep in mind that this post is Anton contemplating a topic, not a Gartner research position (BTW, I don’t slap this disclaimer on every post, but I feel that it is strangely appropriate here)

Let’s start: a whole lot of companies would take on your perimeter NIDS/NIPS monitoring and management, but far fewer will do content-aware DLP under the same model. Think about this: there are very few managed DLP providers and even fewer managed network forensics (NFT) providers. Why is that?

Here is how I think about it (pardon my gross over-simplification here, but it serves the purpose):

Defense = know what to defend + know how to defend

(see On “Defender’s Advantage” for a longer discussion)

In more detail:

  1. know what to defend = your IT environment, business processes, assets, systems, applications, personnel, company culture, mission and other knowledge of your IT, business and culture
  2. know how to defend = understanding threat actors, attack methods, exploits, attacks, vulnerabilities, security architecture and other security domain knowledge.

To not completely suck at security [and we are talking about the very, very, very basics here], you need to have some idea of what to protect and some idea of how to do it. However – and this is the punch line! – the balance between #1 knowledge (about the lay of the land) and #2 knowledge (about techniques and methods of infosec) varies dramatically across different domains of infosec.

Intuitively, we all get it: anti-malware kills viruses without any requisite knowledge of your environment, while using a SIEM effectively requires a lot of it. Further, detecting insider fraud requires knowledge of how your business functions and how your people behave. And don’t even get me started on business logic flaws in web applications: to find business logic flaws you do need to know the logic of your business … duh!

So, answer this one – think of two security capabilities:

  • security capability A requires 90% of #2 knowledge (security domain knowledge) and 10% of #1 knowledge (your environment)
  • security capability B requires 90% of #1 knowledge (your environment) and 10% of #2 knowledge (security domain knowledge)

Which one will outsource better? OK, you got this one :-)

Firewall configuration, anti-malware (whether AV or MPS), perimeter NIDS/NIPS and threat intelligence rely heavily on security domain knowledge and less on the knowledge of your IT and business. DLP (especially data discovery or DAR DLP), network forensics (NFT) for internal networks and user behavior monitoring require an incredible amount of “site knowledge” (some written, and much unwritten and thus only present in some people’s heads). Security incident response presents a peculiar example: IMHO it requires a delicate balance of both (so when the IR ninja paratroopers drop in, they will require support from the indigenous forces, aka your IT and BU personnel – otherwise the attacker wins again).

Where am I going with this?

You can go to an MSSP, you can get consultants to help you, you can do staff augmentation, you can ask Gartner — but for some security capabilities that critically rely on the knowledge of your environment, you have to also play the game yourself!


My UPDATED “SIEM Technology Assessment and Select Vendor Profiles” Publishes

by Anton Chuvakin  |  September 19, 2014  |  Comments Off

My other SIEM paper is updated as well: “SIEM Technology Assessment and Select Vendor Profiles.” It contains an updated SIEM technology overview, some fun new trends, and refreshed vendor profiles.

Here is how you can use all my recent SIEM stuff:

What do you want? → My SIEM paper to read:

  • Figure out how to buy the right SIEM and how to buy it right → “Evaluation Criteria for Security Information and Event Management”
  • Understand SIEM technology better and become familiar with select vendors → “SIEM Technology Assessment and Select Vendor Profiles”
  • Deploy the product and build your SIEM operation → “Security Information and Event Management Architecture and Operational Processes”
  • Take a very quick look at a typical SIEM architecture → “Blueprint for Designing a SIEM Deployment”

P.S. Gartner GTP access required for all of the above!


My UPDATED “Security Information and Event Management Architecture and Operational Processes” Publishes

by Anton Chuvakin  |  September 15, 2014  |  5 Comments

Finally, I completed an epic update to my 2012 paper “Security Information and Event Management Architecture and Operational Processes.” I think of this paper, interchangeably, as a “SIEM missing manual” or a “SIEM bible” … It now has expanded SIEM process guidance, new detailed use cases, more SIEM metrics, an updated SIEM maturity framework and other fun new stuff – and of course a lot of the good old stuff that is still very useful for those planning, deploying and operating SIEM tools. It is LONG – but let me tell you – reading it is way cheaper than hiring 2 knowledgeable SIEM consultants for 2 weeks :-)

Some fun quotes:

  • “Organizations have to monitor complex, ever-expanding IT environments that sometimes include legacy, traditional, virtual and cloud components. Security monitoring in general, and SIEM in particular, become more challenging as the size and complexity of the monitored environments grows and as attackers, driven by improving defenses and organization response, shift to more advanced attack methods.”
  • “Ultimate SIEM program success is determined more by operational processes than by architecture or specific tool choice. SIEM implementations often fail to deliver full value due to broken organizational processes and practices and lack of skilled and dedicated personnel.”
  • “A mature SIEM operation is a security safeguard that requires ongoing organizational commitment. Such commitment is truly open-ended — security monitoring has to be performed for as long as the organization is in business.”
  • “A SIEM project isn’t really a project. It is a process and program that an organization must refine over time — and never “complete” by reassigning people to other things. Running SIEM as a project to “do and forget” often leads to wasted resources and lack of success with SIEM.”

Enjoy!

P.S. Gartner GTP access required!


Challenges with MSSPs?

by Anton Chuvakin  |  September 10, 2014  |  7 Comments

Let’s get this out of the way: some MSSPs REALLY suck! They have a business model of “we take your money and give you nothing back! How’d you like that?” A few years ago (before Gartner) I heard from one MSSP client who said “I guess our MSSP is OK; it is not too expensive. However, they never call us – we need to call them [and they don’t always pick up the phone].” This type of FAIL is not as rare as you might think, and there are managed security service providers that masterfully create an impression in their clients’ minds along the lines of “security? we’ll take it from here!” and then deliver – yes, you guessed right! – nothing.

At the same time, I admit that I need to get off the high horse of “you want it done well? do it yourself!” Not everyone can boast an expansive SOC with gleaming screens and rows of analysts fighting the evil “cyber threats,” backed up by solid threat intelligence and dedicated teams of malware reversers and security data scientists. If you *cannot* and *will not* do it yourself, an MSSP is of course a reasonable option. Also, lately there have been a lot of interesting hybrid models of MSSP+SIEM that work well … if carefully planned, of course. I will leave all that to later posts as well as my upcoming GTP research paper.

So let’s take a hard look at some challenges with using an MSSP for security:

  1. Local knowledge – be it of their clients’ business, IT (both systems and IT culture), users, practices, etc – there is a lot of unwritten knowledge necessary for effective security monitoring and a lot of this is very hard to transfer to an external party (in our MSSP 2014 MQ we bluntly say that “MSSPs typically lack deep insight into the customer IT and business environment”)
  2. Delineation of responsibilities – “who does what?” has led many organizations astray, since gaps in the whole chain of monitoring/detection/triage/incident response are, essentially, deadly. Unless joint security workflows are defined, tested and refined, something will break.
  3. Lack of customization and “one-size-fits-all” – most large organizations do not look like “a typical large organization” (ponder this one for a bit…) and so benefiting from “economies of scale” with security monitoring is more difficult than many think.
  4. Inherent “third-partiness” – what do you lose if you are badly hacked? Everything! What does an MSSP lose if you, their customer, are badly hacked? A customer… This sounds like FUD, but this is the reality of the different positions of the service purchaser and provider, and escaping this is pretty hard, even with heavy contract language and SLAs.

In essence, an MSSP may work for you, but you need to be aware of these and other challenges, as well as plan how you will work with your MSSP partner!

So, did your MSSP cause any challenges? Hit the comments or contact me directly.
