I am feeling adventurous, so let’s have an EDR mud fight [pillow fight?] – kernel or userland agent?
| | Top Pros | Top Cons |
| --- | --- | --- |
| Kernel mode EDR agent | Deep visibility into system activity; much harder for an attacker to subvert or avoid | Higher chance of system stability problems (a bad bug means a blue screen, not just an app crash) |
| User mode EDR agent | Lower chance of system stability problems (a bad bug crashes one app, not the whole box) | Higher chance of being subverted or avoided by the attacker |
As a quick side note, some EDR vendors’ agents include both kernel and userland components; while this helps with some cons of the “pure” kernel agent, it does not really mitigate the higher chance of stability problems.
To summarize, this is (IMHO) a fight between “Higher chance of system stability problems” vs “Higher chance of being subverted or avoided by the attacker.”
Add your own? Debate? Throw mud or a pillow? 🙂
Blog posts related to our current EDR research:
- Using EDR For Remediation?
- EDR Research Commencing: Call To Action!
- Where Does EDR End and “NG AV” Begin?
- Reality Check on EDR / ETDR
- My Paper on Endpoint Tools Publishes (2013)
- Endpoint Threat Detection & Response Deployment Architecture
- Essential Processes Around Endpoint Threat Detection & Response Tools
- Named: Endpoint Threat Detection & Response
- Endpoint Visibility Tool Use Cases
- On Endpoint Sensing
- RSA 2013 and Endpoint Agent Re-Emergence
- All posts tagged endpoint
13 Comments
No question that the better visibility comes from the kernel, but at the same time it adds two more great benefits (on top of resilience) over userspace: speed and performance, and cross-compatibility.
Moreover, userspace is not necessarily more stable, and is actually more vulnerable to interoperability issues, for example because of the intrusive hooking (see the sketch below). Also, while Windows is actually decent at supplying hooking libraries, if you look at the Mac platform the story is very different, rendering userspace virtually useless.
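For illustration, a minimal sketch of the kind of intrusive userland hooking being described, assuming a 32-bit build (on x64 a 5-byte relative JMP often cannot reach the detour). `HookedCreateFileW` is a hypothetical stand-in, and a real hook would first save the overwritten bytes into a trampoline; note how two products patching the same export would simply clobber each other:

```c
// Sketch only: inline-hook CreateFileW by overwriting its first 5 bytes
// with a relative JMP to our detour. Assumes x86; no trampoline is built,
// so the original function is unrecoverable in this sketch.
#include <windows.h>
#include <stdio.h>
#include <string.h>

// Hypothetical detour an EDR would route calls through.
static HANDLE WINAPI HookedCreateFileW(LPCWSTR name, DWORD access, DWORD share,
                                       LPSECURITY_ATTRIBUTES sa, DWORD disp,
                                       DWORD flags, HANDLE tmpl)
{
    (void)name; (void)access; (void)share; (void)sa;
    (void)disp; (void)flags; (void)tmpl;
    // ...inspect/log the call, then forward via a saved trampoline...
    return INVALID_HANDLE_VALUE; // placeholder in this sketch
}

static BOOL InstallInlineHook(void *target, void *detour)
{
    unsigned char patch[5] = { 0xE9 }; // JMP rel32
    LONG rel = (LONG)((ULONG_PTR)detour - ((ULONG_PTR)target + 5));
    memcpy(patch + 1, &rel, sizeof(rel));

    DWORD old;
    if (!VirtualProtect(target, sizeof(patch), PAGE_EXECUTE_READWRITE, &old))
        return FALSE;
    memcpy(target, patch, sizeof(patch)); // silently clobbers any prior hook
    VirtualProtect(target, sizeof(patch), old, &old);
    FlushInstructionCache(GetCurrentProcess(), target, sizeof(patch));
    return TRUE;
}

int main(void)
{
    void *target = (void *)GetProcAddress(GetModuleHandleW(L"kernel32.dll"),
                                          "CreateFileW");
    if (target && InstallInlineHook(target, (void *)HookedCreateFileW))
        puts("hook installed; a second product doing the same overwrites it");
    return 0;
}
```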
It’s true that if you make a mistake while you’re in the kernel, the result will almost certainly be a blue/grey screen, but if you make a mistake in userspace you’ll just crash an app (which still speaks to stability!).
All in all – just make sure you write quality code 🙂
Thanks a lot for the insightful comment. I agree re: speed/performance, but compatibility…hmmm. Why? IF the OS vendor tweaks something deep inside the OS, the kernel approach may have problems.
Also, what about >1 tool trying to live in the kernel and do similar things? Doesn’t it sound risky?
With MS it’s easy – once you’re certified by them or join VIA/MVI, they will do a good job of giving you a heads-up on any material changes. Apple is slightly more chaotic 🙂
Re: more than one tool – as long as everyone plays by the rules, it should not be a problem. BTW, userspace is much more challenging in that regard, since it’s really hard to get two intrusive userspace products to work together.
Last point – doing an initial deployment of a tool on a machine that reflects the common corporate image reveals these problems quickly if they exist, allows you to quickly fix any interoperability issues, and almost guarantees that you won’t see a blue screen in production…
I find your comments re “as long as everyone plays by the rules” and “reveals these problems quickly if they exist” to be excessively, even dangerously optimistic 🙂
Seems like a no-brainer. Do it right, do it kernel mode, per Udi. Userland is trivially detectable (see the sketch below); might as well stick to plain old AV if you go that route.
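To show how trivial that detection can be, here is a minimal sketch, assuming 64-bit Windows: malware just reads the first bytes of an ntdll export and looks for a planted JMP. The choice of NtCreateFile is illustrative:

```c
// Sketch: detect a user-mode inline hook by inspecting an ntdll stub.
#include <windows.h>
#include <stdio.h>

int main(void)
{
    unsigned char *p = (unsigned char *)GetProcAddress(
        GetModuleHandleW(L"ntdll.dll"), "NtCreateFile");
    if (!p) return 1;

    // A clean 64-bit syscall stub starts with "mov r10, rcx" (4C 8B D1).
    // A 0xE9 (JMP rel32) or 0xFF 0x25 (JMP [rip+disp32]) betrays a hook.
    if (p[0] == 0xE9 || (p[0] == 0xFF && p[1] == 0x25))
        puts("NtCreateFile appears inline-hooked: user-mode agent spotted");
    else if (p[0] == 0x4C && p[1] == 0x8B && p[2] == 0xD1)
        puts("NtCreateFile stub looks clean");
    else
        puts("unexpected prologue: inspect further");
    return 0;
}
```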
Agreed, of course. Userland is much less defensible, but what good is the kernel tool if there is resistance to deployment on stability grounds? In other words, what is better – a kernel tool on the shelf [or on 10% of systems] or a userland tool on all systems?!
I should have added before posting: in my (recent) EDR days, we did see the occasional BSOD from being at kernel level. It’s a risk, but it reflected gaps in QA more than anything, and the benefits outweigh that risk. MS usually gives vendors time to test and rarely tweaks the kernel anyway. The (very) rare exception is out-of-band patches to the kernel.
>the benefits outweigh that risk.
FOR WHOM? For the developer, it sure does. For the wide masses of users deploying it… are you sure?
Kernel mode monitoring has a few other benefits as well:
The kernel is the first part of the OS to load and the last to unload. As such, it has the chance to witness (and act on) the very birth and death of all user-land processes – see the sketch after this paragraph. It eliminates all the race conditions (and there are plenty) that happen when monitoring from user-land. Most advanced attacks have no logic other than performing the minimal task they are designed to do; it happens very fast, sometimes even before the security S/W hooks are initialized.
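A minimal kernel-side sketch of that “witness every birth and death” capability, using the documented PsSetCreateProcessNotifyRoutineEx callback (the driver must be linked with /INTEGRITYCHECK for the registration to succeed); logging and error handling are trimmed:

```c
// WDM driver skeleton: get notified of every process create/exit in the
// kernel, before any user-land hook has a chance to run.
#include <ntddk.h>

static VOID CreateProcessNotifyEx(PEPROCESS Process, HANDLE ProcessId,
                                  PPS_CREATE_NOTIFY_INFO CreateInfo)
{
    UNREFERENCED_PARAMETER(Process);
    if (CreateInfo) {
        // Birth: parent and image name are visible, and the callback can
        // even veto creation by setting CreateInfo->CreationStatus.
        if (CreateInfo->ImageFileName)
            DbgPrint("created pid=%p image=%wZ\n",
                     ProcessId, CreateInfo->ImageFileName);
    } else {
        DbgPrint("exited pid=%p\n", ProcessId); // death
    }
}

static VOID DriverUnload(PDRIVER_OBJECT DriverObject)
{
    UNREFERENCED_PARAMETER(DriverObject);
    PsSetCreateProcessNotifyRoutineEx(CreateProcessNotifyEx, TRUE); // remove
}

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);
    DriverObject->DriverUnload = DriverUnload;
    return PsSetCreateProcessNotifyRoutineEx(CreateProcessNotifyEx, FALSE);
}
```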
For EDR products, if you are willing to sacrifice some performance in favor of accuracy and data reliability, it is advisable to perform the data hooking and enrichment in the kernel and delegate the actual analysis to a user-space process (sketched below). This way you can still mitigate the chance of being subverted – by comparing data from different levels – while minimizing the system-instability effects. Basically every AV today has a kernel component that monitors file-system or network activity.
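Here is the user-space half of that split, as a minimal sketch: a service draining events from a hypothetical kernel minifilter over a filter communication port. The port name \EdrEventPort and the EDR_EVENT layout are assumptions for illustration, not any product’s actual interface:

```c
// User-mode analysis service: receive enriched events from the kernel
// component via a filter communication port and analyze them here.
#include <windows.h>
#include <fltuser.h>
#include <stdio.h>
#pragma comment(lib, "fltlib.lib")

// Hypothetical event layout shared with the kernel side; the header must
// come first for FilterGetMessage.
typedef struct _EDR_EVENT {
    FILTER_MESSAGE_HEADER Header;
    ULONG ProcessId;
    WCHAR Path[260]; // kernel side guarantees NUL termination
} EDR_EVENT;

int main(void)
{
    HANDLE port;
    if (FAILED(FilterConnectCommunicationPort(L"\\EdrEventPort", 0, NULL, 0,
                                              NULL, &port))) {
        puts("kernel component not loaded?");
        return 1;
    }
    for (;;) {
        EDR_EVENT ev;
        if (FAILED(FilterGetMessage(port, &ev.Header, sizeof(ev), NULL)))
            break;
        // The expensive detection logic lives here, in user space, where a
        // crash takes down this service rather than the whole box.
        wprintf(L"pid %lu touched %s\n", ev.ProcessId, ev.Path);
    }
    CloseHandle(port);
    return 0;
}
```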
Also, as a customer, I think I would prefer suffering rare BSODs and pressuring the vendor for fixes until the product matures, instead of suffering rare process crashes I wouldn’t even be notified about. This way I know when I am not protected.
Agreed re: needing both – userland is a must for some tasks, and combined with kernel it becomes better.
Still, are you really sure re: “prefer suffering from rare BSODs and pressuring the vendor for fixes until the product matures, instead of suffering from rare process crashes I wouldn’t even be notified about”? How rare, and on whose machine would you accept the BSOD? Your own? Your CIO’s? 🙂
I’m guessing every S/W has a maturity phase and both would suffer from crashes until they level off. Speaking solely from a security perspective, a total crash is preferable in the sense of visibility: a user-space process may crash without leaving apparent indications. Of course, this doesn’t take into account the other (business) side of this visibility, and higher management’s patience 🙂
As a researcher I can also note that crashing the monitoring S/W is itself a potential attack vector. Disarming a user-space security measure is therefore simpler; fooling a kernel sensor is a much more complex task, as it may destabilize the entire system.
Great thread, Anton. Concur with Udi and Assaf. On Windows, we have good binary compatibility in the kernel across versions (mitigating a source of BSODs in days past), robust frameworks for intercepting activity (e.g. file system minifilters – see the skeleton below), and well-defined support for interoperability with other vendors (e.g. altitudes and load order groups, annual PlugFest events). (Sadly, Apple isn’t anywhere near providing the same degree of kernel support or ecosystem maturity.)
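For reference, a bare-bones minifilter skeleton of the kind Chris describes, registering a pre-create callback via FltRegisterFilter; the vendor’s Microsoft-assigned altitude, which orders stacked filters and prevents collisions, is declared in the driver’s INF rather than in code. Error handling is trimmed:

```c
// Minifilter skeleton: observe (or block) every file open system-wide.
#include <fltKernel.h>

static PFLT_FILTER gFilter;

static FLT_PREOP_CALLBACK_STATUS
PreCreate(PFLT_CALLBACK_DATA Data, PCFLT_RELATED_OBJECTS FltObjects,
          PVOID *CompletionContext)
{
    UNREFERENCED_PARAMETER(Data);
    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(CompletionContext);
    // Inspect Data->Iopb here; an EDR would enrich and queue an event.
    return FLT_PREOP_SUCCESS_NO_CALLBACK;
}

static NTSTATUS FilterUnload(FLT_FILTER_UNLOAD_FLAGS Flags)
{
    UNREFERENCED_PARAMETER(Flags);
    FltUnregisterFilter(gFilter);
    return STATUS_SUCCESS;
}

static const FLT_OPERATION_REGISTRATION Ops[] = {
    { IRP_MJ_CREATE, 0, PreCreate, NULL },
    { IRP_MJ_OPERATION_END }
};

static const FLT_REGISTRATION Reg = {
    sizeof(FLT_REGISTRATION),   // Size
    FLT_REGISTRATION_VERSION,   // Version
    0,                          // Flags
    NULL,                       // ContextRegistration
    Ops,                        // OperationRegistration
    FilterUnload,               // FilterUnloadCallback
    // remaining optional callbacks default to NULL
};

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);
    NTSTATUS status = FltRegisterFilter(DriverObject, &Reg, &gFilter);
    if (NT_SUCCESS(status)) {
        status = FltStartFiltering(gFilter);
        if (!NT_SUCCESS(status))
            FltUnregisterFilter(gFilter);
    }
    return status;
}
```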
For us, a kernel component is critical to get access early in the boot process and is necessary to do in-line prevention from the same point as detection. It is also necessary to participate in the trusted boot process and ELAM.
BSODs due to kernel security components seem largely a problem of the past given how standardized much has become (and how long most of us have had to get it right).
This is in stark contrast to the stability and compatibility issues we’re seeing with some exploit detection and mitigation approaches. User space system code can definitely be just as disruptive.
Thanks for the comment, Chris. Sorry for the delayed response – I was on vacation for a week. In any case, I agree that vendors with experience doing kernel dev CAN write solid code there… especially in v4 or later of their products.
Also, I fully agree that bad userland code can do some heavy damage too, so it is not “magically safer,” no matter what.