
Those Pesky Users: How To Catch Bad Usage of Good Accounts

By Anton Chuvakin | February 19, 2015 | 2 Comments


Gartner says “Malware Is Already Inside Your Organization; Deal With It.” But you know what? I wish it were just stupid malware (well, some is not so stupid): via a plethora of remote access methods, human attackers are also inside. BTW, I don’t mean the actual “insiders”; seemingly nobody cares about those nowadays :–)

Result? 2014 Verizon Data Breach Report, everybody’s favorite security reference, reminds us that default, stolen, hijacked, or otherwise mal-acquired user credentials (whether privileged or not) play a role in most recent intrusions and data breaches. As a result, stolen credentials, guessed passwords, pass-the-hash and other means of gaining system access present an exciting challenge to those folks who already woke from the mortal slumber of “isn’t my firewall and AV enough?”

So, what weapons do most organizations have in their arsenals to deal with this problem, apart from good luck?

  1. Manual log review – well… good luck catching access from the accounts that “changed ownership” overnight. Even if you record and aggregate all authentication logs (no mean feat!), finding those meaningful events is next to impossible.
  2. Scripts written to look for specific account abuses – sure, with 13400 lines of Perl (Python, if you have to) you can do anything, but we are talking about a lot of work here; and not just coding work, but also research and data analysis.
  3. SIEM – with correlation rules like “if you see failed login 5 times, followed by a successful login with the same username, ALERT” and naïve baselining like “if 50% increase in failed login during the past hour, ALERT”, the scale is clearly tipped in favor of attackers.
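
To make item 3 concrete, here is a minimal Python sketch of those two rules. The event format (time-ordered `(username, outcome)` tuples) and the function names are assumptions made for illustration, not any SIEM product's rule syntax.

```python
from collections import defaultdict


def failed_then_success(events, threshold=5):
    """Return usernames with `threshold` or more consecutive failed logins
    immediately followed by a successful one.

    `events` is a time-ordered iterable of (username, outcome) tuples,
    where outcome is "fail" or "success" (a hypothetical, simplified format).
    """
    streak = defaultdict(int)  # current run of failures per username
    alerts = []
    for user, outcome in events:
        if outcome == "fail":
            streak[user] += 1
        else:
            if streak[user] >= threshold:
                alerts.append(user)
            streak[user] = 0  # any success resets the run
    return alerts


def naive_baseline_alert(prev_hour_failures, this_hour_failures, increase=0.5):
    """Return True if failed logins grew by at least `increase` (50% by default)
    compared with the previous hour."""
    if prev_hour_failures == 0:
        return this_hour_failures > 0
    growth = (this_hour_failures - prev_hour_failures) / prev_hour_failures
    return growth >= increase
```

Note what the sketch cannot see: an attacker who logs in with a stolen password on the first try produces no failure streak and no spike, so neither rule fires.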

That’s it! No, seriously, that’s it. I hope my readers will share what percentage of all user accounts across their organizations are, in their opinion, compromised (as in: owned by somebody other than their benign legitimate owner), but I am willing to bet that we are talking about ~1% (similar to the number of owned boxes).

In fact, consider a 30,000 endpoint organization (which is not huge) – apart from the main Windows accounts for each IT user, there are internal application passwords, credentials for HR systems, expense management, travel sites, internal collaboration sites, etc. Sure, some are federated (i.e. enabled for “own one – own them all” malicious convenience…), but many are standalone. Who knows what those accounts are used for – and by whom? On top of that, those businesses that have online accounts (ecommerce, finance, etc.) have it much worse: not just employees, but customers have accounts too, and those may run into tens of millions (and, in some cases, into hundreds of millions and – for some social networks – into billions).

In essence, even those organizations that have “all the logs” and “all the context” (such as IAM connectors to enrich log data) may still be utterly lost. Rules and simple baselines just don’t cut it for 50K user accounts used round the clock. They have ALL of the data and NONE of the insight, all the facts and none of the story. Sure, when things hit the fan, you will be able to get to the bottom of it – or those IR consultants you hire will, but there is no way to know before it is too late…

How do we solve this?

Here is where the emerging UBA tools come in – and these combine extended identity awareness (account username –> identity –> employee [the latter comes via HR system data pulls … if you can pull it off, that is :–)]) and non-trivial statistical algorithms [sidenote: how do you quantify that? Handwritten rules and basic counting stats are trivial indeed, but what about other methods?]. Of course, the analytics will succeed only if the hypothesis of “once the account changed hands it will behave measurably differently from its regular use” is true.
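
The “account username –> identity –> employee” chain can be sketched as two lookups. Everything below is hypothetical: the account names, the identity IDs, and the shape of the IAM and HR data are invented for illustration, and a real deployment would pull them from whatever IAM and HR systems are in place.

```python
from collections import defaultdict

# Hypothetical IAM export: account name -> identity ID.
ACCOUNT_TO_IDENTITY = {
    "jsmith": "id-1001",
    "corp\\jsmith": "id-1001",
    "svc-expense-js": "id-1001",
    "mlee": "id-1002",
}

# Hypothetical HR data pull: identity ID -> employee record.
IDENTITY_TO_EMPLOYEE = {
    "id-1001": {"name": "J. Smith", "department": "Finance"},
    "id-1002": {"name": "M. Lee", "department": "Engineering"},
}


def resolve_employee(account):
    """Map an account name to an employee record, or None if unknown."""
    identity = ACCOUNT_TO_IDENTITY.get(account.lower())
    if identity is None:
        return None
    return IDENTITY_TO_EMPLOYEE.get(identity)


def events_by_employee(events):
    """Group (account, action) events under the employee who owns the account.

    Accounts that cannot be resolved are collected under "UNRESOLVED",
    which is itself worth a look.
    """
    grouped = defaultdict(list)
    for account, action in events:
        employee = resolve_employee(account)
        key = employee["name"] if employee else "UNRESOLVED"
        grouped[key].append((account, action))
    return dict(grouped)
```

Grouping by employee rather than by account is what lets later analytics compare one person's activity across several accounts.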

In any case, we need:

1. Data – sources for activity and context data, such as connection, login, system access, data access logs, but also those of interaction with business applications

2. Methods – algorithms, data enrichment approaches, visualizations

3. Specific use cases – e.g. stolen account logic may not work for counter-insider problems

BTW, in my upcoming paper, I will look at UBAs and other analytics-related security tools with this lens: source data, methods, use cases. For now, let’s think aloud here:

| | Details | Example |
|---|---|---|
| Data | logs (from SIEM or directly), DLP alerts, flows, network metadata; IAM data, HR data, etc. | system authentication logs + Active Directory user information |
| Methods | supervised machine learning, unsupervised learning, statistical modeling, etc. | self-to-self comparison, peer comparison, activity model vs time |
| Use cases | compromised account detection, pre-departure data theft, employee sabotage, shared account abuse, etc. | detect account takeover by a malicious external attacker |
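
As a thought experiment, the “self-to-self comparison” and “peer comparison” methods could look like the sketch below. It uses a plain z-score, which is my choice for illustration and sits squarely in the “basic counting stats” camp; the function names, the threshold of 3.0, and the sample numbers are all assumptions.

```python
import math
from statistics import mean, pstdev


def zscore(value, reference):
    """Standard score of `value` against a list of reference observations."""
    if len(reference) < 2:
        return 0.0  # not enough data to judge
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        # No spread in the reference: any deviation at all is extreme.
        return 0.0 if value == mu else math.copysign(math.inf, value - mu)
    return (value - mu) / sigma


def self_to_self(today, own_history, threshold=3.0):
    """Flag an account whose activity today departs from its own past."""
    return abs(zscore(today, own_history)) >= threshold


def peer_comparison(today, peers_today, threshold=3.0):
    """Flag an account whose activity today departs from its peer group."""
    return abs(zscore(today, peers_today)) >= threshold


# Hypothetical daily counts of distinct systems accessed by one account.
history = [10, 12, 11, 9, 10, 11, 12]
suspicious = self_to_self(40, history)  # far outside the account's own range
normal = self_to_self(11, history)      # well within it
```

Both checks rest on the hypothesis above: they only help if a hijacked account actually behaves measurably differently from its regular use or from its peers.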

Note that these are examples only, not representative of any particular vendor methodology. BTW, here is a fun sidenote: I was told [multiple times] that more UBAs are bought for APT/lateral movement detection than for the insider threat; that is, for user behavior anomalies due to hijacked, shared, or stolen credentials, NOT for abuse of credentials by their owner for their own M.I.C.E. motivations.

P.S. The post title is also wrong: it should have been “Those Pesky Users: How To Catch Bad Usage of Good Accounts AND Bad Usage of Accounts Created by Bad People” 🙂





  • For those environments unable to wield the power of ML algorithms, the answer is one word: baselining.

  • @Anton Naive baselining like “50% up” is probably wrong in 99% of cases for finding maliciousness 🙂