Gartner says “Malware Is Already Inside Your Organization; Deal With It.” But you know what? I wish it were just stupid malware (well, some is not so stupid): via a plethora of remote access methods, human attackers are also inside. BTW, I don’t mean the actual “insiders”, seemingly nobody cares about those nowadays :–)
Result? The 2014 Verizon Data Breach Investigations Report, everybody’s favorite security reference, reminds us that default, stolen, hijacked, or otherwise mal-acquired user credentials (whether privileged or not) play a role in most recent intrusions and data breaches. As a result, stolen credentials, guessed passwords, pass-the-hash and other means of gaining system access present an exciting challenge to those folks who have already woken from the mortal slumber of “isn’t my firewall and AV enough?”
So, what weapons do most organizations have in their arsenals to deal with this problem, apart from good luck:
- Manual log review – well… good luck catching access from the accounts that “changed ownership” overnight. Even if you record and aggregate all authentication logs (no mean feat!), finding those meaningful events is next to impossible.
- Scripts written to look for specific account abuses – sure, with 13400 lines of Perl (Python, if you have to) you can do anything, but we are talking about a lot of work here; and not just coding work, but also research and data analysis.
- SIEM – with correlation rules like “if you see failed login 5 times, followed by a successful login with the same username, ALERT” and naïve baselining like “if 50% increase in failed login during the past hour, ALERT”, the scale is clearly tipped in favor of attackers.
That’s it! No, seriously, that’s it. I hope my readers will share what percentage of all user accounts across their organizations are, in their opinion, compromised (as in: owned by somebody other than their benign legitimate owner), but I am willing to bet that we are talking about ~1% (similar to the number of owned boxes).
In fact, consider a 30,000 endpoint organization (which is not huge) – apart from the main Windows accounts for each IT user, there are internal application passwords, credentials for HR systems, expense management, travel sites, internal collaboration sites, etc. Sure, some are federated (i.e. enabled for “own one – own them all” malicious convenience….), but many are standalone. Who knows what those accounts are used for – and by whom? On top of that, those businesses that have online accounts (ecommerce, finance, etc.) have it much worse: not just employees, but customers have accounts too, and those may run into tens of millions (and, in some cases, into hundreds of millions and – for some social networks – into billions).
In essence, even those organizations that have “all the logs” and “all the context” (such as IAM connectors to enrich log data) may still be utterly lost. Rules and simple baselines just don’t cut it for 50K user accounts used round the clock. These organizations have ALL of the data and NONE of the insight, all the facts and none of the story. Sure, when things hit the fan, you will be able to get to the bottom of it – or those IR consultants you hire will – but there is no way to know before it is too late…
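To make the limits of the “failed login 5 times, followed by a successful login” rule concrete, here is a minimal Python sketch of it. The event format (time-ordered `(username, outcome)` tuples) and the function name are assumptions made for illustration; this is not how any particular SIEM implements correlation.

```python
from collections import defaultdict

FAILED_THRESHOLD = 5  # from the rule quoted earlier: "failed login 5 times"

def naive_bruteforce_rule(events):
    """Return usernames that log in successfully right after >= 5 consecutive failures.

    `events` is an iterable of (username, outcome) tuples in time order, where
    outcome is "fail" or "success" (a hypothetical, simplified log format).
    """
    consecutive_failures = defaultdict(int)
    alerts = []
    for username, outcome in events:
        if outcome == "fail":
            consecutive_failures[username] += 1
        elif outcome == "success":
            if consecutive_failures[username] >= FAILED_THRESHOLD:
                alerts.append(username)
            # Any successful login resets the failure streak.
            consecutive_failures[username] = 0
    return alerts

# Noisy password guessing trips the rule...
guessing = [("alice", "fail")] * 5 + [("alice", "success")]
# ...but a login with an already-stolen password is just one more "success".
stolen = [("bob", "success")]

print(naive_bruteforce_rule(guessing))
print(naive_bruteforce_rule(stolen))
```

The second case is the point: an attacker holding valid credentials never generates the failures the rule is waiting for, so the rule stays silent.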
How do we solve this?
Here is where the emerging UBA tools come in: they combine extended identity awareness (account username –> identity –> employee [the latter comes via HR system data pulls … if you can pull it off, that is :–)]) and non-trivial statistical algorithms [sidenote: how do you quantify that? Handwritten rules and basic counting stats are trivial indeed, but what about other methods?]. Of course, the analytics will succeed if the hypothesis of “once the account changed hands it will behave measurably differently from its regular use” is true.
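The “account username –> identity –> employee” chain can be sketched roughly as below. Everything here is hypothetical: the record layout, the account names, and the helper names are invented for illustration, and a real HR or IAM pull would be far messier.

```python
from collections import defaultdict

# Hypothetical HR/IAM extract: one record per employee, listing every account they own.
hr_records = [
    {"employee_id": "E1001", "accounts": ["CORP\\asmith", "asmith@expenses", "a.smith@travel"]},
    {"employee_id": "E1002", "accounts": ["CORP\\bjones", "bjones@expenses"]},
]

# Hypothetical, already-parsed log events from different systems.
events = [
    {"account": "corp\\asmith", "system": "windows", "action": "login"},
    {"account": "asmith@expenses", "system": "expenses", "action": "export"},
    {"account": "bjones@expenses", "system": "expenses", "action": "login"},
    {"account": "svc-backup", "system": "windows", "action": "login"},
]

def build_account_index(records):
    """Map each (case-normalized) account name to the employee who owns it."""
    index = {}
    for record in records:
        for account in record["accounts"]:
            index[account.lower()] = record["employee_id"]
    return index

def group_events_by_employee(log_events, index):
    """Attach every event to an employee; unmapped accounts land under 'UNKNOWN'."""
    grouped = defaultdict(list)
    for event in log_events:
        owner = index.get(event["account"].lower(), "UNKNOWN")
        grouped[owner].append(event)
    return dict(grouped)

index = build_account_index(hr_records)
grouped = group_events_by_employee(events, index)
for owner, owned_events in grouped.items():
    print(owner, [e["account"] for e in owned_events])
```

The `UNKNOWN` bucket is arguably the interesting one: accounts that nobody can tie to a person are exactly the “who knows what those accounts are used for” problem.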
In any case, we need:
1. Data – sources for activity and context data, such as connection, login, system access, data access logs, but also those of interaction with business applications
2. Methods – algorithms, data enrichment approaches, visualizations
3. Specific use cases – e.g. stolen account logic may not work for counter-insider problems
BTW, in my upcoming paper, I will look at UBAs and other analytics-related security tools with this lens: source data, methods, use cases. For now, let’s think aloud here:
| |Range of options|One narrow example|
|---|---|---|
|Data|logs (from SIEM or directly), DLP alerts, flows, network metadata; IAM data, HR data, etc|system authentication logs + Active Directory user information|
|Methods|supervised machine learning, unsupervised learning, statistical modeling, etc|self-to-self comparison, peer comparison, activity model vs time|
|Use cases|compromised account detection, pre-departure data theft, employee sabotage, shared account abuse, etc|detect account takeover by a malicious external attacker|
Note that these are examples only, not representative of any particular vendor methodology. BTW, here is a fun sidenote: I was told [multiple times] that more UBAs are bought for APT/lateral movement detection than for the insider threat; that is, for catching user behavior anomalies due to hijacked, shared, or stolen credentials, NOT abuse of credentials by their owner for their own M.I.C.E. motivations.
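For a feel of what “self-to-self comparison” and “peer comparison” might mean in practice, here is a deliberately simple Python sketch. The z-score is my own stand-in for “statistical modeling”, the feature (distinct hosts accessed per day), the numbers, and the threshold of 3 are all made up, and none of this reflects any vendor’s actual method.

```python
import math
from statistics import mean, pstdev

def z_score(value, reference):
    """How many standard deviations `value` sits from the mean of `reference`."""
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        # No variation in the reference data: any difference is "infinitely" unusual.
        return 0.0 if value == mu else math.copysign(float("inf"), value - mu)
    return (value - mu) / sigma

def score_account(today, own_history, peer_values):
    """Compare today's activity with the account's own past and with its peers."""
    return {
        "self_to_self": z_score(today, own_history),
        "peer": z_score(today, peer_values),
    }

ALERT_THRESHOLD = 3.0  # arbitrary cut-off, an assumption for this sketch

# Hypothetical feature: number of distinct hosts an account touched per day.
own_history = [3, 4, 3, 5, 4, 3, 4]   # the account's last seven days
peers_today = [4, 5, 3, 6, 4]         # same-department accounts, today

normal_day = score_account(4, own_history, peers_today)
odd_day = score_account(19, own_history, peers_today)

for label, scores in (("normal", normal_day), ("odd", odd_day)):
    flagged = any(abs(s) > ALERT_THRESHOLD for s in scores.values())
    print(label, scores, "ALERT" if flagged else "ok")
```

This only works to the extent that the hypothesis above holds: an account that changed hands has to behave measurably differently from its own past or from its peers.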
P.S. The post title is also wrong: it should have been “Those Pesky Users: How To Catch Bad Usage of Good Accounts AND Bad Usage of Accounts Created by Bad People”
Blog posts on the security analytics topic:
- Security Analytics Lessons Learned — and Ignored!
- Security Analytics: Projects vs Boxes (Build vs Buy)?
- Do You Want “Security Analytics” Or Do You Just Hate Your SIEM?
- Security Analytics – Finally Emerging For Real?
- Why No Security Analytics Market?
- SIEM Real-time and Historical Analytics Collide?
- SIEM Analytics Histories and Lessons
- Big Data for Security Realities – Case 4: Big But Narrowly Used Data
- Big Data Analytics Mindset – What Is It?
- Big Data Analytics for Security: Having a Goal + Exploring
- More On Big Data Security Analytics Readiness
- Broadening Big Data Definition Leads to Security Idiotics!
- 9 Reasons Why Building A Big Data Security Analytics Tool Is Like Building a Flying Car
- “Big Analytics” for Security: A Harbinger or An Outlier?
Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.