If a piece of sensitive data is visible to everybody with access to the organization’s network (say, posted to an internal file share), is that a data breach? Most people will say “no.” However, what about an organization with 100,000 users, lots of Internet links and many facilities? Such an organization can never have tight control over who can access its network, and it will have at least a dozen compromised or infected endpoints on that network at any time. All the whining about “perimeter is dead” somehow did not get connected to “the whole data breach thing.”
So, if …
- You have many thousands of users on your network
- You do not have tight Internet access policies, and old accounts are not always removed
- Your VPN access is via username/password, and such users are given access to the entire internal network (“just like in the office”)
- You probably allow BYOD
- Your anti-malware endpoint coverage is at 99%, with 99% of those endpoints updated (= you have thousands of machines with no effective AV – eh… provided you consider AV effective)
- You have legitimate and rogue wireless access all over the place
- (in other words, you are a “typical” large company today)
… do you still not consider an Excel spreadsheet full of credit card numbers on an internal file share to be a data breach?
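For the curious, here is a back-of-the-envelope check of the anti-malware bullet. It is only a sketch: it assumes one endpoint per user and borrows the 100,000-user figure from the example above, and the function name is made up for illustration.

```python
def unprotected_endpoints(total, coverage_pct, updated_pct):
    """Endpoints with no anti-malware at all, or with outdated anti-malware."""
    covered = total * coverage_pct // 100   # endpoints that have the agent installed
    current = covered * updated_pct // 100  # of those, endpoints with current updates
    return total - current

# Assumption: 100,000 endpoints, 99% covered, 99% of the covered ones updated.
print(unprotected_endpoints(100_000, 99, 99))  # prints 1990
```

So “99% of 99%” still leaves roughly two thousand machines without effective protection at this scale.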
In any case, with this long preface out of the way, I want to focus on DLP discovery capabilities. “Security industry lore” indicates that at least some of the recent data incidents involved theft of data from “other than authorized” locations. Indeed, why hack SAP if you can own a mere workstation where Excel exports from the same application abound? Why hack payment systems if you can own test systems where [in dire violation of PCI DSS] the same data resides? Why compromise a database inside the finance enclave if you can break into a backup server inside IT? The data in such locations is not as well protected (or: not protected at all), it is not encrypted, and access to it is not logged. It is essentially just there for the taking. Free data!!!
Thus, the phenomenon of “internally lost data” is way more pervasive than most people think. I’d bet that even if you think it is pretty pervasive, it is EVEN MORE pervasive than that. Confidential, regulated and “merely” sensitive data on “all access” internal file shares, SharePoint boxes, team web servers, internal blogs, etc. is literally all over the place.
And the fact that you don’t know where the data is does NOT mean that the attacker won’t find out. Back in 2002, when I was heavily involved in honeypot research, we had cases of attackers (who used to be called “script kiddies” back then) deploying simple data discovery scripts as a part of their initial system takeover (along with backdooring, IRC botting and patching the holes they used to break in). Do you think this knowledge is lost in the underground? No, it is not. Thus, you cannot simply rely on the obscurity of such data and the size of your messy, confusing IT environment.
Now, one may try to say that, as far as DLP technology is concerned, it is STILL more useful to detect what is being stolen now vs what is being exposed to the internal audience. “Yes, but!” If you only look at what is being taken out now, you are going to lose a DLP battle after a protracted, painful and frustrating fight. On the other hand, if you tighten down what is exposed internally AND watch for what is being taken out, you can lose the same battle with a lot more honor.
Moreover, you can do it even better: the “sniff -> scan” approach has worked well for some organizations. They first saw *it* on the wire, got mad – and then got curious: just where exactly is it stored internally? “Oh, in 537 different places!” Next they fought the battle to reduce the internal exposure and then – surprise! – the occurrences of that piece of data being seen on the wire decreased as well…
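The “scan” half of that approach can be sketched in a few lines: take the exact value that was seen on the wire and list every file that contains it. This is an illustration only, not how any particular DLP product works; `file_contains`, `locate_value` and their parameters are hypothetical names, and the sketch only does raw byte matching (no parsing of Office formats, archives or databases).

```python
import os

def file_contains(path, needle, chunk_size=1 << 20):
    """Return True if the byte string `needle` occurs anywhere in the file."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1  # bytes carried over so matches spanning chunks are found
    tail = b""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if needle in window:
                return True
            tail = window[-overlap:] if overlap > 0 else b""

def locate_value(needle, roots, chunk_size=1 << 20):
    """Walk every root directory and return the sorted paths holding `needle`."""
    found = []
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    if file_contains(path, needle, chunk_size):
                        found.append(path)
                except OSError:
                    continue  # unreadable file: skip it rather than abort the scan
    return sorted(found)
```

Something like `len(locate_value(b"<value seen on the wire>", ["<mounted share>"]))` is then the “in how many places?” number; both arguments here are placeholders.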
So, if you have a DLP tool, plan to use its discovery capabilities. Hit those shares, SharePoints, team servers, intranet web sites, etc., etc. And, yes, you need a process, not just a tool!
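To make “discovery” a bit more concrete, here is a minimal sketch of the idea for the credit card case: walk a directory tree, pull out digit runs that look like card numbers, and keep only those that pass the Luhn checksum. Everything in it (the function names, the 13 to 19 digit pattern, the 1 MB read limit) is an assumption made for illustration. It only reads text-like files such as CSV exports; a real DLP tool also parses Office documents, archives and more.

```python
import os
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by single spaces or dashes.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")

def luhn_ok(digits):
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text):
    """Return the digit strings in `text` that look like valid card numbers."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            hits.append(digits)
    return hits

def scan_tree(root, max_bytes=1_000_000):
    """Map each file under `root` to its number of card-like hits (hits only)."""
    findings = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    text = fh.read(max_bytes)  # only inspect the start of large files
            except OSError:
                continue  # unreadable file: note nothing, move on
            hits = find_card_numbers(text)
            if hits:
                findings[path] = len(hits)
    return findings
```

Pointing `scan_tree` at a mounted share gives you a list of files to chase down. That list is where the process part starts: somebody has to own each finding and decide whether the data gets moved, locked down or deleted.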