Let’s play a little game: on the table in front of you are three black boxes. I want you to give me your credit card number and I will put it into one of those boxes to keep it safe for you. The boxes are labeled, with some detail about what goes on inside:
- Box A: Encryption – use me! I will encrypt your card number with a 256-bit AES key, using FIPS 140-certified encryption, store the encrypted value wherever the card number is needed, and protect the key in this vault over here.
- Box B: Tokenization – use me! I will do the same encryption as dopey Box A, but I will store the encrypted card number only once, here (and the key over there). Everywhere else I will simply store a token value that I create. The token can be used, over an authenticated protocol I built, to get associated data, but never to get the actual card data.
- Box C: Data Striping – choose me! I’m not going to do any of that silly encryption stuff. Instead, I will chop your card number up into little teeny tiny pieces and store each piece separately, scattered all over the world. When you need your card data back, I will retrieve it (using a key I created and stored over there), but no one else will ever be able to get your card number, even if they stole some of the disk drives storing some of those teeny tiny pieces.
We are all familiar with Box A. We use it all the time, and there are well-known standards for making sure it is done right – but it is very, very easy to do encryption wrong.
Avivah Litan and I wrote about Box B in the Gartner Research Note “Using Tokenization to Reduce PCI Compliance Requirements,” where we pointed out that there are no standards for tokenization and that each implementation has to be evaluated individually. So you still have proven encryption, but you’ve added some non-standard complexity.
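To make Box B concrete, here is a minimal sketch of the idea in Python. It is an illustration under stated assumptions, not any vendor's implementation: the `TokenVault` class and its method names are hypothetical, the vault keeps card numbers in a plain dictionary where a real one would encrypt them as in Box A, and the authenticated protocol is left out entirely.

```python
import secrets


class TokenVault:
    """Toy token vault: the only place a card number is ever stored."""

    def __init__(self):
        # token -> card number. A real vault would store this encrypted
        # (as in Box A) with the key held elsewhere; the plain dict is an
        # assumption made here to keep the sketch short.
        self._cards = {}

    def tokenize(self, card_number: str) -> str:
        # The token is random, so it has no mathematical relationship
        # to the card number and cannot be reversed without the vault.
        token = secrets.token_hex(16)
        self._cards[token] = card_number
        return token

    def associated_data(self, token: str) -> dict:
        # Callers get data derived from the card, never the card itself.
        card = self._cards[token]
        return {"last4": card[-4:], "length": len(card)}


vault = TokenVault()
token = vault.tokenize("4111111111111111")  # well-known test card number
info = vault.associated_data(token)
```

Every other system stores only `token`. Because nothing standard governs how the vault issues tokens or answers lookups, everything that matters about its security lives in code like this, which is why each implementation has to be evaluated on its own.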
Cloud storage providers such as Google, and EMC with the recently announced data protection features in its Atmos cloud storage, use Box C to claim their approach is secure. But how do they retrieve the data? They must keep some mapping of where those teeny tiny pieces went, or you’d never get your data back. That sounds an awful lot like an encryption key, just not done in any standard way.
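A rough Python sketch shows why the mapping behaves like a key. This is an assumption-laden toy, not how Google or EMC actually do it: the node names, the `stripe` and `reassemble` functions, and the two-character piece size are all made up for illustration.

```python
import secrets


def stripe(data: str, nodes: dict, piece_size: int = 2) -> list:
    """Scatter fixed-size pieces of data across storage nodes.

    Returns the retrieval map: an ordered list of (node name, slot id).
    """
    retrieval_map = []
    node_names = sorted(nodes)
    for start in range(0, len(data), piece_size):
        node = secrets.choice(node_names)   # pick a node at random
        slot = secrets.token_hex(8)         # random slot id on that node
        nodes[node][slot] = data[start:start + piece_size]
        retrieval_map.append((node, slot))
    return retrieval_map


def reassemble(retrieval_map: list, nodes: dict) -> str:
    # Without the map, the order and location of the pieces are unknown.
    return "".join(nodes[node][slot] for node, slot in retrieval_map)


# Hypothetical storage nodes, each modeled as a dict of slot id -> piece.
nodes = {"us-east": {}, "eu-west": {}, "ap-south": {}}
card = "4111111111111111"  # well-known test card number
retrieval_map = stripe(card, nodes)
recovered = reassemble(retrieval_map, nodes)
```

Here `retrieval_map` is the only thing standing between the scattered pieces and the original card number, so it has to be generated, stored, and protected with the same care as an encryption key. Note also that each piece sits on its node unencrypted in this sketch.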
The bottom line is that it is all about how secure the “retrieval key” is in all three boxes. Box A has a major advantage in using only standard, well-known techniques, but of course ten years ago encryption was not standardized either. It probably won’t take that long for tokenization standards to evolve, as there are already existing protocols that aren’t too far off.
Data striping is a whole nuther deal. It never really started out as a privacy technology; it has been driven by the need for an inexpensive way to raise data availability and reduce storage retrieval times. Demonstrating the security level of the mechanisms used to protect retrieval is largely starting from ground zero: it may be more secure, it may not be. The actual strength doesn’t even matter if the security level of data striping can’t be evaluated or demonstrated. Just “trust me, I know the Internet” isn’t enough.

John Pescatore