by Jay Heiser | March 28, 2013 | Comments Off
We’ve riffed for years on the distinction between “Dr. No” and “Mr/Ms Yes”, but many enterprises continue to back the security professional into the awkward far corner of the Business Prevention Department. If the risk assessor is going to be blamed for security failures, then that person is always going to be motivated to make extremely conservative decisions.
The idea that risk can be understood and managed with the goal of reducing the potential for negative outcomes, and their impact, is not a radical one. This is what risk management is all about. Unfortunately, it can only flourish in an atmosphere of cooperation and teamwork. Blame cultures are not conducive to making difficult decisions involving poorly understood forms of risk.
Employees operating within a culture of blame are motivated to value CYA at the personal level before the corporate one. If people feel they are going to lose their job, or experience losses of prestige or status, when they are associated with failures, then the organizational culture is providing them economic and social motivation to avoid risk. This counterproductive organizational dynamic plays out in spades in the intriguing yet ambiguous context of commercial cloud computing.
A blame culture typically approaches SaaS something like this:
- Somebody in the business thinks they can save money (or avoid IT’s annoyingly inflexible rules) by using some kind of cloud service.
- They put together a business case that contains nothing but good news and beneficial financial outcomes.
- Contracting staff is asked to provide contract language that a) ensures that nothing bad can happen, and b) will be completely acceptable to the service provider (which has a reputation of not negotiating substantive contractual provisions).
- The IT contracting staff balks at this impossible task, is treated harshly, and is accused of empire building and of being uncooperative.
- Meanwhile, the security staff is asked to approve a deal in which the buyer hasn’t stated their security requirements and the seller refuses to explain how their system actually works.
- The security staff balks at this impossible task, and is treated harshly. Treated as being deficient in imagination, it is accused of being out of touch and is characterized as participating in business-disabling power games.
- Presented with a binary choice, the people who have the expertise to understand and mitigate the risk do what the blame culture motivates them to do and say that they cannot approve this deal.
- The line of business makes it clear that they believe these in-house functions cause more harm than good, and strongly suggests firing the lot of them.
The tragedy of this all-too-common scenario is that few, if any, of these people were actually dead set against the externally provisioned service in the first place. Life is full of ambiguity, and significant business decisions always require someone being willing to accept a risk. If the person who benefits from the positive outcome of a decision is also the person who will accept the blame for a negative outcome, then an organization is positioned to take advantage of new forms of service. If somebody wants to save money, while dumping the negative consequences into somebody else’s lap, it should come as no surprise that the owners of those laps have developed mechanisms for pushing back.
It takes a well-coordinated team to say yes to an ambiguous risk question.
Category: Cloud IT Governance risk management security Tags: risk assessment, risk management
by Jay Heiser | March 20, 2013 | 1 Comment
It would be the rare soul indeed, who, after spending hours or even days cleaning up from a hack, didn’t feel the strong red rage of revengeful urges. And how many PC owners or site managers, still recovering lost data, time, and pride, if presented an opportunity to strike back at their attacker, to make that anonymous bully feel the same pain, would not be sorely tempted to undertake an act of violence and coercion themselves?
The idea that the victim of a computer crime might attempt not only to trace back the attack, but also to retaliate in some form, is hardly a new one. It’s a Gibsonesque theme that resonates through decades of cyberpunk novels. But the volume of discussion around the topic has been ramping up, a form of legalistic debate that is probably indicative of the underlying smoke of mysterious attacks, and even more mysterious hackbacks. Now that the topic has been discussed in the hallowed halls of the US Congress, it’s more likely than ever to become a topic not just for the family dinner table, but for the corporate policy committee, and of course the national government.
It seems that the act of responding in kind to a computer attack is technically illegal in the USA, as it is in many places in the world. This is not something that has been widely tested through case law, and as a general legal principle, the right to self defense is widely recognized. But it’s a can of legal, practical, and moral worms.
Hackbacks are nothing new. Whenever value must be protected in an unregulated competitive system, individuals are economically incentivized to take the law into their own hands. Just as drug lords defend their honor and turf through physical violence, some cybercriminals resolve their disputes on servers with obscure domain names. Sometimes, a spammer, vandal, bot master, or criminal hacker has the misfortune to attack someone with the skills and personality necessary to respond in kind. This has literally taken place for decades, out of sight and out of mind for the overwhelming majority of Internet citizens.
As the impact of cyber crime continues to grow, it seems to inevitably lead to greater discussion about what to do about it. Historically, when populations become fed up with coercion and violence, they band together to promote self protection. Depending upon the degree of frustration, Neighborhood Watches can evolve into posses and even escalate to vigilantism. We are already seeing a form of that today with the self-styled Robin Hood approach of the loosely formed network army that refers to itself as Anonymous.
Without taking a stand on either the legality or appropriateness of hackbacks, I’m confident in saying that conducting reverse hacks is more than impractical for the overwhelming majority of Internet victims, and the potential for collateral damage to other hacking victims is extremely high. But I’m also confident in the expectation that as the feelings of digital victimhood continue to grow, the response will be demands for dramatic protective action. I really don’t know what form that will take, but the coming decade is likely to be an interesting one for both cops and robbers.
Category: Policy risk management security Tags: hack back, hackback, hacking, law, retaliation
by Jay Heiser | February 28, 2013 | Comments Off
Any time your internal policies include the lawyerly language “Includes, but not limited to…”, it should be a sign that somebody needs to reexamine the text.
This is often a sort of cop out, an admission on the part of the policy writer that they actually do not know what the rules should be—but a warning that if you do not follow these yet-to-be specified rules, you will be in trouble. It doesn’t constitute useful guidance.
Choose your policy battles carefully. There is only so much influence you can exert over end user behavior through written policies, so don’t squander the attention and patience of your users with vague warnings, puzzles, and scavenger hunts.
If you cannot tell your end users what specifically they must do, or must not do, and if you cannot provide them with useful principles that would reasonably allow them to figure it out on their own, then you’ve got no basis for a policy element.
Category: IT Governance Policy Tags:
by Jay Heiser | February 27, 2013 | 2 Comments
“We have decided to do this new thing. We think it has risks. What should we do to make sure that it doesn’t have any risks? This new thing that we’ve decided to do. Without knowing what the risks are, or whether the best practices for risk mitigation have matured.”
Category: risk management Tags:
by Jay Heiser | February 15, 2013 | 1 Comment
As 4,200 disgruntled holiday goers, trapped on the ironically named cruise ship Triumph, finally end their 5-day ordeal, it serves as a reminder that the eggs can have more stake in the state of the basket than the basket holder does.
From the point of view of the cruise line, each booked up ship represents a concentration risk, containing thousands of human beings, their fate, indeed their very lives, dependent upon the correct functioning of a very large and complex system. From the point of view of the passengers, a cruise ship represents recovery risk.
While cloud computing has been relatively smooth sailing for the majority of its passengers, there have been multiple multi-day incidents that required a recovery process of uncertain duration, with ambiguous hopes for success. There have even been clouds that ran aground in the shallow waters of a highly competitive marketplace, leaving their passengers permanently stranded.
Most cloud service providers are able to weather a single packet storm, returning to operational status and compensating their customers with credit for time lost, and maybe even a bit of extra credit. For those who haven’t had their enthusiasm permanently squelched by 5 days without toilets or dinner, the cruise line is offering free cruises to the victims of this mishap. Many of the unfortunate Triumph passengers have lost a week’s worth of vacation, something that they can never recover. Likewise, when a cloud fails, thousands of customers are likely to experience forms of loss that cannot be compensated for.
Both the cloud and cruising industries have proven relatively reliable, but failures do happen. One lesson that cloud customers can take from a series of vacation-ending fires and floodings is that when a single incident simultaneously impacts thousands of customers, the recovery will be slow and frustrating, and the provider will have no way of compensating their customers for their lost time.
Category: Cloud risk management Tags: cloud failure, cloud risk, concentration risk, portfolio risk, recovery risk, risk, risk management
by Jay Heiser | January 9, 2013 | 1 Comment
Today’s library user takes electronic catalogs for granted. Being able to remotely search the contents of a library is not only convenient, but it also allows for a tighter integration between the lending practices—you can see if a book is loaned out.
During a period of several decades, a number of service firms made a very profitable business out of the digitization of the paper-based library catalogs used by public, educational, and private libraries. Old-fashioned card catalogs were a form of analog database, with each card constituting a single record.
The structured data (title, author, publication date, subject, LOC number, etc.) could easily make the transition from paper form to electronic database. However, many librarians had added unstructured data to many of the cards in their catalog, including information on the quality and status of the book, and other comments that would be useful to either the maintenance of or reference to the book. These annotations represented a rich set of stored knowledge that was largely lost during the brute force digitization process.
Annotations are a form of metadata that, because of their informality, are typically not recognized as having organizational value. Life goes on, and the loss of several generations’ worth of neatly scribbled notations around the edges of well-rounded index cards is hardly the biggest problem confronting today’s library.
Are we likewise putting organizational knowledge at risk by not providing our users with a robust and portable annotation mechanism to support their use of digital documents? This has obviously not been an unsustainable problem. It’s debatable just how much electronic marking up has taken place on workstations and laptops, but the ubiquity of tablets, which are clearly much more convenient for the reading (and annotation) of longer documents and books, likely means that the sum total of digital annotations is growing at an accelerating rate.
What’s the value to the enterprise of the stored knowledge represented by digital document annotations?
Should the CIO be looking for ways to facilitate the creation and exploitation of this form of stored knowledge?
Does it represent a form of metadata that is worth managing and protecting to ensure that it is available as long as it is useful?
Category: Applications risk management Tags: annotations, metadata
by Jay Heiser | January 4, 2013 | 1 Comment
We’ve recently moved house, and my collection of books, many of them heavily marked up with multi-colored highlights, Post-Its, and bookmarks, remains something of a storage issue. Over the last several months, I’ve been experimenting with digital books on an iPad.
There’s a lot to be said both for and against services like Amazon’s Kindle and Apple’s iBooks. The selection and convenience are strong positives, and eBooks not only don’t fill up my groaning Swedish flatpack bookshelves, they also cost less, which is no small consideration for a heavy reader. I might read a paperback novel, or borrow one from the library, and never need to refer to the thing again. I’ve subscribed to a weekly UK photography magazine for 6 years, but it costs a lot more since we moved to the States. I rarely save the paper copy, so why not save some money and some trees by reading this, and other magazines, online?
However, if I spend hours working my way through a non-fiction book, marking it up and ‘penciling in’ comments, it’s done with the assumption of perpetual access to that book and my annotations. The primitive highlighting and markup functionality of Kindle and iBooks is annoying for the serious annotator, but my biggest concern about the commercial eBook model is that I’m totally beholden to the long term viability of the vendor. If I’m using a proprietary file format, locked up with a digital rights mechanism, I’m dependent upon access to that vendor’s server, and I’m dependent upon reliable support for my device (and its successors), indefinitely. It’s not a very open system.
On the plus side, if our house burns down, at least I’ve still got copies of all my eBooks. If I get stuck somewhere without my iPad, I can still access a relatively recent copy of an annotated book on my iPhone, and magazines can be downloaded on the fly. But for long term access to the intellectual property I’ve paid for, and for the added metavalue of my personal annotations, proprietary and rights-managed formats represent a significant risk. If the bookseller goes out of business, they take my books with them.
When you pay for paper, you are in control of the destiny of that document, and all of the metadata that you and other readers have added to that information medium. When you pay for an eBook, you are only leasing it. That’s a great model for light reading, but it’s detrimental to long term scholarship.
Category: Applications BCP/DR Cloud risk management Tags: contingency planning, continuity, DRM, ebooks, Kindle, PDF, rights management, standards
by Jay Heiser | November 28, 2012 | 1 Comment
Anyone with a stake in the overall success of cloud computing should take a few minutes to read the recent NYT interview with Peter G. Neumann, a highly-respected computer security researcher who, now entering his 9th decade, continues to do groundbreaking work on digital reliability.
Commercial cloud computing creates new levels of urgency for structural weaknesses that Dr. Neumann has been warning about for decades, including the dangers inherent in complex systems and in monocultures.
Concerns such as this are often treated as being hypothetical—outside of the community of academics and government researchers who spend their lives working in the field of digital security. Neumann’s scientific opinion represents what is considered orthodox within this field.
There really is no room for doubt that the robustness of our current computing environment, not the least of which includes the complex Internet-enabled public ‘cloud’, is to a large degree dependent upon ‘band-aids’, and fails to take full advantage of a half century of research into computer security. The open question that Dr. Neumann cannot answer is how long this continues to be sustainable.
The reality of most of the human-designed world is that it is non-optimal, and kludged together, but we muddle along pretty well in spite of poor design and misplaced priorities. Today’s compute environment may last for decades, as we continue to extend last century’s flawed architectures and sloppy code across increasingly complex and exposed service offerings, patching security and reliability holes with digital chewing gum and baling wire. If this does eventually become unsustainable, it’s good to know that some highly-qualified researchers have been putting a lot of effort into ‘rethinking the computer.’
Category: BCP/DR Cloud risk management security Tags: complexity, Peter G. Neumann, security, security history
by Jay Heiser | November 2, 2012 | 2 Comments
Our home telephone is totally dependent upon the electrical power grid, and a lead acid battery of unknown age is all that stands between us and total loss of external connectivity.
Fiber to the home, which we’ve now had in 2 different houses, represents an opportunity for high speed, flexibility, and economics, providing a single source for television, telephone, and Internet. Unlike analog phones and broadcast TV, ‘advanced residential communications’ in ‘Smart Neighborhoods’ offering ‘Blazing Speed’ that will ‘exceed your expectations’ are totally dependent upon a powered-up interface box. Unlike an old-fashioned copper phone line, or a TV antenna, you can’t receive fiber optic transmission without a powered device that splits out the three services and interfaces them to the in-home wiring. If the power goes out, the fiber no longer blazes—it flares out.
In order to maintain telephone service, high-tech homes have a backup battery hidden in the customer premises equipment. Nobody claims these batteries last over 8 hours, they are not routinely maintained, and common wisdom is that they often do not last that long.
It isn’t just the home fiber interface that requires power. The (currently unapproved) franchise agreement between our provider and the county requires 2 hours of backup for all distribution amplifiers and fiber optic nodes, 24 hours for all head end tower and HVAC, and at least one dispatchable portable generator to do something somewhere. I don’t know how reassuring that is to people who have already experienced 2 multiday power outages this year.
Clearly, there are reliability advantages to the plain old telephone system (POTS), which only requires emergency power at the central office. Given a choice, telecommuters with 2 lines sometimes do decide to make one of them analog—but increasingly, you don’t get that choice. Once a neighborhood switches over to fiber, the providers become extraordinarily reluctant to support copper. Our new neighborhood has no POTS, and the single telecom provider has exclusive cabling rights for the remainder of my lifetime—and well beyond.
Obviously, there are many advantages to wireless, which becomes the channel of choice when the home or office phone is powered out. Unfortunately, it tends to fail when it is most needed. After Hurricane Katrina, the FCC attempted to force providers to include 8 hours of backup for all cells (which would barely last past the excitement of the storm). This 2007 blog post, correctly discussing the unlikelihood of that happening, states, “Well, we are likely headed for the big one here soon and it stands to reason we’ll want to have some cell phone service in the aftermath. As we saw last month during a 5.6 earthquake, you don’t have to have cell towers go down to lose service. There was enough congestion in that first hour to bring conversations to a halt. But in a much bigger scenario, having additional power could keep information flowing in the hours after a disaster, helping speed aid and relief to the right places.” New York and New Jersey have just had their big ones, and information is still not flowing in the aftermath of that disaster.
Reporters based in New York City, and Gartner staff living in the areas hardest hit by Sandy, have reported total failures of cell phone service in their neighborhoods, with some providers apparently doing worse than others. The FCC reported yesterday that “the number of cell site outages overall has declined from approximately 25 percent to 19 percent” (the perceptive observer might ask, percentage of what population of sites?).
In addition to significant traffic increases during a natural disaster, there are at least 3 reasons for cell phone failure, with the first one being particularly acute for cell systems:
- Power: Batteries get drained pretty quickly. While a growing number of cells do have generators, the generators need fuel replenishment, which in the post-Sandy world is becoming a logistical problem for several reasons. At the same time that the power grid is coming back online, a growing number of cell sites are running out of backup power.
- Physical damage: Wind damage to antennas, or water damage to electronics, can impact service, and it takes time after a disaster to deploy existing repair crews across a transportation-challenged region.
- Network failures: The backhaul networks between towers and the switching offices are subject to physical damage, especially from flood water, and they require electrical power (see Power above).
There’s a lot to be said for the continuity advantages of POTS and analog phones, but outside rural areas, it’s likely to be phased out in favor of home digital connectivity and cell phones. If you want to do some contingency planning, you might want to scout your neighborhood for pay phones.
Specific details on the post-Sandy status of each wireless provider can be found in yesterday’s NYT blogs.
Category: BCP/DR risk management Tags: cell phones, Hurricane Sandy, power failure, redundancy, Sandy
by Jay Heiser | October 30, 2012 | 1 Comment
Preparing for Sandy’s imminent arrival, I didn’t fill up any bathtubs with water, but I did charge up all the phones, tablets, and MiFis in the house. Frankenstorm didn’t end up having a huge impact on my part of the country, and we never suffered a prolonged power outage. My son, holed up in his dorm at what is currently a very quiet university, has gone 14 hours without power. I suggested that it looked like he’d have an additional 2 days this week to study. He reminded me that all of his textbooks are digital.
Under pressure to reduce the weight and volume of printed matter in the house, I’ve been experimenting with eBooks on my iPad. Assured that I’ll love reading books electronically—once I get used to it—I’m still trying to figure out how to change the color of the highlighting. I miss all those colorful Post-It tabs sticking out the sides of the pages. Digital format seems like a great way to read things that you’ll throw away, like beach novels and magazines, but the annotation mechanisms are still weak, and the aesthetic satisfaction of a crowded bookshelf is totally missing.
Recognizing the convenience of being able to stuff multiple books and magazines, not to mention thousands of podcasts, into a single slim device, I’m ready for an upcoming multi-day trip. Even if I get delayed by weather, I should still have plenty to read. While my battery lasts. In many ways, the digital option is a lot more convenient, but it’s dependent upon external power. I wonder how many people Sandy has trapped between a tablet and an empty battery.
While it is way too early to begin collecting continuity and recovery lessons from Sandy’s aftermath, the fact that only one hospital outage has been reported suggests that a lot of emergency power systems worked very well last night. NYU’s Langone Medical Center lost power last night (as, less dramatically, did Coney Island Hospital), and several sources today have reported that not only did the backup power fail, but so did the backup to the backup. Back in June (the other ‘storm of the century’ earlier this year, not to be confused with last year’s storm of the century), Amazon experienced a similar (failure)³ when a single incident took both utility substations offline, followed by an overheated generator, and then a failure due to the misconfiguration of the secondary backup.
Anyone who has spent significant time dealing with data centers, or any other critical system, likely has multiple war stories about failed power. It’s a mundane but important topic. Microsoft has been bemoaning the lack of researchers, developers, and engineers, but maybe what we really need are more mechanics and electricians.
Category: BCP/DR Cloud risk management Tags: contingency planning, electricity, Hurricane Sandy, power, redundancy, weather