The Quest for Resilience: Has the Pendulum Swung Too Far from Prevention?
Carl Jung is credited with having said, “The pendulum of the mind oscillates between sense and nonsense, not between right and wrong” (emphasis mine), which succinctly captures the essence of a lot of human thought, not least in cybersecurity. Pendulums are also great metaphors for human thinking because, as in an idealized theoretical model of a pendulum, external forces don’t necessarily cause things to return to an equilibrium position. Technology generally and cybersecurity specifically are hardly immune from extreme pendulum swings. Despite some significant improvements in practices and outcomes, the industry continues to struggle, which raises questions about whether the pendulum is swinging in the right direction or whether it has swung too far.
The early days of cybersecurity were all about trying to prevent virtually every bad thing from happening. Indeed, to hear us talk, you’d think we collectively put a lot of stock in an old adage coined by another influential thinker: “an ounce of prevention is worth a pound of cure.” Being proactive instead of reactive certainly gets a lot of lip service but—again as with that idealized theoretical model of a pendulum—when we actually encounter the forces of the real world, the math gets a lot more… complicated.
The industry’s journey to the recognition that total prevention is impossible has been slow and arduous, but the idea of resilience, made possible through advancements in cloud-native principles and technology, is finally getting meaningful traction. In some respects, however, the new emphasis on resilience has simply permitted us to acknowledge the status quo: that despite our aspirations of being proactive, we are, more often than not, just reacting.
Detection and response is a textbook example of “reacting”
Resilience is all about something’s ability to recover from some kind of stress or other adverse condition(s). As a discipline, detection and response would seem to align perfectly. In reality, however, it’s closer to rehabilitation than resilience because detection is the process of identifying the adverse condition which, in all likelihood, has already been going on for some period of time. Whereas resilience suggests a generalized resistance and ability to recover autonomously, detection is about identifying specific adverse conditions. It is diagnosis, rather than immunity. It is the equivalent of “see something, say something” for an organization’s socio-technical systems.
Response, naturally, is the other side of the coin and it is also a reaction. Having detected something, steps must be taken to contain and remediate it. The reaction has shifted gears from determining that something is wrong to cleaning up the mess.
It is not that detection and response are ineffective or unnecessary; it is that they are most valuable in situations where there isn’t a deluge of events and information that need to be investigated. Imagine if a plurality of open windows or doors in the kinetic world needed to be treated as potential break-ins, rather than just the ones showing clear signs of forced entry. That’s often the situation in the cyber world because the fidelity of our alerts is so low. The problems this creates (alert fatigue, burnout, etc.) and their second-order effects, like unidentified and unaddressed incidents and breaches, are by now very well understood by cybersecurity practitioners and leaders alike.
But we do take preventative measures!
No doubt, there is a chorus of cybersecurity professionals saying that they have implemented all kinds of preventative controls. While that is inarguably true, the implementation of these controls reflects our interpretation of specific information at a specific point in time. That is, as much as we talk about the evolving threat landscape, much of our decision making is relatively static because reversing earlier decisions can be very difficult—particularly in the absence of a major precipitating event. Even in cases where new information is readily available, change is still hard. A lot of existing IT infrastructure is like asbestos: it was hailed as a miracle, widely used, and later found to be very dangerous especially under certain conditions. It’s been nearly 50 years since we learned about the risks of asbestos but we’re still in the process of remediating it since wholesale removal just isn’t feasible. Our IT infrastructure is much the same in this regard: installed before we fully understood the extent of the risks, still in use and providing value, and—because of that—very difficult to remove.
Moreover, a lot of what passes for prevention is a combination of “ticking boxes” and scanning. It is proactive, but its primary benefit is in demonstrating that we aren’t being grossly negligent. As with detection and response, it is not that taking these steps is ineffective or unnecessary; it is that they have practical limitations and a relatively short life span because, as we all know, the threat landscape is constantly evolving.
Prevention supports resilience
In looking at the role of prevention in resilience, it’s worth pivoting from the experiential definition of resilience to the physical definition: the ability of something to return to its original shape after some stressor forces it out of shape. This definition implies that the object in question has some resistance to becoming misshapen in the first place. If every force that acted on a thing could cause it to become deformed in some way, that thing could probably not be relied upon for much, even if it did possess the ability to return to its original form.
Prevention, then, supports resilience by reducing the number of stressors that can act on an object, by limiting the number and types of things that cause deformation, and by reducing the number of scenarios that require recovery. Effectively implementing a real prevention strategy in an IT environment makes it more resilient by conserving both human and technology resources. Instead of being kept on a hamster wheel of patching and alert triage, which amounts to constant firefighting, infrastructure and security teams can devote their time and energy to making real and meaningful long-term changes that support the organization as it adapts and evolves.
Conclusion: Finding equilibrium
If decades of work to improve cybersecurity have taught us anything, it’s that there is no one right way to keep our systems safe. The key is striking a balance and not over-correcting by letting the pendulum swing too far in any one direction—by not letting the shiny, new best practice clobber practices that still add value. In the end, we might consider a variation of the Serenity Prayer for cybersecurity practitioners:
Grant us the resilience to recover from the things we cannot prevent,
The people, processes, and technology to prevent the things we can,
And management that understands the difference.