Tom Gorup, VP, Security and Support Operations, Alert Logic

April 22, 2020

Five contingency best practices for SOCs to handle uncertainty

With a crush of new teleworkers and a significant increase in endpoints coming online, we’ve entered into a new reality. COVID-19 has disrupted our lives and the business world – possibly for longer than we’d planned. Once the pandemic ends, companies may take six months to get up and running normally, according to a CNBC Global CFO Council survey.


The “new reality” extends to security operations centers (SOCs). SOCs are familiar with natural disasters and severe weather such as floods, tornadoes and even ice storms, and it’s critical to keep a SOC operational when local staffing or access to physical infrastructure is reduced.

SOCs operate as busy, open-office environments with team members working closely together to monitor and mitigate threats. Even with so many employees working remotely, you want to find a way to continue to facilitate those impromptu exchanges, during which newly discovered problems are discussed and often resolved.

The loss of available personnel (due to illness or communications outages) and solutions/resources (due to disruptions) is something you want to plan for if you haven’t already. If you’re a CISO or other manager who oversees SOCs, you need to adjust to these times and others you’ll face in the future with a risk-based assessment of your people and resources.

You need to determine what would change should some percentage of them become unavailable, how this would impact operations/business obligations, and how to respond to reduce negative outcomes. In pursuing such an assessment and other proactive contingency planning, here are five best practices to consider.

Implement a follow-the-sun strategy

Establishing SOC operations and personnel in dispersed geographic regions reduces the pressures that would come with operating with a skeleton staff and lessens the chance of major impact. When one location experiences pressure due to disaster, weather or another circumstance, the other locations can step up to ensure SOC functions are not interrupted.

Prioritize your resources

It’s important to identify the top resources for the SOC: the VPN, ticketing systems, cloud infrastructure assets, etc. Then, you want to determine which capabilities you would lose if those assets went down, and how this would impact service-level agreements (SLAs) and additional business-critical functions.

Your risk-reduction strategy should ensure that “minimum acceptable” business disruption is the worst-case possibility, no matter which technologies are affected and how severely they are damaged. From there, you build up scenarios to depict what business operations will look like in going from “minimum acceptable” with a significant number of resources down, to increasingly productive cases in which you have more resources up and running.
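One way to make these scenarios concrete is to write down which capabilities depend on which resources, then check what survives a given outage. The following is a minimal sketch, not a definitive model: the resource names, the capability names and the choice of what counts as “minimum acceptable” are all hypothetical placeholders that you would replace with your own inventory.

```python
# Hypothetical inventory: each SOC capability and the resources it depends on.
CAPABILITY_DEPENDENCIES = {
    "remote analyst access": {"vpn"},
    "incident tracking": {"ticketing"},
    "alert triage": {"vpn", "cloud_siem"},
    "customer notification": {"ticketing", "phone"},
}

# Assumed definition of "minimum acceptable" operations for this example.
MINIMUM_ACCEPTABLE = {"alert triage", "incident tracking"}


def surviving_capabilities(down_resources):
    """Return the capabilities whose dependencies are all still up."""
    down = set(down_resources)
    return {
        capability
        for capability, deps in CAPABILITY_DEPENDENCIES.items()
        if not deps & down  # no dependency is in the outage set
    }


def meets_minimum(down_resources):
    """True if every minimum-acceptable capability survives the outage."""
    return MINIMUM_ACCEPTABLE <= surviving_capabilities(down_resources)


if __name__ == "__main__":
    # Walk from a severe outage to a mild one.
    for outage in [{"vpn", "phone"}, {"phone"}, set()]:
        print(sorted(outage), "->", meets_minimum(outage))
```

Running the scenarios from worst to best gives a quick view of which outages push you below the minimum acceptable level and which merely reduce productivity.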

Then, you should think about your connectivity back-up plan. What would happen if your chat functionality went down? What if your phone system was no longer available? How does your SOC team react in these situations to enable business to continue?

A sound game plan begins with multiple fallback options for every form of communication that your team relies upon. If you’re only using a single VoIP solution for phone and video conferencing, for example, then make sure your employees can quickly switch to a secondary messaging solution if phone/video conferencing services go down.

Having multiple licenses for multiple forms of communication increases the likelihood that an “impact” doesn’t shut everything down. Take a look at the breadth of tools available to you today; more often than not, you will find additional solutions to support your business continuity plan (BCP).
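The fallback idea can be written down as an ordered list of channels per form of communication, with the team using the first one that is still working. This is only an illustrative sketch: the channel names are made up, and the `is_up` check is a stand-in for whatever health check or manual status board you actually use.

```python
# Hypothetical fallback order for each form of communication.
FALLBACKS = {
    "voice": ["voip", "mobile_phone", "chat_call"],
    "chat": ["primary_chat", "secondary_chat", "email"],
}


def pick_channel(form, is_up):
    """Return the first working channel for a form of communication.

    `is_up` is a caller-supplied function that takes a channel name and
    returns True if that channel is currently usable.
    Returns None if every option is down or the form is unknown.
    """
    for channel in FALLBACKS.get(form, []):
        if is_up(channel):
            return channel
    return None


if __name__ == "__main__":
    outage = {"voip", "primary_chat"}  # assumed outage for the example
    for form in FALLBACKS:
        print(form, "->", pick_channel(form, lambda c: c not in outage))
```

Keeping the order explicit means nobody has to decide during an outage which tool the team should move to next.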

Don’t neglect the “people” part of the picture

It’s not all about tech – employees are a crucial resource as well. As indicated, you will face the realities of sickness, a distributed workforce and potential internet/communications outages during a pandemic, natural disaster or severe weather event.

As part of your risk assessment, ask yourself: “What is the minimum staffing I need to still deliver meaningful support for business units and keep incident response times down?”

Again, while you may still see decreases in business functionality and response capabilities, you can determine what the minimum acceptable levels of these are. You can then map out what your team performance and priorities will look like with varying numbers of absent staff, and estimate whether you’ll meet (and ideally exceed) the minimum acceptable levels in each scenario.
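The staffing scenarios can be tabulated with a few lines of code. The sketch below assumes a single pool of interchangeable analysts, which is a simplification; the headcount and minimum figures are invented for illustration, and a real plan would also account for shifts, roles and skill levels.

```python
def staffing_scenarios(total_staff, minimum_staff, absent_counts):
    """For each number of absent staff, report who is left and whether
    the assumed minimum acceptable staffing level is still met.

    Returns a list of (absent, available, meets_minimum) tuples.
    """
    rows = []
    for absent in absent_counts:
        available = max(total_staff - absent, 0)
        rows.append((absent, available, available >= minimum_staff))
    return rows


if __name__ == "__main__":
    # Hypothetical team of 20 analysts that needs at least 12 on hand.
    for absent, available, ok in staffing_scenarios(20, 12, [2, 5, 10]):
        status = "meets minimum" if ok else "below minimum"
        print(f"{absent} absent -> {available} available ({status})")
```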

Keep a watchful eye

Once you have mapped your tech resources and people, you should invest in monitoring tools that track your staffers and solutions, so that you know where all of your single points of failure are and how these failures could affect business-critical functions.
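Finding single points of failure in such a map can be as simple as listing every function that has only one person or tool able to deliver it. The example below is a rough sketch under that assumption; the function names and providers are hypothetical and stand in for your own mapping.

```python
# Hypothetical map: business-critical function -> people or tools that can
# each deliver it independently.
PROVIDERS = {
    "log ingestion": ["cloud_siem"],
    "analyst chat": ["primary_chat", "secondary_chat"],
    "escalation approval": ["shift_lead_a"],
}


def single_points_of_failure(providers):
    """Return {provider: [functions]} for functions with exactly one provider."""
    spof = {}
    for function, options in providers.items():
        unique_options = set(options)
        if len(unique_options) == 1:
            only_option = next(iter(unique_options))
            spof.setdefault(only_option, []).append(function)
    return spof


if __name__ == "__main__":
    for provider, functions in single_points_of_failure(PROVIDERS).items():
        print(f"{provider} is the only option for: {', '.join(functions)}")
```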

Organizations should re-evaluate their managed detection and response (MDR) capabilities and assess new providers if there are obvious gaps that need to be addressed quickly. Again, as part of a risk-based assessment, you are monitoring to get a better sense of what you are obligated to do, to track the personnel and tools you require to do it, and to respond effectively if you no longer have certain employees and/or tools in place (either temporarily or for an extended period).

Take it to the cloud

The more you invest in cloud-based tools for your SOC, the better prepared you’ll be for COVID-19 and any other health or disaster-related event that threatens to disrupt your operations. That’s because the cloud is not confined to a specific physical location.

Fortunately, nearly all organizations are looking to make these investments: 97 percent plan to either move “some or all” of their existing SOC analytics infrastructure to the cloud, replace on-premises security analytics solutions with native cloud-based alternatives, or supplement on-premises analytics tech with additional cloud-based capabilities, according to research from the Enterprise Strategy Group.

We have never been through anything like COVID-19 and, hopefully, we never will again. But there will always be hurricanes, tornadoes, ice storms, earthquakes and wildfires. Cyber attackers won’t “stand down” during these times. In fact, they’ll likely seek to exploit the opportunity.

That’s why CISOs and SOC managers must incorporate risk assessment and “what if?” planning into their entire business-supporting ecosystem – both people and “parts” – to keep everything running. With this, they’ll prepare themselves for anything that comes their way, regardless of the nature of the disaster.
