Zoom in crisis: How to respond to and manage product security incidents
Zoom is in crisis mode, facing grave and very public concerns about trust in management’s commitment to secure products, respect for user privacy, the honesty of its marketing, and design decisions that preserve a positive user experience. Managing the crisis will be a major factor in determining Zoom’s future.
The company has recently skyrocketed to new heights and plummeted to new lows. It is one of the few communications applications that is perfectly suited to a world beset by quarantine actions, yet it has fallen from grace because of security, privacy, and transparency failings. Governments, major companies, and throngs of users have either publicly criticized or completely abandoned the product.
No company wants to be in this position: forced to deal with its mistakes publicly at a time when it is experiencing unimaginable growth. Zoom is struggling to stay relevant, fend off competition, and emerge intact.
Knowing how to respond to and manage product security incidents is becoming more important for digital companies. Zoom is an excellent test-case for exploring the lessons of crisis management. These lessons are valuable to every product and service organization that could face a loss of customer confidence. It would be wise for business leadership in every industry to take an introspective look and understand how they can effectively respond during such a crisis. Preparation provides an advantage and gives insights that may help avoid catastrophe.
Cybersecurity is the discipline of managing the risks to security, privacy, and safety. It does not eliminate them, but rather seeks to find an optimal balance between the risks, costs, and usability. That means there will always be a chance of undesired impacts. If managed properly from the outset, those residual risks can be handled in ways that reduce the negative effects.
Crisis response is a specialty that benefits from forethought, experience, leadership, and skills.
I have led crisis response teams over the years and been fortunate to be part of strong teams that handled events with speed, efficiency, and professionalism. I have also witnessed complete train-wrecks where the wrong people were attempting to lead, focus was misplaced, valuable time and resources were squandered, legal instruments were applied to hide the truth, communication was confusing, and feeble attempts to leverage marketing to “spin messages” were preferred over actually addressing issues head-on. Poor leadership is caustic and can result in more problems and a prolonged recovery.
Crisis response is a complex dance. It requires a clearly defined objective to pursue and an understanding of the opposition, obstacles, and resources. Executive support is required, but not necessarily welcome in all decisions. Time is a crucial resource, as is the morale and commitment of employees. It is normally a thankless job, as the best-case scenario is that the situation is resolved and quickly fades from memory.
But enough with the platitudes. Let’s dive into some specifics with an interesting use-case which is currently unfolding.
Zoom crisis: The test-case
Zoom has a number of technical, behavioral, and process issues to address in order to dig itself out of the hole in which it currently finds itself. The goal of its response should be to restore confidence in Zoom’s products and organization. To do this, the company must evolve to proactively manage the risks of product vulnerabilities, avoid design decisions that weaken privacy and allow for abuse, and foster trust by being accurate and transparent with users, regulators, and stockholders. Every crisis that is comprehensively managed is painful, as it requires accountability, commitment, and disruptive change.
Let’s go down the list of challenges and best-practices.
First and foremost, it takes executive management support for an organization to rally together to address a significant crisis. Time, resources, and even goodwill must be applied from across the company. There are opportunities that must be sacrificed and trade-offs made. Fortunately, Zoom’s CEO has aggressively come forward to recognize the issues, personally taken responsibility, and committed to restoring trust.
Although falling on one’s sword is not necessary for a CEO, it does eliminate much of the wasted time normally allotted to the blame-game, finding a scapegoat, or being lured by the attractiveness of trying to use marketing tricks to spin or change the narrative. Quickly and openly taking responsibility for shortcomings is a shortcut to align focus toward resolution and shows seriousness in ensuring processes will be in place to protect from future issues.
A strong and capable leader is required to oversee a crisis. It is a specific discipline and not one recommended to be led by the inexperienced. Assigning the wrong person to lead a crisis is the single greatest mistake I have seen in the past.
Marketing and legal people should be part of the team but never lead the crisis response. They look at crisis events through the lens of what they know and the capabilities they can bring into play. They immediately move to conceal, deny, ignore, find blame elsewhere, or focus on spinning the media messages rather than addressing the root problems. This can work to distract for a time and delay some pain, but is not the best path to an expedited, comprehensive, and sustainable solution. In fact, their actions can cause considerable deterioration of the already weakened trust by consumers.
CEOs should initiate, support, define the goals, approve major changes, deliver sweeping announcements, and identify a crisis leader, but not take charge. Again, a specific set of skills is required. Can a CEO get the job done? Potentially, but as most executives are not savvy in this area it would be a major struggle; they need to leave it to professionals. A good crisis leader will work closely with the C-suite every step of the way and make sure the right path is illuminated and understood so management can confidently support progress forward.
Although it may seem counterintuitive, engineering should not lead either. Engineers are an integral part of the resolution for design and coding issues, but they should not be leading. They know the technical aspects of the product or service and will be the mighty tool to fix many of the vulnerabilities. However, what Zoom and most other companies face in situations like this is a combination of technical, behavioral, and process issues. Looking solely through the goggles of an engineer, one sees only part of the problem set and mistakes it for the entire picture.
An experienced crisis manager who understands risks will develop more comprehensive plans that build the long-term capabilities to prevent recurrence and support the short-term acts necessary to restore trust. They will give engineers and developers a prioritized list of technical issues to resolve, in concert with the other efforts necessary to achieve the overall objectives.
Identifying and addressing the root cause is crucial. Analysis will provide insights into what problems have arisen and also highlight what may be next. If the origins are unknown then the chances of another crisis remain high. Proper crisis response is not just about putting out the immediate fire, but also making sure that when things are rebuilt, they aren’t vulnerable to the same issues.
For Zoom the likely root cause was an overprioritization of rapid Go-to-Market efforts, which fueled a de-prioritization of product security, along with overzealous marketing that didn’t put enough weight on being clear and truthful about privacy and security. This means there are probably many other vulnerabilities lurking in the product, some sensitive customer data may have been gathered at some point, inaccurate marketing materials may be floating about, and the developers are likely not savvy when it comes to security and privacy as part of the Development and Operations (DevOps) lifecycle. The good news is that all these issues can be addressed and, if done correctly, will result in the organization and products becoming stronger and more competitive.
Stop the bleeding. Aligning resources and resolving the most relevant immediate issues of the customers is the top priority. The first step is to freeze all work on new features and reallocate those technical folks to understand and address the known vulnerabilities. This requires time and engineering resources across development, testing, and validation domains. As part of this effort, the underlying configuration issues causing severe user-experience friction (e.g., Zoombombing, session hijacking) or regulatory non-compliance (e.g., privacy) must also be resolved.
In parallel, work must be initiated to address what is not yet publicly known, which may well erupt and significantly add to the chaos. What other related issues exist that may have been ignored? With a root cause being people choosing not to invest in security, there are likely advocates in the organization who have been trying to raise issues. It is time for their vindication. Their insights, reports, and advocacy can point to other areas requiring immediate attention.
Setting clear and realistic expectations with customers is very important, as these steps can take some time to complete and may need to be done in stages. This is not the time for marketing spin. Honesty and transparency, mixed with a touch of humility, and presented in a professional manner will lay a foundation for trust. Select executives must be prepared to engage with the customers, resellers, suppliers, vendors, etc. in an open, consistent, and well-informed way. It is okay to not have all the answers and instead communicate how the organization will get there.
For Zoom I would recommend the following:
- Scan the corporate, vendor, and partner environments for customer data that falls outside of the policy and move to delete. If required by law, notify users.
- Proactively engage privacy regulators and customers to outline what steps are being taken to respect their privacy, both in the short and long term, and the processes that will be instituted to provide transparent oversight for their benefit.
- Conduct a vulnerability scan of code, dependencies, and libraries. Professional tools and services should be used. Do not rely upon the knowledge base of the developers. Resolve or mitigate the detected issues and be prepared to provide the audit and supporting proof.
- As part of a security assessment, form an internal blue team to identify technical, configuration, and usage issues that could undermine security, privacy, and trust. This should be a cross-discipline team, not just engineers. Pull from marketing, management, sales, etc. to get the widest possible perspectives. This activity can happen quickly and provide important user-facing issues.
- For a deep-dive assessment, a professional external red team is required. Hire a reputable team and make it a priority for in-house product engineering to help the red team begin their work. This takes time but will find a much more in-depth set of vulnerabilities. No product team initially likes this process, but they will come to respect it and become better engineers because of it.
- Adopt an industry-proven end-to-end encryption technology. For Zoom this is foundational to the restoration of trust and continued patronage by security-conscious customers. Encryption is not easy. Seldom does a product organization get it right and even getting part of it wrong undermines the whole structure. Do NOT attempt to build or configure this internally. Trust factors are at play here. There are solutions in the industry that are vetted and solid for comprehensive and sustainable data security across untrusted networks and devices. Implement one and be prepared to announce what is being adopted. Good encryption does not require algorithm or configuration secrecy. There will be questions, many of which will need to go to that vendor, so choose wisely.
- Ensure all code changes go through rigorous tests and validation before being rushed into a patch. A poor update can cause major outages, unanticipated issues, and be the cause of even more problems. Now is not the time to take shortcuts. Move as quickly as possible, but adhere to quality control standards.
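To make the dependency scan in the list above more concrete, here is a minimal sketch in Python of the core matching step such tools perform: comparing installed dependency versions against advisories and ordering the findings by severity so engineers receive a prioritized list. It is an illustration only and no substitute for the professional tools and services recommended above. The package names, advisory identifiers, and the `Advisory` structure are all hypothetical, and versions are assumed to be simple dotted integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    advisory_id: str   # hypothetical identifier, not a real CVE
    package: str       # hypothetical package name
    fixed_in: tuple    # first version that contains the fix
    severity: str      # "critical", "high", "medium", or "low"


# Lower number means the finding should be worked on sooner.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def parse_version(text):
    """Turn a dotted version such as '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def find_vulnerable(dependencies, advisories):
    """Return (advisory, installed_version) pairs, most severe first."""
    findings = []
    for adv in advisories:
        installed = dependencies.get(adv.package)
        # A dependency is affected when it is older than the fixed version.
        if installed is not None and parse_version(installed) < adv.fixed_in:
            findings.append((adv, installed))
    findings.sort(key=lambda f: (SEVERITY_ORDER[f[0].severity], f[0].package))
    return findings


# Illustrative inventory and advisory feed (all values are made up).
dependencies = {
    "examplelib": "1.2.0",
    "samplecrypto": "3.4.1",
    "demoparser": "0.9.5",
}
advisories = [
    Advisory("ADV-0001", "examplelib", (1, 2, 3), "medium"),
    Advisory("ADV-0002", "samplecrypto", (3, 4, 0), "critical"),
    Advisory("ADV-0003", "demoparser", (1, 0, 0), "critical"),
]

findings = find_vulnerable(dependencies, advisories)
for adv, installed in findings:
    fixed = ".".join(str(n) for n in adv.fixed_in)
    print(f"{adv.severity:8} {adv.package} {installed} -> upgrade to {fixed} ({adv.advisory_id})")
```

In this made-up inventory, `samplecrypto` is already past its fixed version and drops out, leaving two findings with the critical one listed first. A real scanner also has to handle version ranges, transitive dependencies, and a continuously updated advisory feed, which is exactly why a professional tool is the right choice.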
Marketing will have the challenge of expressing the proactive changes without overselling the credibility. The Advisor role, DPO, and CISO must be competent, experienced, and willing to work with marketing to engage industry experts and the media in pragmatic ways but not contribute to unnecessary news cycles that prolong negative sentiment.
Zoom should adopt all the leadership recommendations, as they overlap and support each other. Understanding and accountability must originate from the top and be established for data privacy, infrastructure security, and the processes incorporated into product development.
In addition to a Security DevOps champion, products require intense and varied testing to detect vulnerabilities. Some of this can and should be done internally for known vulnerabilities, but a professional community is required for a deeper scan to detect unpublished weaknesses. The use of bug bounties, penetration testing, and red teams is an industry best practice. Vulnerability management is a continuous process that begins in development but must persist well after product release and throughout the lifecycle as new vulnerabilities are discovered. Processes must be put in place to adopt this new way of thinking and operating.
Product vulnerability lifecycle
Recommendations for Zoom to better manage their product vulnerability lifecycle:
- Work with an established bug bounty vendor to set up a continuous program, offering in aggregate ~$1 million in bounties. This economic incentive will draw a global community of security researchers and ethical hackers to thoroughly scrutinize your product in ways you cannot. They will provide you with the data before malicious hackers can take advantage. It is an incredibly powerful decentralized resource.
- Incorporate a code vulnerability scanner into the DevOps processes. Commercial tools and services are available that scan code or match third-party libraries and dependencies to known vulnerabilities. This becomes a learning tool for your developers as much as it is a security assurance control. DevOps will get better at security over time, making it less of a productivity sink while accelerating release times for secure products and features.
- Red teams and penetration testing services are expensive, but return a methodical set of results that provide very strong assurance. Incorporate such capabilities for major releases and to prove that critical security holes are actually patched.
- Blue teams are less expensive but still provide value that other controls may overlook. They will find many of the misconfigurations, misuses, and oddball feature settings that can cause user stress by undermining security and privacy. Incorporate a lightweight blue team review for every update that touches the user interface (UI) or any administration function.
- Establish a process for researchers to confidentially engage the product security team to disclose new vulnerabilities. Respect, recognize, and reward those who do.
- Make sure that, by design, the product can be effectively patched. It seems basic, but the details can be tricky. There should also be a way of verifying that the patch was successfully installed. Metrics for compliance are important, especially during crisis events, as they will be one of the determining factors for when the crisis can be closed.
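The patch compliance metric in the last item can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: clients are assumed to report their installed version as a dotted integer string, and the version numbers and the 95% closure threshold are hypothetical values chosen for the example, not figures from Zoom or any standard.

```python
def parse_version(text):
    """Turn a dotted version such as '2.0.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def patch_compliance(installed_versions, patched_version):
    """Return the fraction of reporting clients at or above the patched version."""
    if not installed_versions:
        return 0.0  # no reports yet, so nothing can be claimed as patched
    required = parse_version(patched_version)
    patched = sum(1 for v in installed_versions if parse_version(v) >= required)
    return patched / len(installed_versions)


# Hypothetical closure criterion: 95% of reporting clients are patched.
CLOSURE_THRESHOLD = 0.95

# Illustrative version reports from eight clients (made-up data).
reported = ["2.0.1", "2.0.1", "1.9.8", "2.1.0", "1.9.8", "2.0.1", "2.0.1", "2.0.0"]

rate = patch_compliance(reported, "2.0.1")
ready_to_close = rate >= CLOSURE_THRESHOLD
print(f"Patch compliance: {rate:.1%}, ready to close: {ready_to_close}")
```

The value of a metric like this lies less in the arithmetic than in the design requirement behind it: the product has to report its installed version reliably, or the crisis team has no evidence for declaring the incident closed.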
Incorporating these process enhancements will effectively establish an aggressive and proactive capability to find new vulnerabilities and maintain product security. Over time the organization’s capability to produce and sustain secure products will continuously improve. It can be a significant competitive advantage on several fronts.
Privacy and the protection of data are also important. They are a responsibility shared among data owners, the DPO, and the CISO. Process improvement and accountability are expected when crisis situations highlight a lack of confidence in the current system and controls. When trust has been undermined, an independent third party must conduct regular audits. These audits confirm compliance with the policies. They are valuable as a tool to strengthen customer confidence and for discussions with regulators. Zoom should establish a recurring audit along the lines of SAS 70 Type 2 (a standard since superseded by SOC reporting) for data acquisition, security, and sharing. For the greatest level of trust, craft the audits so the results can be made public every year.
Establishing a DPO, updating data policies, instituting proper governance and oversight, and acting with transparency with regards to the checks and balances will set the organization on an admirable path that will build credibility as an asset. Privacy and data security continue to grow as important aspects of business. Zoom has an opportunity to showcase respect and responsibility if they maneuver correctly to embrace industry best practices.
I have covered some of the fundamentals for product security crisis response and done a walkthrough of what I would do, beginning Day 1 of leading a crisis response for a Zoom-type incident.
This is just a taste and not a comprehensive compendium. Cybersecurity crisis management is very complex and difficult. Being in the jaws of hourly crisis meetings and making tough decisions about ambiguous situations is grueling work that I don’t wish upon anyone. But if done correctly it can move rapidly and deliver results that benefit users and strengthen the organization.
Responding well to a crisis can highlight the professional, ethical, and adaptive qualities of an organization’s leadership. Optimally, it will enhance customers’ trust in management’s commitment to secure products, respect for user privacy, honesty of its marketing, and designs that preserve a positive user experience. If done poorly, it becomes a protracted blight on an organization, its products, and leadership. Often careers and businesses don’t survive for long.
Zoom has numerous challenges to face. It has already done many things right (you can read the details in their blog and watch a video of CEO Eric Yuan openly discussing the issues and efforts), but it has a long way to go before it restores trust and makes its products secure. Every organization should take a moment to understand what Zoom is going through as a learning opportunity and introspectively explore how they want to avoid or address the risks. Confidence in products and an organization is at stake.