Is differential privacy the ideal privacy-enhancing computation technique for your business?
As security & risk management (SRM) leaders globally adjust to a “new normal” brought about by the COVID-19 pandemic, businesses must adapt their privacy programs for better scale and performance on ever-tighter budgets. While juggling these competing constraints, one thing businesses cannot compromise on is the reputational risk posed by privacy incidents or data breaches.
All digital operations involving personally identifiable information (PII) must have plans in place to address two demands: customers’ expectation that their data is safe, and the increasing breadth and depth of privacy regulations globally.
Gartner predicts that, by 2023, organizations embedding privacy user experience into the customer experience will enjoy far more trustworthiness and up to 20% more digital revenue than those that don’t. It’s simple: Customers who trust more spend more.
Businesses that want to achieve such lofty goals must look at both internal and external data sources to merge and mine them for insights. All of this requires sensitive personal data for data monetization, fraud analytics, and business intelligence purposes. Let’s explore some of the challenges and opportunities of integrating privacy-enhancing computation capabilities into operations, with special attention on differential privacy.
What is privacy-enhancing computation?
Privacy-enhancing computation is a new breed of technology that enables businesses to process, analyze, and share data without having to expose the underlying data (i.e., proprietary information/sensitive data) or related algorithms. These techniques come in handy when third-party service providers can add value to the business operations, but the organization is not willing to share access to in-house data.
Gartner considers privacy-enhancing computation to be one of the top strategic technology trends for 2021. It predicts that, by 2025, 50% of large organizations will adopt it for processing data in untrusted environments and multi-party data analytics use cases.
The implementation of privacy-enhancing computation techniques may take a combination of the following approaches:
- Perform analysis on local data without disclosing its actual location
- Provide a trusted environment to perform analysis by external partners
- Transform data before any analysis is performed on it
Collating and storing large volumes of data is essential for developing solutions at scale. However, doing so creates two challenges for enterprises:
- Stored data is vulnerable to data breaches
- The data is often shared across untrusted environments (such as third-party cloud partners) or used for multi-party analytics solutions
Today, a wide range of new privacy-enhancing computation techniques are emerging. Many of these are being developed and tested specifically by large enterprises. However, due to increasing legal liabilities and a greater need to build customer trust, organizations do not want to share their information outside their internal operations.
This creates a few additional problems. First, businesses miss out on opportunities to monetize the large trove of value locked in data. Second, businesses cannot benefit from innovation happening outside the organization that can pay rich dividends if integrated into their operations. And lastly, it’s unclear who owns the intellectual property rights for the process. With ambiguous norms, there are no clear legal frameworks for data sharing today.
Still, innovation is working hard to eradicate these problems. The following techniques are commercially available for deployment:
- Differential privacy
- Homomorphic encryption
- Secure multi-party computation
- Zero-knowledge proof
- Private set intersection
Differential privacy is gaining considerable traction across industries to solve the data sharing or intellectual property protection problems. It involves sharing information about a dataset while withholding or distorting certain information about individuals or features in the dataset. The system relies on mathematical algorithms to insert noise into the dataset while ensuring the resulting analysis does not significantly deviate because of the noise insertion.
Even if hackers gain access to the noise-protected data, differential privacy makes it very difficult for them to reverse engineer individual data elements and tie them to specific people. It minimizes the risk of individual data compromise even when the protected dataset itself is exposed.
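To make the noise-insertion idea concrete, here is a minimal sketch of one common way to do it, the Laplace mechanism applied to a simple counting query. It is an illustration under stated assumptions, not a production implementation: the function names (`laplace_noise`, `private_count`), the privacy parameter `epsilon`, and the sample ages are all hypothetical and are not drawn from any specific product mentioned in this article.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent exponential draws, each with
    # mean `scale`, follows a Laplace distribution centered at zero.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Assumption: adding or removing one person changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon. A smaller epsilon
    means more noise and stronger privacy, at the cost of accuracy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    ages = [23, 35, 41, 29, 62, 57, 33, 48]  # made-up illustrative data
    noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
    print(f"true count: 4, noisy count: {noisy:.2f}")
```

The published figure stays close to the true count on average, yet no single release reveals whether any one person is in the data. A real deployment would rely on a vetted differential privacy library and a cryptographically secure noise source rather than Python's general-purpose `random` module.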
Who should consider differential privacy?
B2C and B2B organizations increasingly recognize the value in the volumes of information stashed across enterprise systems. The same value makes them a target for hackers and a focus of regulatory oversight: The monetary liabilities and regulatory scrutiny that follow the leak of any such data are significant.
Companies holding sensitive data with PII should explore differential privacy systems to reduce any potential impact when such sensitive data gets exposed. Any business function dealing with real-time high-performance data processing that relies on high levels of AI model accuracy should consider differential privacy along with other privacy-preserving technologies.
Still, implementing these methods requires a lot of data and sustained investment. Given the nascent stage of these technologies, their long-term viability needs to be assessed continually. Scaling the deployment of these solutions requires education and strategic planning across functions within the organization. Moreover, businesses should anticipate hacking attempts against these implementations. This means continuously monitoring their performance and periodically fine-tuning them as the nature of the data changes.
Examples of differential privacy in action
Differential privacy has been gaining traction in tech and pharmaceutical companies, as well as governmental entities:
1. The US Census Bureau uses differential privacy before publishing population reports. The organization collects a ton of PII as part of the once-a-decade exercise. Federal and state governments rely on the data collected to plan their budgets and programs. The use of differential privacy allows the US Census Bureau to share contextual information, at granular levels, without compromising the privacy of citizens.
2. Google used differential privacy when it published its mobility reports visualizing movement patterns in different geographies during the COVID-19 crisis. When the debate raged around lockdowns globally, Google used the data captured on Android devices in cities like New York to monitor people’s movement. When this information was made public, the company made efforts not to share individual, device-level data. Differential privacy ensured that people’s movement was captured accurately without disclosing who was being tracked.
3. Pharma manufacturing companies collect a wealth of information as part of clinical studies, including PII on participants in drug trials and health indicators over time. Such information is crucial for demonstrating the efficacy of the drug being developed. As this information must be shared with regulators like the FDA or EMA, techniques like differential privacy are adopted to reduce the risk of re-identifying an individual by any third party.
In conjunction with other privacy-enhancing technologies, differential privacy can enable businesses across highly regulated industries to safely tap into their most sensitive data. As the potential impact of this system is high, the upfront investment needed to incorporate best practices should not be left only to businesses with deeper pockets.