The promise of the Big Data revolution is all around us: improving our quality of life and providing us with new insights into culture, science, economics, and healthcare. We can now use data to help us in diverse ways, including better predicting and preparing for the spread of infectious disease, saving premature babies by fine-tuning infant care, empowering consumers with a more transparent marketplace for airfare, and introducing the Moneyball theory to professional baseball.
Unlike traditional data collection, which uses statistical sampling to make predictions about the whole, Big Data promises that you can now go beyond a small sample. Historically, analysis was limited to testing a hypothesis that was defined well before data collection occurred. That hypothesis determined what data should be collected and how much.
We assumed we knew the main question we were trying to answer, but assumptions can be wrong. Now, with the evolution of data analysis technologies, you can let the data itself show correlations and answer questions you didn’t have the foresight to ask.
In recent years, the cost of data collection and retention has fallen dramatically. In fact, the cost of one gigabyte of data storage decreased from $10,000 in 1990 to $0.10 in 2010. Economic and technical innovations have removed the barrier to mass data collection, and societies are creating digital information in record amounts. In 2013, the US Executive Office estimated totals reaching four zettabytes. With that in mind, why shouldn’t a business archive all of its valuable data and store it indefinitely?
The dark side of Big Data
The problem with collecting and storing massive amounts of data is that it not only has inherent value but also comes with tremendous liability. Current policies established by in-house general counsel generally keep data only as long as required by law. Therefore, in highly regulated industries, data is purged as soon as possible, preventing it from being used as part of a legal inquiry. In other, non-regulated industries, many organizations have not revisited retention policies for decades and continue to handle data without considering its unseen value.
Then there are businesses that provide goods and services in order to collect data as a commodity. Generally, any company offering consumers anything for free falls into this group, which includes data brokers such as Experian, TransUnion, and Acxiom, as well as search and social tool companies like Facebook and Google. These companies take a data-centric approach to their business and treat data as more valuable than anything else. Data retention policies at these organizations are dictated by an asset model of data, pushing retention periods well beyond historic norms. This can make these companies more vulnerable to public scrutiny, regulation, and legal action.
Ever-changing privacy enforcement
The advent of Big Data is also changing the rules around privacy protection. As more privacy violations occur, the legal system tries to keep up by enacting new laws to plug the security holes that Big Data exposes.
In 2013, the Massachusetts Supreme Judicial Court ruled that requiring a customer to provide a ZIP code for a retail transaction is an unfair trade practice. The ruling categorized ZIP codes and geolocation data as Personally Identifiable Information (PII) and affected the way companies handle this type of data.
Laws and best practices for handling protected data will continue to evolve, so making decisions around how an organization should generate and enforce policies will remain a challenge for years to come. When designing policies, organizations that want to take advantage of the Big Data revolution will need to consider that the rules will change after the data is collected and must find ways to design programs accordingly.
Balancing value and risk
Organizations are faced with the unanswered challenges of handling Big Data. There is no question that massive data sets are valuable to a business and, consequently, could be valuable to society as a whole. Companies must balance the value and risk of any information they collect. There are a number of precautions organizations can take, including:
Know your data: The first thing an institution can do to become responsible for its data is to understand what it is and where it is stored. Surprisingly, many organizations would love to get the benefits of Big Data but still have trouble collecting, categorizing, and tagging their data in a meaningful way that would allow them to take basic precautions. In many cases, this is because the organization does not have an adequate policy that determines what it wants to achieve with the data it collects. In other situations, the business hasn’t yet recognized the value of collected data, so it hasn’t funded the initiatives necessary to truly understand the data it possesses.
Disclosure: Any institution should make sure it is disclosing the purpose of the data collection to the person or group from whom it is collecting the data. In many cases of data breaches that led to fines from regulatory authorities, the offending organization was neither transparent nor forthright about its intentions. With the prevalence of social media and crowdsourcing, institutions would be surprised at how willing customers are to share their data. Making sure those participants know what the data will be used for is critical to avoiding a reputational snafu.
Consent: Since Big Data promises to find correlations within data that cannot be predicted in advance, consent on how the data will be used becomes tricky. Organizations should look for ways to obtain consent that broadly covers data usage across both time and purpose. This can be difficult when following consent guidelines stated in current laws, but if it is done in coordination with disclosure and thorough opt-out policies, companies should be able to create ethical models that fit within the guidelines.
In the near future, we will see the promise and consequences of this age of Big Data: from monumental advancements in understanding our world to privacy breaches when data isn’t managed ethically or securely. Does this mean your business should stick its head in the sand and avoid Big Data? Absolutely not. By taking measures to account for your data, disclosing how you intend to use it, and obtaining the proper consent, your business can protect itself and consumers alike from data breaches.