How to Marie Kondo your data

By now you’ve heard about Marie Kondo, the author of the New York Times bestseller The Life-Changing Magic of Tidying Up, and star of Tidying Up, the new Netflix show that puts her principles of organization and decluttering into practice in family homes throughout Los Angeles.

While the #KonMariMethod has put households across America in an organizing frenzy, we found that her tidying principles can also be applied to solve a core challenge for the business world: too much data.

Businesses ingest enormous amounts of personal data every day. Sometimes this data is critical for business operations (e.g., user behavior), human resources (e.g., hours worked or pay accrued) or generating revenue (e.g., new users), but oftentimes, it’s not.

Chances are, there are countless data records stored in different internal databases or third-party systems that hold no business utility for your company. But unlike a drawer full of mismatched socks, excessive personal data carries liability and risk for businesses that continue to house it.

New data protection regulations, like the European Union’s General Data Protection Regulation (GDPR) and the upcoming California Consumer Privacy Act (CCPA), are introducing new standards for how companies process personal data. In many cases, businesses need explicit consent from users before collecting their personal data. Otherwise, the burden is on your business to prove that its interests override their data privacy rights.

What would Marie do? Minimize your data

The path to compliance with data protection laws always begins in the same place: data inventory and data minimization. We’ve adapted some of Marie Kondo’s principles to the process of organizing your company’s personal data.

Your goal with this exercise is to determine the data you need and delete the rest. The phrase “data minimization” appears throughout GDPR, and is a good practice for any business that aims for good data governance. By reducing your data stores to include only that which is essential, the risk of exposing sensitive data or missing an important data record when fulfilling a data request is dramatically reduced.

We begin by putting all of your data in one place.

If you’ve watched Tidying Up, you know the moment of reckoning with clutter starts early on, when her clients pile every piece of clothing they own on their bed and are forced to face the reality of their closets. Now imagine opening the doors to your systems and databases and piling all of that personal data in one place.

Each company must take the time to collect all the personal data in their data stores so they can begin sorting through the clutter. Until your company is able to truly visualize a data inventory, it is impossible to optimize data processing, which is fundamental to complying with data protection laws like GDPR.

Once the data is in one place and you’re ready to begin a data minimization exercise in earnest, you will sort by category, not by location. This may feel counterintuitive at first, because companies often think of their data in terms of the systems where records are stored. It may be tempting to start by purging extra data from one database at a time. But just as ancient tubes of Chapstick live on your desk and nightstand, businesses often store duplicate data records in different systems, because the information could be useful to different teams for different purposes. When personal data is duplicated and dispersed throughout a number of databases, the risk increases for your company. Aggregating those data records is complicated, but critical for data protection.

Start with one category at a time, and discard all at once. Until your company is able to look at data by category, it is impossible to truly see the scope of personal data and understand your risk profile. One or two tubes of Chapstick in different places around the house may seem reasonable, but it takes putting all your Chapstick in the same place to realize you have 15 tubes scattered throughout the house. Similarly, if your siloed teams check their respective databases and see roughly 30 expired credit card numbers in each, the scale of the problem is less apparent than when you see 210 expired credit card numbers are stored across all the databases.
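
To make the credit card example concrete, here is a minimal Python sketch of counting records by category instead of by location. The system names, category labels and counts are hypothetical placeholders; in practice the numbers would come from your own data inventory.

```python
from collections import Counter


def count_by_category(inventory):
    """Sum record counts per data category across every system."""
    totals = Counter()
    for system_counts in inventory.values():
        # Counter.update adds the counts from each system to the running totals.
        totals.update(system_counts)
    return totals


# Hypothetical inventory: seven siloed systems, each holding about
# 30 expired credit card numbers.
systems = ["billing", "crm", "support", "analytics",
           "marketing", "warehouse", "backups"]
inventory = {name: {"expired_credit_card": 30} for name in systems}
inventory["crm"]["email_address"] = 1200  # one more illustrative category

totals = count_by_category(inventory)
print(totals["expired_credit_card"])  # 210
```

Each team looking at its own system sees 30 records. The category view shows all 210 at once, which is the number that matters when deciding what to discard.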

The impulse to begin a data minimization project by removing different categories of data from one database at a time is instinctual, but it often obscures the scale of clutter and in the end is a circular endeavour. Chances are high that the same categories of data live in multiple databases, and you’ll be forced to revisit the same location countless times trying to delete different categories of data later on.

Do I need this data?

In Tidying Up, you evaluate each individual piece of clothing, piece by piece. If the item “sparks joy,” it can stay. If it does not spark joy, it goes. Looking at each data record individually is unrealistic, but the spirit is the same. Focus on the data that your business needs to keep, then delete or anonymize the rest. A good place to start is the law itself: audit the data you collect, and if you can’t justify keeping a category, you have an obligation to delete that data. For the data you do keep, make sure that it was collected with explicit consent and is compliant with your regulatory obligations. When reviewing data records with your team, ask, “Do we need this data?” If not, remove it.

Ideally, the big push for organizing your company’s personal data stores only happens once. In order to maintain data stores that are organized and compliant, establish transparent data collection policies with the public, and clear data retention policies internally.

Define rules around what categories of data are collected and stored, and for how long the data is stored. No personal data record should be stored indefinitely.
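
As a rough illustration, a retention rule can be written down as a table of categories and time limits, then checked against each record. The sketch below is in Python, and the categories and periods are made-up assumptions, not legal guidance. It treats any category without a rule as expired, so that nothing is kept indefinitely by default.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data category; real values depend on
# your legal and business requirements.
RETENTION = {
    "support_ticket": timedelta(days=365),
    "payment_card": timedelta(days=90),
}


def is_expired(category, collected_on, today):
    """Return True when a record has outlived its retention period.

    Categories with no defined rule count as expired, so they get
    flagged for review or deletion rather than stored forever.
    """
    limit = RETENTION.get(category)
    if limit is None:
        return True
    return today - collected_on > limit


today = date(2019, 6, 1)
print(is_expired("payment_card", date(2019, 1, 1), today))    # True
print(is_expired("support_ticket", date(2019, 1, 1), today))  # False
```

Running a check like this on a schedule keeps the initial cleanup from turning into a recurring project.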

The stakes are high

Processing personal data is riskier than ever now that GDPR has come into effect. France fined Google $57 million for violating GDPR, and many other tech giants face similar complaints. Companies like Google and Amazon will survive the steep fines levied by authorities, but growth-stage enterprises might not. The authorities are not going to let violations slide, making the risk of a penalty very real. Tidying up your data is essential to compliance and the health of your business and ought to be a top priority for your business this year.