When was the last time you deleted data? Even at home, where we have autonomy and authority over our own data, many of us are digital pack rats. But at work? Never! No one ever deletes anything! Let’s talk about why this is.
Retention vs. Deletion
Just about everything we do in IT infrastructure is focused on retention. We back up our data and implement other data protection tools like snapshots and mirrors. We might also archive data so that the General Counsel can place a legal hold on it and perform data discovery during litigation. And then there’s the whole field of data security, focused on locking people out of data, keeping it intact and un-viewed.
But what about deletion? Almost no effort is put into removing data, though the rapid growth of storage might lead one to think this would be a key area for IT. We certainly could put some effort into revision control, especially deleting drafts and outdated data. We could easily expire content that is no longer needed, if only we had some way to know which content that is. And we’ve talked a lot about secure deletion, even though we hardly ever actually perform that task except when moving to new physical storage hardware.
The greatest challenge for deletion is a simple question: What should we delete and when?
IT cannot answer these questions. They must be put to the business people who really own the data. Without permission and buy-in, IT is in serious legal peril when it comes to deleting data: any deletion must be in accordance with policy and must be legal, meaning that no legal or regulatory hold applies to the data. And most IT staff do not feel empowered to make that call!
Some Data Should Be Deleted
Certainly, not all data should be saved. There is “low-hanging fruit” in every storage estate that can and should be deleted:
- Ephemeral copies – Drafts, temporary data, working copies
- Time-limited projects – Third-party or client data, test and development
- Expired data – Data whose retention period has ended and that is under no legal hold
- Legally required – Data that isn’t yours, or that legal demands be deleted
These data sets are much easier to tackle than primary data stores, since they don’t require as much sifting and sorting: they can often be identified programmatically! If you have data sets like these, this is the ideal place to start a deletion effort.
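To make "identified programmatically" a little more concrete, here is a minimal sketch in Python. It assumes you already keep some kind of inventory that records a category, an expiry date and a legal hold flag for each data set. The `DataSet` fields, the category labels and the sample entries are all hypothetical, made up for illustration and not taken from any real product. Note that the function only nominates candidates; it deletes nothing.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DataSet:
    """One entry in a hypothetical storage inventory."""
    name: str
    category: str                # e.g. "ephemeral", "time_limited_project", "retained"
    expires_on: Optional[date]   # end of retention period or project; None if no policy is recorded
    legal_hold: bool = False


def deletion_candidates(data_sets: List[DataSet], today: date) -> List[DataSet]:
    """Return data sets that are past their expiry date and not under legal hold.

    This only nominates candidates for review; it does not delete anything.
    """
    candidates = []
    for ds in data_sets:
        if ds.legal_hold:
            continue  # held data is never nominated
        if ds.expires_on is None:
            continue  # no recorded policy means no basis for deletion
        if ds.expires_on <= today:
            candidates.append(ds)
    return candidates


# Made-up inventory for illustration only.
inventory = [
    DataSet("q3-report-draft-v2", "ephemeral", date(2011, 1, 31)),
    DataSet("client-acme-testdata", "time_limited_project", date(2011, 3, 1)),
    DataSet("hr-records-2004", "retained", date(2011, 2, 1), legal_hold=True),
    DataSet("finance-ledger", "retained", None),
    DataSet("dev-sandbox", "time_limited_project", date(2011, 12, 31)),
]

names = [ds.name for ds in deletion_candidates(inventory, today=date(2011, 4, 13))]
```

With this sample inventory, only the expired draft and the finished client project are nominated. The held HR records, the ledger with no recorded policy and the still-active sandbox are left alone. The hard part, of course, is having the expiry dates and hold flags in the first place.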
Delete on Demand
Regardless of the type, however, IT should not delete data without direction. It is perilous in today’s legal environment to destroy data without a policy directing that action. So we should continue to focus on retention for most data, while we work with legal to determine which data can be deleted and come up with a process for approval.
But it’s important to start offering a deletion-friendly environment for certain data types. Such a storage system would reduce the difficulties associated with data deletion. Really, only an integrated solution can truly delete data:
- It must maintain custody of data from start to end and not allow it to leak all over the organization
- It must be accessible, since any restrictions tempt users to create “working copies”, thus thwarting deletion
- It must be secure: data must always be encrypted so that no readable remnants are left on media
- It must be protected so data will not spread to external systems and sites
Data deletion is a real problem for most IT shops. I’m just getting my head around the ramifications, and continue to look for an ideal deletion-friendly storage solution.
If you’re interested in the topic of data deletion, I recommend joining me for a webinar on the subject on Wednesday, April 13. The webinar is sponsored by Nasuni: I will discuss the dilemma of deletion, and CEO Andres Rodriguez will weigh in on the capabilities of his cloud storage solution. Register now!
Note: Nasuni is sponsoring this webinar, but the content was created by me. This blog post is intended to engage my audience in discussion of the subject, and is not a paid promotion or advertisement.
Image credit: “Delete” by blmurch