Why do we care about thin provisioning? Because storage is not getting cheaper. If you went to buy a disk ten years ago, you’re going to spend about the same as would today, but you’re going to get a lot more capacity – a lot more capacity! The fact that we have terrible utilization of enterprise resources is really not helping us, and it’s not getting any better. It hasn’t improved because they are “doing storage” the same way.
Thin provisioning doesn’t take on the cost of capacity, it actually attacks the overhead of inefficient provisioning. Not all of that overhead is inefficiency, and not all of that can be tackled with thin provisioning. But some of it can. It’s a lot more of the cost than can be tackled by moving to SATA, for example. That I really like.
One of the biggest problems for thin provisioning is not the provisioning part: It’s fairly easy for a storage array to allocate on request: “I need a block; here’s some data I want you to write.” And the storage array just starts allocating, and allocating. But, the operating system never goes back and says “I don’t need that block anymore.”
What happens in the telephone game is that a little bit of information gets lost at each step along the path, and at the end of the chain you’ve basically lost all the information. And this happens all the time in computers, especially in data storage. Thin reclamation is the core technical challenge to thin provisioning, and the telephone game is the reason.
I began by introducing the core problem: Storage isn’t getting any cheaper due to storage utilization and provisioning problems. Thin provisioning isn’t all it’s cracked up to be, since the telephone game makes de-allocation a challenge. So now let’s talk about how to make thin provisioning actually work.
On the storage side, arrays can only use the information they have to deallocate: The data that’s stored on them. They don’t know what application is using it, what file system it is. But, somewhere along the line, someone had a big idea and said, “wait a second, what if we look for pages that are all zeros?” We’ll talk about pages a bit later, but for now, let’s talk about zeros. A zero is kind of a smoke signal coming up from over the hills that says, “there’s nothing valuable here.”
One of the sticky wickets that holds back thin provisioning is the need to communicate when capacity is no longer needed. Enterprise storage arrays can reclaim zeroed pages, but writing all those zeros can really fill up an I/O queue. This is where WRITE_SAME comes into the picture.
Thin provisioning needs communication to function, and zero page reclaim is only the array side of the story. WRITE_SAME helps reduce I/O load, but the server needs to use it. Wouldn’t it be nice if the operating system, file system, or volume manager would use these commands to help recover capacity?
If WRITE_SAME can be a semaphore for thin un-provisioning, what about TRIM? It sounds like a perfect fit, and has wider implementation to boot! Let’s take a deeper look.
Although I consider it the main stumbling block for thin provisioning, communication (or lack thereof) is being addressed with metadata monitoring, WRITE_SAME, the Veritas Thin API, and other ideas. But communication isn’t the only issue. Let’s talk about page sizes. You’ll often see vendors tossing this “softball” objection at their competitors, claiming that their (smaller) page size makes for more-effective thin provisioning. And that’s true, to a some extent.