One of the topics I've often written and spoken about is thin provisioning. This series of 11 articles is an edited version of my thin provisioning presentation from Interop New York 2010. I hope you enjoy it!
One of the sticky wickets that holds back thin provisioning is the need to communicate when capacity is no longer needed. Enterprise storage arrays can reclaim zeroed pages, but writing all those zeros can really fill up an I/O queue. This is where WRITE_SAME comes into the picture.
This is a really terrible name. It’s all-capital letters and has an underscore in the middle of it. We sound like engineers.
But WRITE_SAME is an interesting idea: Imagine you wanted to delete a terabyte of data on a storage system with zero page reclaim. You'd have to write a terabyte of zeroes. Well, that's a lot of I/O. You're basically pouring zeroes across your PCI bus, HBA, network, and array.
Instead, imagine we could just say, “You know that page of zeroes that I just wrote? Can you please write that a million more times for me? Hey, thanks a lot.”
You could do it in one command. That’s what WRITE_SAME is. It’s a SCSI command that says, “That last thing that I just wrote, can you please write it again, and again, and again? Can you please write it a thousand times? Can you please write it over here, over there?” I sound like Dr. Seuss: You can write it in a car. You can write it at the bar. You can write it on a bike. You can write it with a pike.
This conserves I/O, and that's a really good thing. WRITE_SAME makes zero page reclaim that much more effective. Now if only we had a system that would actually use this command!
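To make that a little more concrete, here's a rough sketch of what host-side zero offload can look like on a reasonably modern Linux kernel. This isn't the SCSI command itself, and it isn't tied to any particular array: it uses the BLKZEROOUT ioctl, which asks the kernel to zero a byte range on a block device, and which the kernel may translate into a WRITE SAME with a zeroed payload when the device supports it (falling back to plain zero writes when it doesn't). The device path and range are made up for illustration only.

```c
/* zero_offload.c - sketch of host-side zero offload on Linux.
 * BLKZEROOUT asks the kernel to zero a byte range on a block device;
 * when the device supports it, the kernel can offload this (e.g. as
 * WRITE SAME with a zeroed buffer) instead of streaming zeroes from
 * the host. Device path and range are hypothetical. Run as root, and
 * only against a device you are willing to wipe!
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKZEROOUT */

int main(void)
{
    int fd = open("/dev/sdX", O_WRONLY);    /* hypothetical device */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* { start offset, length } in bytes: zero 1 GiB starting at offset 0.
     * Both values must be aligned to the device's logical block size. */
    uint64_t range[2] = { 0, 1ULL << 30 };

    if (ioctl(fd, BLKZEROOUT, &range) < 0) {
        perror("BLKZEROOUT");
        close(fd);
        return 1;
    }

    close(fd);
    return 0;
}
```

The point of the sketch is the shape of the interaction: the host describes the region to be zeroed in one small request, and the work of actually laying down (or reclaiming) all those zeroes stays down at the device.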
It’s popular with array vendors, because all they have to do is say, “Hey look, I already support zero page reclaim. It’s up to you guys up the stack to implement the rest of this. It’s not our problem. It’s your problem.”
As an aside, consider that, if you’re an array vendor, anything that reduces the use of disk capacity is a problem for you. So, they may not all be that eager to have this work, I think, but I’m sure they’ll come around.
But imagine if you did this to an un-thin array. Imagine if the array didn’t support zero page reclaim on ingest and instead was post-processing. You could end up writing a terabyte of zeroes on the back end of your storage system, or 10 terabytes or 100 terabytes, only to reclaim it later that day, or later in the week, or later in the month. And what if your system didn’t support it at all? Suddenly, you’re flooded with I/O requests on the storage-array side. So, basically, you’re conserving I/O across the host and the network, but you’re potentially generating massive I/O on the storage side, which is kind of a problem.
So, there are some issues here as well. But we’re getting there.