One of the topics I've often written and spoken about is thin provisioning. This series of 11 articles is an edited version of my thin provisioning presentation from Interop New York 2010. I hope you enjoy it!
Although the core issues with thin provisioning revolve around communication, it presents unique challenges to the storage array as well. We talked about granularity of pages, and the comments for that piece were extremely enlightening. Now let’s consider another key factor: Scheduling.
Note that the “provisioning” part is relatively easy to do on the fly: An array just has to allocate additional capacity as writes come in, which is something it does anyway. It’s the thin reclamation that poses a challenge, since this involves zero detection across a whole page of data in many cases.
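To make "zero detection across a whole page" concrete, here is a minimal Python sketch. The page size and the function name `page_is_all_zeros` are my own assumptions for illustration; a real array would do this in firmware or dedicated hardware, with whatever page size the vendor chose.

```python
# Hypothetical page size for illustration only; real arrays vary widely.
PAGE_SIZE = 1024 * 1024

def page_is_all_zeros(page: bytes) -> bool:
    """Return True only if every byte in the page is zero."""
    # any() stops at the first non-zero byte, but a page that really is
    # all zeros must be read to the very end before it can be reclaimed.
    return not any(page)

empty_page = bytes(PAGE_SIZE)                     # a page full of zeros
dirty_page = bytes(PAGE_SIZE - 1) + b"\x01"       # one non-zero byte at the end
print(page_is_all_zeros(empty_page), page_is_all_zeros(dirty_page))
```

The point of the sketch is the cost: a page cannot be declared reclaimable until every last byte of it has been examined.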
Just like de-duplication, thin provisioning challenges the resources of the storage array to do background number crunching. And just like dedupe, the array engineers have a choice of when to do the reclamation processing: Well after writing or “in-line”. The extreme ends of this spectrum fall into two equally disappointing categories: Wholly ineffective or ridiculously intensive.
Let’s start with the “intensive” side: You could have the controller do thin reclamation automatically, in-line; that’s kind of what IBM does with SVC, for example, and 3PAR claims to do this too. The trouble is that the controller has to literally watch everything, and it’s got to reassemble whole pages, perhaps 42 MB or even 1 GB, in cache. If it didn’t have all that data, it would have to go fetch it, put it into cache, look at it, make sure it was all zeros, then get rid of it. It’s really, really difficult to do automatic, in-line thin reclamation. It’s a good thing to do, but it’s a hard thing to do.
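Here is a rough Python sketch of why the in-line approach is so demanding. It assumes a page is divided into fixed-size blocks and that the controller tracks each incoming write; the class `InlinePageTracker` and its methods are hypothetical names of mine, not any vendor's design.

```python
class InlinePageTracker:
    """Tracks, for one page, which blocks are known to hold only zeros."""

    def __init__(self, blocks_per_page: int):
        self.blocks_per_page = blocks_per_page
        self.zero_blocks = set()   # blocks whose most recent write was all zeros

    def write(self, block_index: int, data: bytes) -> None:
        if not 0 <= block_index < self.blocks_per_page:
            raise IndexError("block index outside this page")
        if any(data):
            # Real data landed here, so the block no longer counts as zero.
            self.zero_blocks.discard(block_index)
        else:
            self.zero_blocks.add(block_index)

    def reclaimable(self) -> bool:
        # The page can be released only when every block is known to be zero.
        # Blocks the controller has not seen would have to be fetched and checked.
        return len(self.zero_blocks) == self.blocks_per_page

tracker = InlinePageTracker(blocks_per_page=4)
for index in range(3):
    tracker.write(index, bytes(512))
print(tracker.reclaimable())   # three of four blocks seen, so not yet
tracker.write(3, bytes(512))
print(tracker.reclaimable())   # the whole page is now known to be zeros
```

Even this toy version has to inspect every write and keep state for every page, which hints at why doing it for real, at full speed, takes serious controller resources.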
So most vendors schedule thinning for later. In the “10 terabytes of zeros” example, they’re actually going to write 10 terabytes to disk, or at least through to cache. Then, at some point in the future, they’ll go back and reclaim that space. Some are pretty aggressive and reclaim capacity very frequently. Others are fairly lazy: The Drobo seems to reclaim only once or twice a day. A lot of people who have them are surprised when the thing springs to life and starts going, “Bada-bada-bada-bada-bada-bada.” Apparently it’s reclaiming storage at that time.
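The scheduled approach can be sketched as a background sweep over pages that have already been written. This is only an illustration: the dictionary standing in for the array's allocated pages and the function name `reclaim_sweep` are assumptions of mine.

```python
from typing import Dict, List

def reclaim_sweep(allocated_pages: Dict[int, bytes]) -> List[int]:
    """Free every allocated page that contains only zeros; return the freed page IDs."""
    freed = [page_id for page_id, data in allocated_pages.items() if not any(data)]
    for page_id in freed:
        del allocated_pages[page_id]   # hand the page back to the free pool
    return freed

# Toy example: pages 0 and 2 were written as zeros, page 1 holds real data.
pages = {0: bytes(8), 1: b"\x00\x01", 2: bytes(4)}
print(reclaim_sweep(pages))   # IDs of the pages that were released
print(sorted(pages))          # IDs of the pages still allocated
```

An aggressive array would run a sweep like this very frequently; a lazy one might run it once or twice a day. Either way, the zeros occupy real capacity until the next sweep comes around.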
Some thin provisioning systems are even manually initiated, and this is really pretty ineffective. The storage administrator has better things to do than reclaim storage all the time, so they are probably going to set a cron job to do it regularly at a specified time. If the system only does it on demand, that means that it doesn’t have the horsepower to do it automatically. Ergo, it’s sometimes going to conflict with “real work” and cause a problem.
I would look for a system that is fairly aggressive with thin reclamation. I was talking to the guys at Nimbus Data, for example, and they claim to do thin provisioning in-line all the time. I hope that we see more storage arrays that are doing that, and fewer that are doing it manually, on demand, because that’s just not as useful.
But considering that thin provisioning used to be almost useless, the fact that it’s now at least somewhat useful is gratifying.