Processing and Scheduling Thin Provisioning

February 22, 2011 By Stephen 3 Comments

Although the core issues with thin provisioning revolve around communication, it presents unique challenges to the storage array as well. We talked about granularity of pages, and the comments for that piece were extremely enlightening. Now let’s consider another key factor: Scheduling.

Note that the “provisioning” part is relatively easy to do on the fly: An array just has to allocate additional capacity as writes come in, which is something it does anyway. It’s the thin reclamation that poses a challenge, since this involves zero detection across a whole page of data in many cases.
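To make that asymmetry concrete, here is a minimal Python sketch of a thin volume. The `ThinVolume` class, the tiny 4 KB page size, and the dictionary used as a page map are all assumptions made up for illustration, not any vendor's design. Allocation falls out of the write path almost for free, while zero detection has to look at every byte of a page before the page can be released.

```python
PAGE_SIZE = 4096  # hypothetical page size; real arrays use far larger pages


class ThinVolume:
    """Toy thin volume: a page gets backing storage only when first written."""

    def __init__(self):
        self.pages = {}  # page number -> bytearray of PAGE_SIZE bytes

    def write(self, offset, data):
        """Provisioning: allocate backing pages on the fly as writes arrive."""
        for i, byte in enumerate(data):
            page_no, pos = divmod(offset + i, PAGE_SIZE)
            if page_no not in self.pages:
                self.pages[page_no] = bytearray(PAGE_SIZE)
            self.pages[page_no][pos] = byte

    def page_is_zero(self, page_no):
        """Zero detection: every byte in the page has to be checked."""
        return not any(self.pages[page_no])

    def allocated_bytes(self):
        return len(self.pages) * PAGE_SIZE
```

Note that writing a page full of zeros still allocates it here. Getting that space back is a separate step, and when to run that step is the scheduling question.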

Just like de-duplication, thin provisioning challenges the resources of the storage array to do background number crunching. And just like dedupe, the array engineers have a choice of when to do the reclamation processing: Well after writing or “in-line”. The extreme ends of this spectrum fall into two equally disappointing categories: Wholly ineffective or ridiculously intensive.

Let’s start with the “intensive” side: You could have the controller reclaim space automatically as data is written; that’s roughly what IBM does with SVC, for example, and 3PAR claims to do this too. The trouble is that the controller has to literally watch everything, and it has to reassemble whole pages, perhaps 42 MB or even one GB, in cache. If it doesn’t already have all that data, it has to go fetch it, put it into cache, look at it, make sure it is all zeros, then get rid of it. It’s really, really difficult to do automatic, in-line thin reclamation. It’s a good thing to do, but it’s a hard thing to do.
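A rough Python sketch of why the in-line approach is expensive follows. The function `inline_write`, the dictionary standing in for the array's page store, and the `bytes_scanned` counter are hypothetical names invented for this example, and the 4 KB page is far smaller than the pages discussed above. The point is that a small write of zeros forces the controller to examine the whole page, not just the bytes that arrived.

```python
PAGE_SIZE = 4096  # hypothetical; the pages discussed above are 42 MB or 1 GB


def inline_write(store, page_no, pos, chunk, stats):
    """Apply one host write, reclaiming the page at once if it is now all zeros.

    store: dict mapping page number -> bytearray (allocated pages only)
    stats: dict counting how many bytes the controller had to inspect
    """
    if pos + len(chunk) > PAGE_SIZE:
        raise ValueError("this sketch assumes a write never crosses a page")
    page = store.get(page_no)
    if page is None:
        if not any(chunk):
            return  # zeros written to unallocated space: nothing to allocate
        page = store[page_no] = bytearray(PAGE_SIZE)
    page[pos:pos + len(chunk)] = chunk
    if not any(chunk):
        # A zero chunk may have emptied the page, but the only way to know
        # is to examine the whole page, which must therefore be in cache.
        stats["bytes_scanned"] += PAGE_SIZE
        if not any(page):
            del store[page_no]
```

In this toy model, a four-byte write of zeros costs a 4,096-byte scan. Scale the page up to the sizes real arrays use and the cache and CPU pressure becomes clear.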

So most vendors schedule thinning for later. In the “10 terabytes of zeros” example, they’re actually going to write 10 terabytes to disk, or at least through to cache. Then, at some point in the future, they’ll go back and reclaim that space. Some are pretty aggressive and reclaim capacity very frequently. Others are fairly lazy: The Drobo seems to reclaim only once or twice a day. A lot of people who have them are surprised when the thing springs to life and starts going, “Bada-bada-bada-bada-bada-bada.” Apparently it’s reclaiming storage at that time.
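The scheduled approach can be sketched in the same toy style. The helpers `write_through` and `reclaim_pass` and the dictionary-based page store are assumptions for illustration only; no real array exposes these names. Writes land in full, zeros included, and a separate pass scans the allocated pages later.

```python
PAGE_SIZE = 4096  # hypothetical page size for the sketch


def write_through(store, page_no, data):
    """Post-process design: every write lands on 'disk', zeros included."""
    store[page_no] = bytearray(data)


def reclaim_pass(store):
    """Scheduled job: scan every allocated page and free the all-zero ones."""
    zero_pages = [n for n, page in store.items() if not any(page)]
    for n in zero_pages:
        del store[n]
    return len(zero_pages)


store = {}
for n in range(10):
    write_through(store, n, bytes(PAGE_SIZE))   # ten pages of zeros
write_through(store, 10, b"x" * PAGE_SIZE)      # one page of real data
freed = reclaim_pass(store)                     # run "at some point in the future"
```

How often something like `reclaim_pass` runs is the whole scheduling decision: every few minutes for an aggressive array, once or twice a day for a lazy one. Until it runs, the zeros occupy real capacity.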

Some thin provisioning systems even require reclamation to be manually initiated, and this is really pretty ineffective. The storage administrator has better things to do than reclaim storage all the time, so they are probably going to set up a cron job to do it regularly at a specified time. If the system only reclaims on demand, that suggests it doesn’t have the horsepower to do it automatically. Ergo, it’s sometimes going to conflict with “real work” and cause a problem.

I would look for a system that is fairly aggressive with thin reclamation. I was talking to the guys at Nimbus Data, for example, and they claim to do thin provisioning in-line all the time. I hope that we see more storage arrays that are doing that, and fewer that are doing it manually, on demand, because that’s just not as useful.

But considering that thin provisioning used to be almost useless, the fact that it’s now at least somewhat useful is gratifying.

Comments

Anthony Vandewerdt says

February 23, 2011 at 9:08 pm

Hi Steve, great post (as always).
You’re correct in that the IBM SVC (model CF8) and Storwize V7000 do ‘zero detect’ on write (at point of ingress). This is possible when you have plenty of CPU power and fast memory throughput.
They also do zero detect if you want to create a volume copy (if you want the secondary to be thin provisioned). This is great for converting thick to thin on the fly.

The IBM XIV also does zero detect on the fly during migrations (when we are pulling data off old storage and moving it into the XIV) and during replication (it doesn’t send zeros to its mirror partner). It also does zero detect during scrubbing (the process that runs to ensure data is confirmed to be readable and have good ‘parity’), to ensure no empty blocks get reported as used space. The scrubbing process runs constantly, working its way through the entire machine over the course of several days.

sfoskett says

February 23, 2011 at 10:37 pm

Glad to have the confirmation about SVC. Thin on the fly is really an unusual feature, something that surprised me during my research.

And thanks about XIV too. Good to know.

Basil says

February 23, 2011 at 10:56 pm

ThP on the fly is one of the impressive features of 3PAR. It can be done with almost no impact thanks to a specialized ASIC with zero detection in silicon. So with 3PAR you get thinly provisioned writes, migrations, replications, and physical copies. You also get deep integration with a number of reclamation/ThP frameworks in Oracle ASM, Veritas API, VMware, etc.
And all of this with 16K blocks!
Yes, I’m really impressed by the 3PAR ThP technologies :)
