Zero Page Reclaim: Savior of Thin Provisioning?

January 4, 2011 By Stephen 4 Comments

In the previous post, I talked about how the Drobo uses metadata monitoring to solve the telephone game and make de-allocation possible. But that approach is challenging in complex enterprise environments. Instead, most enterprise arrays use a complex chain of semaphores to interpret signals from the connected hosts about the capacity that can be un-provisioned.

On the storage side, arrays can only use the information they have to de-allocate: the data that’s stored on them. They don’t know what application is using it or what file system it belongs to. They don’t know anything at all.

But, somewhere along the line, someone had a big idea and said, “wait a second, what if we look for pages that are all zeros?” We’ll talk about pages a bit later, but for now, let’s talk about zeros. A zero is kind of a smoke signal coming up from over the hills that says, “there’s nothing valuable here.”

So the storage array watches for pages that are all zero and reclaims them. As protection against making a stupid mistake (what if you actually wanted to write all zeros?), anybody who asks for a page that has been reclaimed just gets all zeros back.
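To make the idea concrete, here is a minimal sketch of a thin volume that reclaims all-zero pages and hands back zeros for anything it no longer stores. The `ThinVolume` class, its method names, and the 4 KiB `PAGE_SIZE` are all made up for illustration; no real array is implemented this way, and real page sizes vary by vendor.

```python
PAGE_SIZE = 4096  # hypothetical page size; real arrays pick their own


class ThinVolume:
    """Toy thin-provisioned volume: only pages that hold data consume space."""

    def __init__(self, page_count):
        self.page_count = page_count
        self.pages = {}  # page number -> bytes, present only for allocated pages

    def _check(self, page_no):
        if not 0 <= page_no < self.page_count:
            raise IndexError("page out of range")

    def write_page(self, page_no, data):
        self._check(page_no)
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        self.pages[page_no] = bytes(data)

    def read_page(self, page_no):
        self._check(page_no)
        # Unallocated or reclaimed pages simply read back as zeros.
        return self.pages.get(page_no, bytes(PAGE_SIZE))

    def reclaim_zero_pages(self):
        """Release every allocated page that contains only zeros; return the count."""
        zero_pages = [n for n, data in self.pages.items() if not any(data)]
        for n in zero_pages:
            del self.pages[n]
        return len(zero_pages)

    def allocated_bytes(self):
        return len(self.pages) * PAGE_SIZE


vol = ThinVolume(page_count=1024)
vol.write_page(0, b"\x01" * PAGE_SIZE)  # real data
vol.write_page(1, bytes(PAGE_SIZE))     # host zeroed this page
reclaimed = vol.reclaim_zero_pages()    # frees page 1 only
```

Because a reclaimed page and a page of genuine zeros read back identically, the host cannot tell the difference, which is what makes the reclaim safe.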

Most of the major vendors support this kind of zero page reclaim. This is good stuff. I don’t want to sound too critical of them because I appreciate them implementing at least this.

The problem is that very little actually writes those zeros. Almost no operating system writes zeros to deleted space. If they actually wrote pages of zeros, thin provisioning would work great.

So what do the storage vendors do? They come up with utilities that write zeros!

NetApp has SnapDrive, which zeros out empty space so that the Filer can go and recover that space. You run it whenever you want, and eventually the storage array notices that you’ve zeroed out that space and recovers it. Compellent and Symantec’s Veritas Storage Foundation have something like that, too. On Windows, you can also force it using the SDelete command, and you can configure it using VMware ESX.
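The general shape of a zero-writing utility can be sketched in a few lines. This is not how SnapDrive or SDelete is implemented; `zero_fill` and its `max_bytes` cap are hypothetical names, and the sketch assumes a file system that really writes the zeros through to the array (compression or sparse-file handling could swallow them).

```python
import os
import tempfile

CHUNK = 1024 * 1024  # write zeros 1 MiB at a time


def zero_fill(directory, max_bytes):
    """Write max_bytes of zeros into a temporary file in `directory`,
    flush it to storage, then delete the file. Returns the bytes written."""
    zeros = bytes(CHUNK)
    written = 0
    fd, path = tempfile.mkstemp(dir=directory, prefix="zerofill-")
    try:
        with os.fdopen(fd, "wb") as f:
            while written < max_bytes:
                n = min(CHUNK, max_bytes - written)
                f.write(zeros[:n])
                written += n
            f.flush()
            os.fsync(f.fileno())  # push the zeros past the OS cache
    finally:
        os.remove(path)  # the free space now holds zeros
    return written


# Small demonstration in a scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    wrote = zero_fill(scratch, 2 * CHUNK)
```

The cap matters: a real utility would size the fill from the free space on the volume, and filling a live file system to the brim has obvious risks of its own.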

Zero page reclaim is pretty straightforward. It doesn’t take a lot of computing power – it’s not like you’re watching the file system for changes or anything. All you’re doing is occasionally going through and reclaiming pages full of zeros. So, you can post-process it, kind of like de-duplication.

There are quite a few issues with zero page reclaim, though:

Things aren’t writing zeros
Most of these implementations are page-based, which looks like a problem
Theoretically, this drives more IO through the system, not less
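The page-based point is easiest to see with a toy calculation. The sketch below assumes a page can only be reclaimed when every block in it is zero; the function name, the 512-byte block size, and the 64-block layout are all invented for illustration.

```python
def reclaimable_bytes(block_is_zero, blocks_per_page, block_size=512):
    """Bytes that page-based zero reclaim can free.

    block_is_zero: one boolean per block on the volume.
    A page is reclaimed only if every block in it is zero.
    """
    freed = 0
    for start in range(0, len(block_is_zero), blocks_per_page):
        page = block_is_zero[start:start + blocks_per_page]
        if len(page) == blocks_per_page and all(page):
            freed += blocks_per_page * block_size
    return freed


# Hypothetical volume of 64 blocks: all zeroed except every 16th block.
layout = [i % 16 != 0 for i in range(64)]
zeroed = sum(layout) * 512              # bytes the host actually zeroed
small = reclaimable_bytes(layout, 1)    # 1-block pages: all of it comes back
large = reclaimable_bytes(layout, 16)   # 16-block pages: nothing comes back
```

With the same zeros on disk, the small-page case frees every zeroed byte while the large-page case frees none, because one live block pins each page.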

This last is the biggest problem, really. In most cases IO performance is a bigger issue than capacity in enterprise storage. If I could give you all the capacity you could possibly want or all the performance you could possibly want, most people would pick performance. It used to be capacity, but now it’s all about performance. If infrastructure folks could get one for free and had to pay for the other, they would definitely pay for performance.

And zero page reclaim, the way that it’s implemented with SDelete or with eagerzeroedthick, is driving tons of IO. Basically, a delete is the same as a write because you have to write all these zeros over the bus. But there’s a way around that, too. And that’s the topic for the next piece in this series.
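A back-of-the-envelope calculation shows the scale. The 64 KiB write size and the 100 GiB of freed space are assumptions picked for round numbers, not measurements from any system.

```python
def zeroing_write_count(freed_bytes, io_size=64 * 1024):
    """Write I/Os needed to zero out freed space, rounded up."""
    return -(-freed_bytes // io_size)


freed = 100 * 1024**3             # 100 GiB of deleted data
ios = zeroing_write_count(freed)  # 1,638,400 writes of 64 KiB each
```

Every one of those writes crosses the bus and lands on the array just like real data would, which is why "deleting" this way costs as much as writing.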

Comments

Tom says

January 4, 2011 at 6:45 pm

Please comment where/how one can find how to use sdelete with ESX…and where/how/if one can do this de-allocation with an MSA 2012i G1?? Thank you, Tom

the storage anarchist says

January 5, 2011 at 12:50 pm

Interesting side note: 3PAR made a big deal about their custom ASIC that scans for zeros. On VMAX, we use the Tachyon chip instead of custom hardware to do line-speed zero detect.

How, you ask?

Well, we have the Tachyon create a T10-DIF for every received block, which is used to protect the integrity of the data all the way to the physical drive writes (and back). Checking for zeros thus requires only checking the DIF to see if it matches the known DIF of an all-zero block!

So, when it comes to zero page detect, VMAX don’t NEED no stinkin’ ASICs 🙂

Bill Plein says

January 5, 2011 at 8:57 pm

3PAR’s ASIC does much more than zero-detect, which is a feature that was un-used in the ASIC for years. By the way, isn’t the Tachyon FC chip considered an ASIC? So, you do need an ASIC.

sfoskett says

January 5, 2011 at 9:49 pm

That’s a really clever way to do it. I salute whoever had that bright idea!
