I’ve long hollered that ZFS is a real storage revolution in the making, but recognized that it still had a way to go before replacing UFS, HFS+, and most volume managers. Well, a little Rhode Island company called greenBytes comes out of stealth today to announce that they’re doing just that – taking the solid ZFS core and adding some serious enterprise storage features to it. And they’re rolling the lot into a multi-protocol storage array using commodity (Sun Thumper) hardware. These guys have cooked up a seriously interesting entrant in the storage market, though I can’t say much for the decapitated camel-case spelling of their (already in use) name!
Although ZFS’ universal storage pool with non-RAID is a great concept, it stands in the way of at least one (sometimes) desirable storage technique: disk spin-down. Put simply, since every disk contains metadata, all disks must always be spinning. This is by no means a ZFS-only problem, though – certain vendors tout the (laughable) greenness of their storage systems while hoping that the average user won’t notice the truth: a disk simply cannot spin down while any part of it is in use. This means that tacking spin-down onto a regular storage array is like painting it a different color: there is no benefit whatsoever to the average user. Sure, a few non-provisioned drives might spin down, but what are you doing buying a lot of non-provisioned drives anyway?
The solution has always been right in front of everyone: Develop a new type of non-RAID with enough intelligence to allow drives to spin down when not used. This is what COPAN Systems did with their MAID technology: Invent an entirely new storage array, with integrated data protection and management techniques that allow alive but not active drives to spin down. Spin-down is not MAID any more than a bicycle is a Ducati.
Let’s make one thing clear: It’s really hard to reduce the power demands of storage devices. Disks guzzle watts like few other data center devices, and enterprise storage uses lots of disks. Lots of vendors are looking to hop onto the green storage bandwagon, and they all seem to realize that bringing some intelligence to power management by enabling spin-down is an open door. But it’s awfully hard to maintain performance and data protection when disks are spinning up and down all the time.
One element of the greenBytes story is the way in which they have tweaked ZFS to allow disks to spin down. They limit the metadata updates to just a few disks, so the others can be idled when no access to them is made. The company suggests scheduling this for off hours to minimize latency as drives are brought back online, an approach that is less than optimal from an energy perspective but demonstrates that they understand just how difficult this problem is to crack. The core is there, however: They have integrated the data protection and storage management elements to enable spin-down to be practical.
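To make the metadata trick concrete, here is a toy model in Python. It is only a sketch of the general idea, not greenBytes' code: the `Disk` and `ToyPool` names, the `idle_timeout` value, and the simple "seconds since last access" clock are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    holds_metadata: bool = False
    last_access: float = 0.0  # seconds on a hypothetical clock
    spinning: bool = True


class ToyPool:
    """Hypothetical pool that confines metadata updates to a few disks."""

    def __init__(self, disks, idle_timeout):
        self.disks = {d.name: d for d in disks}
        self.idle_timeout = idle_timeout

    def touch(self, name, now):
        disk = self.disks[name]
        disk.spinning = True  # any access spins the disk back up
        disk.last_access = now

    def write(self, data_disk, now):
        # The data block lands on a single data disk...
        self.touch(data_disk, now)
        # ...while the metadata update goes only to the metadata disks.
        for d in self.disks.values():
            if d.holds_metadata:
                self.touch(d.name, now)

    def spin_down_idle(self, now):
        """Spin down disks idle for at least idle_timeout; return their names."""
        stopped = []
        for d in self.disks.values():
            if d.spinning and now - d.last_access >= self.idle_timeout:
                d.spinning = False
                stopped.append(d.name)
        return sorted(stopped)


def demo():
    disks = [
        Disk("meta0", holds_metadata=True),
        Disk("meta1", holds_metadata=True),
        Disk("data0"),
        Disk("data1"),
        Disk("data2"),
    ]
    pool = ToyPool(disks, idle_timeout=600)
    pool.write("data0", now=500)        # touches data0, meta0, meta1 only
    return pool.spin_down_idle(now=700)  # untouched data disks are idle
```

In this model a write keeps only the target disk and the two metadata disks awake, so the untouched data disks become eligible to stop. If every disk were flagged `holds_metadata`, every write would touch every disk and nothing could ever spin down, which is the problem described earlier.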
Another major storage industry theme of the last few years is deduplication of data. An advanced (or devolved, depending on your perspective) form of compression, deduplication allows a storage array to store duplicate data more efficiently, reducing the amount of capacity required for some applications. Data Domain is top-of-mind in this space, but just about everyone now offers some form of deduplication technology.
One major roadblock on the way to deduplication (or compression) nirvana is performance. Simply put, it’s really, really hard to process data on the fly without affecting performance, especially as data scales up to the multi-terabyte range or as systems scale out to include multiple devices. One approach to tackling this issue is post-processing dedupe, which accepts incoming data in the normal way but goes back and processes it later to remove duplicates. This is the method NetApp uses, and they have leveraged it to become the first vendor to support deduplication of production applications.
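The post-processing idea is easy to sketch. The following Python toy is not NetApp's (or anyone's) implementation: the `PostProcessStore` class, its method names, and the choice of SHA-256 as the fingerprint are assumptions for illustration, and it trusts matching hashes without a byte-for-byte comparison.

```python
import hashlib


class PostProcessStore:
    """Hypothetical block store that deduplicates in a later pass."""

    def __init__(self):
        self.blocks = {}    # physical block id -> bytes
        self.pointers = {}  # logical address -> physical block id
        self.next_id = 0

    def write(self, address, data):
        # Fast path: store the block as-is, with no hashing on the way in.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        self.pointers[address] = block_id

    def read(self, address):
        return self.blocks[self.pointers[address]]

    def dedupe_pass(self):
        """Later, off the write path: point duplicates at one shared copy."""
        seen = {}  # fingerprint -> id of the first block with that content
        for address, block_id in list(self.pointers.items()):
            digest = hashlib.sha256(self.blocks[block_id]).digest()
            self.pointers[address] = seen.setdefault(digest, block_id)
        # Free every physical block that no address references any more.
        live = set(self.pointers.values())
        self.blocks = {i: b for i, b in self.blocks.items() if i in live}
        return len(self.blocks)
```

Writes cost nothing extra here, which is the whole appeal; the price is that duplicate blocks occupy real capacity until the next pass runs. An inline design would instead compute the fingerprint inside `write`, saving the space immediately but paying the hashing cost on every incoming block.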
Predictably, deduplication is another technology integrated into greenBytes’ “ZFS+” technology. They claim that they can handle inline compression at wire speed, and also claim deduplication inline. It’s not yet clear exactly what the difference between compression and deduplication is to the company, or just what kind of performance their inline technology will yield, but it’s certainly nice to see this tech integrated with ZFS!
Thin is In (the House!)
greenBytes gets closer to enterprise storage bingo by adding thin provisioning to the mix. Actually, as the company’s CTO was quick to point out, they had to offer virtual or thin provisioning to enable the rest of the system to function. When your storage is sliced and diced by their Cypress array, the only way to present storage is with a wink and a promise of capacity to spare. Thankfully this is not the core of their pitch, however.
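For anyone who hasn't met thin provisioning before, the "wink and a promise" can be modeled in a few lines of Python. This is a generic sketch and says nothing about how the Cypress array does it: `ThinPool`, `ThinVolume`, and the block-count bookkeeping are hypothetical names and simplifications.

```python
class ThinPool:
    """Hypothetical pool of physical blocks shared by thin volumes."""

    def __init__(self, physical_blocks):
        self.free = physical_blocks
        self.volumes = []

    def create_volume(self, logical_blocks):
        # Creating a volume reserves nothing; it only records a promise.
        vol = ThinVolume(self, logical_blocks)
        self.volumes.append(vol)
        return vol

    @property
    def promised(self):
        return sum(v.logical_blocks for v in self.volumes)


class ThinVolume:
    def __init__(self, pool, logical_blocks):
        self.pool = pool
        self.logical_blocks = logical_blocks
        self.mapped = {}  # logical block number -> data

    def write(self, block, data):
        if not 0 <= block < self.logical_blocks:
            raise IndexError("block outside advertised capacity")
        if block not in self.mapped:
            # Physical space is taken from the pool only on first write.
            if self.pool.free == 0:
                raise RuntimeError("pool exhausted: promises exceeded capacity")
            self.pool.free -= 1
        self.mapped[block] = data
```

Two 10-block volumes can sit on a 4-block pool quite happily, right up until the fifth distinct block is written. That failure mode is why thin provisioning needs capacity monitoring behind it.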
The company also promises snapshots and CDP replication, all leveraging ZFS at the core. All they need to add is tier-0 solid state storage to get five chips in a row without even using the free space! Although greenBytes is using Sun’s Thumper chassis currently for their Cypress array, their core technology is the ZFS+ software, and I expect we might see this mixed quite differently in the future. This is a software company, not an array vendor.
All considered, greenBytes has thoroughly broken the link between physical and logical storage, and I applaud them for it. This is exactly the kind of storage revolution the industry needs right now.