The Four Horsemen of Storage System Performance: Get Smart

February 13, 2012 By Stephen 3 Comments

Four Horsemen-400 — The Four Horsemen of Storage System Performance: These four ugly gentlemen stand between you and your data.

Why do some data storage solutions perform better than others? What tradeoffs are made for economy and how do they affect the system as a whole? These questions can be puzzling, but there are core truths that are difficult to avoid. Mechanical disk drives can only move a certain amount of data. RAM caching can improve performance, but only until it runs out. I/O channels can be overwhelmed with data. And above all, a system must be smart to maximize the potential of these components. These are the four horsemen of storage system performance, and they cannot be denied.

A Lack of Intelligence

Disks can be made faster (and more added), solid-state storage and cache can be added, and I/O bottlenecks can be removed, but what then? How can storage performance keep up with Moore’s Law over the decades? The answer is intelligence: Storage systems must adapt and tune themselves to changing workloads.

It’s far simpler to slap the label “intelligent” on a storage system than it is to add real smarts to the box. The biggest hurdle has always been a lack of communication between clients and applications (at the extreme top of the stack) and storage devices (at the extreme bottom). I’ve called virtualization “a stack of lies”, and in many ways that’s exactly what it is. At each point in the I/O chain, information is lost that would have helped a truly intelligent storage array make better decisions.

Gettysburg Storage — Your disk doesn't contain anything resembling your files

Consider a very simple case: Your laptop. It probably contains a SATA hard disk drive connected to a basic controller on the PCIe bus addressed by the CPU. An operating system (probably Windows or Mac OS X) runs on the system, and it relies on a file system (NTFS or HFS+, respectively) to organize and access the hard disk drive. But it also has a volume manager (currently unnamed by Microsoft, though Apple internally calls theirs CoreStorage) that virtualizes storage and adds features like encryption and compression. The files seen by the operating system pass through “filter drivers”, then the file system (which chops them into blocks), the volume manager (which organizes these blocks), the laptop’s SATA controller, the disk drive’s own controller (which decides where to place these blocks) and cache, and finally reach the magnetic media. Even in this very simple scenario, the operating system has no idea where data is physically stored, and the disk has no idea what it is storing.
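To make that loss of information concrete, here is a minimal sketch of the laptop stack as three translation steps. It is a toy model under loud assumptions: the 4 KiB block size, the layer functions, the partition offset, and the remap table are all hypothetical, and real file systems, volume managers, and drive firmware are far more involved.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed file system block size in bytes


@dataclass(frozen=True)
class FileWrite:
    """What the operating system sees: a named file, an offset, a length."""
    path: str
    offset: int
    length: int


def file_system_layer(write, first_block):
    """Chop a file write into volume block numbers. The file name is dropped here."""
    start = write.offset // BLOCK_SIZE
    end = (write.offset + write.length - 1) // BLOCK_SIZE
    return [first_block + i for i in range(start, end + 1)]


def volume_manager_layer(volume_blocks, partition_start):
    """Shift volume blocks to logical block addresses (LBAs) on the drive."""
    return [partition_start + block for block in volume_blocks]


def drive_controller_layer(lbas, remap):
    """The drive places blocks wherever it likes; the host never learns where."""
    return [remap.get(lba, lba) for lba in lbas]


# Hypothetical example: one write to one document.
write = FileWrite(path="/Users/me/report.doc", offset=6000, length=5000)
volume_blocks = file_system_layer(write, first_block=100)
lbas = volume_manager_layer(volume_blocks, partition_start=2048)
physical = drive_controller_layer(lbas, remap={2150: 900_000})
```

By the time the request reaches `drive_controller_layer`, it is only a list of integers: nothing identifies the file or the application that wrote it. In the other direction, the remap table never leaves the drive, so the host cannot tell where those blocks ended up.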

But applications don’t really “care” about files. Each application has its own semantics for storage and retrieval of data, and the file is simply a universal and convenient metaphor for application data storage. Most applications use a proprietary container format which includes metadata and scratch data along with the actual content. The characteristic pattern of reads and writes to this subfile information varies widely by application. This is why a storage device that excels for video editing may be totally inappropriate for databases or e-mail storage.

Enterprise servers add more layers of translation, with Fibre Channel HBAs, network switches, redundant RAID controllers, and separate caches all performing their magic and discarding valuable meta-information. Many enterprise systems also include independent caching devices in the server, network, or as a gateway to the storage array. Everything in the stack is valuable in one way or another, adding reliability, recoverability, and performance. But the machinations of the stack obscure what goes on above, blocking the ability to add intelligence to the array.

Higher-level applications and server virtualization further obfuscate the storage stack. An operating system may run only a small component of a large enterprise application, so related I/O may come from multiple directions at once. And each operating system may run on a virtual machine, with a hypervisor adding its own file system, volume manager, and storage abstractions. This so-called “I/O blender” purées and randomizes all storage access before it gets anywhere near the array.
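The blender effect is easy to reproduce in a toy simulation. The sketch below assumes four hypothetical virtual machines, each reading its own region of a shared array strictly in order, and a hypervisor queue that picks the next request from a random VM. The names, the address ranges, and the random interleaving policy are illustrative assumptions, not a model of any particular hypervisor.

```python
import random


def sequential_stream(vm, start_lba, count):
    """One VM reading `count` consecutive blocks starting at `start_lba`."""
    return [(vm, start_lba + i) for i in range(count)]


def blend(streams, seed=0):
    """Interleave per-VM streams at random, preserving order within each VM."""
    rng = random.Random(seed)
    queues = [list(stream) for stream in streams]
    blended = []
    while any(queues):
        queue = rng.choice([q for q in queues if q])
        blended.append(queue.pop(0))
    return blended


def sequential_fraction(lbas):
    """Fraction of requests that land exactly one block after the previous one."""
    if len(lbas) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(lbas, lbas[1:]) if cur == prev + 1)
    return hits / (len(lbas) - 1)


streams = [sequential_stream(f"vm{n}", n * 100_000, 50) for n in range(4)]
blended = blend(streams)

# The array sees only addresses; the VM tag is lost on the way down.
array_view = [lba for _vm, lba in blended]

per_vm = [sequential_fraction([lba for _vm, lba in s]) for s in streams]
mixed = sequential_fraction(array_view)
```

Every entry in `per_vm` is perfectly sequential, while `mixed` is lower because each switch between VMs breaks the run. The workload did not change at all. Only the array's view of it did.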

De-Multiplex and Communicate

IO Blender — We need a communications channel that bypasses the "I/O blender"

The only way to truly add intelligence to a storage system, from a lowly hard drive to a high-end enterprise array, is to de-multiplex data and add a communications channel through the stack. If the array can untangle the randomized I/O coming from above, and can accept and act on information about that data stream, many things become possible.

Data layout is often overlooked, but it can have a massive impact on system performance. As we pointed out when discussing spindles, the physical placement of data on a disk strongly affects I/O performance. But data placement is also critical for RAID systems and those that use automated tiered storage. Depending on system parameters, it may be better to keep data “together” or “apart” to improve performance, but this cannot be accomplished unless the array “knows” which I/O blocks belong together.

As discussed previously, pre-fetch caching can be extremely valuable to accelerate I/O performance. But pre-fetching information is almost impossible on the wrong side of the I/O blender. If an array could de-multiplex the data stream and tag each access by application, pre-fetch algorithms could be much more effective. An array could even work with a cache in the network or the server to pre-fill buffers with the data that would be needed next.
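Here is a minimal sketch of what tagging buys a pre-fetcher. It assumes each request arrives as a hypothetical `(tag, lba)` pair, where the tag names the application or VM that issued it. No such tag exists in a plain block protocol, which is exactly the problem. `TaggedPrefetcher`, its `depth` and `trigger` parameters, and the simple "N sequential hits" rule are illustrative choices, not any vendor's algorithm.

```python
from collections import defaultdict


def demultiplex(tagged_requests):
    """Split a blended request stream back into one ordered list per tag."""
    streams = defaultdict(list)
    for tag, lba in tagged_requests:
        streams[tag].append(lba)
    return dict(streams)


class TaggedPrefetcher:
    """Track sequential runs per tag and suggest blocks to read ahead."""

    def __init__(self, depth=4, trigger=2):
        self.depth = depth      # how many blocks to read ahead
        self.trigger = trigger  # sequential hits required before pre-fetching
        self._last = {}
        self._run = defaultdict(int)

    def observe(self, tag, lba):
        """Record one access; return the LBAs worth pre-fetching (maybe none)."""
        if self._last.get(tag) == lba - 1:
            self._run[tag] += 1
        else:
            self._run[tag] = 0
        self._last[tag] = lba
        if self._run[tag] >= self.trigger:
            return [lba + i for i in range(1, self.depth + 1)]
        return []


# Two hypothetical applications, each sequential, interleaved on the wire.
requests = [("db", 100), ("mail", 500), ("db", 101),
            ("mail", 501), ("db", 102), ("mail", 502)]

tagged = TaggedPrefetcher()
tagged_hints = [tagged.observe(tag, lba) for tag, lba in requests]

# Without tags, the array has to treat everything as one stream.
blind = TaggedPrefetcher()
blind_hints = [blind.observe("array", lba) for _tag, lba in requests]
```

With tags, both streams cross the trigger and the pre-fetcher starts reading ahead for each. Without them, the same detector never sees two consecutive addresses and stays idle, even though every application involved is reading sequentially.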

A storage system that intelligently manages caches all through the I/O chain is something of a Holy Grail in enterprise storage. Time and again, pundits and system architects have suggested moving data closer to the CPU to improve performance. At the same time, others recommend maintaining a distance to improve manageability, availability, and flexibility. Intelligently managing a set of caches in multiple locations is the ideal solution, but the inherent obfuscation of the current I/O paradigm makes this extremely difficult.

Stephen’s Stance

The Four Horsemen of storage system performance cannot be denied, but they do offer a clear path forward. Storage systems must improve in many different areas, from spindles and drives to caching and I/O bottlenecks. But above all else, storage systems must become smarter in order to become faster, and this requires greater insight into the true nature of the data stream being stored. All storage performance developments, from the laptop to the enterprise, boil down to adaptations to the demands of the Four Horsemen.


Comments

Gregg Holzrichter says

February 13, 2012 at 6:07 pm

Great series of posts! One concept you highlight – “If the array can untangle the randomized I/O coming from above, and can accept and act on information about that data stream, many things become possible.” – made me think of the way Virsto (www.virsto.com) works. Their server-side software intercepts the random I/O from each VM, while maintaining a virtual machine level map to intelligently place (deduped) data on the array. This speeds up performance and allows you to get much more out of existing storage capacity. They just entered the VMware space, having shipped only on Hyper-V for the past year.

Luciano Dalle Ore says

February 13, 2012 at 5:33 pm

Right on! A couple of thoughts on the subject:

1. I spoke to Greg Ganger (CMU parallel data lab) on this subject last year in Portland at the Hot Storage meeting, and he mentioned that they had done some work on this very subject a few years back but could not get any traction, as the players did not seem to be interested at the time. My best guess is that the reason this is so difficult is that it would require changes throughout the stack before a complete solution is implemented.

2. Interestingly enough, this is where NAS can have some advantages over SANs as NFS packets have a lot more information than raw SAN blocks. I would also expect that an integrated NAS server would be able to do a better job in “not blending” the information before it gets to the disks.

sfoskett says

February 13, 2012 at 5:36 pm

Higher-level protocols have an advantage when it comes to defeating the IO Blender, but mechanisms like VASA/VADM from VMware are probably the long-term “solution” we’ll see implemented. Then there’s cloud and object storage protocols, which skirt the issue entirely!

Thanks for the comments!

