Everyone wants to be the best, so outrageous claims of supremacy are as old as time. In IT, these claims often revolve around synthetic benchmarks chosen to show a system in the best possible light. Buyers have grown wary of these claims, wisely asking to try before they buy. But predictable performance matters even more than a one-off real-world test, and predictability is particularly hard for storage systems to deliver.
Jingle Bells and Benchmarks
Have you ever listened to the lyrics of the classic holiday song, “Jingle Bells”? It’s a drag racing song from back in the horse-and-sled days. The subject, a young man, wipes out with his girl and gets laughed at by his rivals. But the narrator has some advice: “Just get a bobtailed bay/two forty as his speed/hitch him to an open sleigh/and crack! you’ll take the lead.” Get a faster horse and a lighter sled to impress the ladies!
Just like that young man, every storage company wants to take the performance lead over its rivals. And just like him, they’ll spare no expense to rig the contest. The easiest way to brag with benchmarks is to match the test to the particular quirks of your system, pick the best result, and report only that. And make sure the system is empty and features like deduplication are turned off. Who cares if “your mileage may vary”?
I’ve been guilty of cheering for top-speed benchmark results in the past, happy to report “million IOPS” claims and raw throughput numbers. But I’ve always tried to emphasize that these top-speed numbers show only one aspect of performance. Run a real workload against a system and you’ll usually get a far more realistic picture of how it will behave.
Predictable Performance
For the last few years, hybrid and all-flash arrays have been the bobtailed bays bedeviling the old nags of the storage industry. And some of them are pretty fast indeed, even with real-world applications running!
But these systems often have trouble maintaining top performance over time. Many slow down as capacity fills up, and background processes like garbage collection, rebalancing, and deduplication can cause serious intermittent performance hiccups. Then there’s the crunch caused by resource contention in scale-up and clustered systems. It’s really tough to know how a system will behave once it has been in service for a while!
The same issues affected disk-based systems, but those had less performance potential, so developers placed fewer conflicting demands on them. Most were designed to wring maximum IOPS out of the (slow) disks, so activities like rebalancing and deduplication were relegated to occasional or post-process status. The difference is that systems flush with flash increasingly run all of this inline and leave it on all the time.
That’s why I’m more impressed by systems that can guarantee a predictable quality of service than by those that boast maximum IOPS. Companies like SolidFire offer performance guarantees even as the system scales out, and this more than offsets their comparatively modest peak-performance claims. The reborn NexGen Storage is another company with serious quality-of-service credentials – it was core to their original design! And consistent performance is a big part of Pure Storage’s all-flash pitch as well.
Flash devices have issues with performance consistency, too. SSDs have their own inline data massaging techniques, and most have garbage collection or trim processes that can ruin performance intermittently. And once you get into the world of ultra-high performance, PCIe bus and chipset contention can come into play, according to Diablo Systems.
Stephen’s Stance
We can’t accept benchmarks at face value, but we must also be careful when constructing real-world performance tests. Fill up the array and let it run for a while, and you might see performance-sapping background processes interrupting your baseline performance plateau. Just another thing to look for in storage!
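To make that concrete, here is a minimal sketch of such a test as a parameter file for the vdbench load generator. Everything in it is a placeholder to adapt to your own environment: the /dev/sdb test LUN, the 70/30 random 4k mix, and the run durations are illustrative, not recommendations.

    * Hypothetical raw test LUN; point this at a device you can safely overwrite.
    sd=sd1,lun=/dev/sdb,openflags=o_direct,threads=16

    * Preconditioning: large sequential writes to fill and age the LUN.
    wd=prefill,sd=sd1,xfersize=1m,rdpct=0,seekpct=0

    * Steady-state workload: 70/30 random 4k mix.
    wd=steady,sd=sd1,xfersize=4k,rdpct=70,seekpct=100

    * Run the prefill long enough to overwrite the LUN at least once, then hold
    * the mixed workload for 12 hours (43,200 seconds), reporting every 60 seconds.
    rd=rd_prefill,wd=prefill,iorate=max,elapsed=7200,interval=60
    rd=rd_steady,wd=steady,iorate=max,elapsed=43200,interval=60

Run it with something like ./vdbench -f test.parm and watch whether the reported response times stay flat through that long second run or sag as garbage collection and friends kick in.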
Disclaimer: Just about every storage company has done business with me as part of Tech Field Day, and Foskett Services has been involved in custom projects with NexGen, Fusion-io, and Diablo. Plus SolidFire gave me great beer. None of this had any influence on this article, however.
Miroslav Klivansky says
Totally agree! To make testing AFAs simpler, I hacked together an automated testing toolkit around the vdbench load generator. It deploys as a command VM and worker VMs. There’s a preconditioning phase that fills the array and ages the LUN contents, a series of profiling runs to build latency curves at different IO sizes, and then a 12-hour steady-state test to see how the array performs under continuous load. It’s mostly automated once you deploy the VMs from the OVA and map the test LUNs as RDMs. Totally open source, so it’s easy to see there are no tricks. If you know vdbench, it’s also pretty easy to modify for additional tests.
For anybody who’s interested in learning more, check out the downloads and how-to info at: https://community.emc.com/docs/DOC-35014
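For readers who haven’t used vdbench, the profiling phase Miroslav describes can be sketched roughly like this; the device path, IO sizes, and durations below are hypothetical placeholders, not the toolkit’s actual settings.

    * Hypothetical test LUN; the actual toolkit maps its test LUNs as RDMs inside worker VMs.
    sd=sd1,lun=/dev/sdc,openflags=o_direct

    * 70/30 random mix; the transfer size is swept by the run definition below.
    wd=profile,sd=sd1,rdpct=70,seekpct=100

    * iorate=curve first runs at maximum rate, then at fixed fractions of that rate,
    * producing a response-time-versus-IOPS curve for each transfer size.
    rd=latency_curves,wd=profile,iorate=curve,forxfersize=(4k,8k,32k,64k),elapsed=600,interval=30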