I’ve long been critical of poorly-executed performance comparisons and the “fastest is always best” mentality behind them. Who really cares if a Honda minivan accelerates quicker than a Toyota when no real-world owner will ever keep the accelerator floored from stop to highway speed? The same goes for enterprise gear: Is an over-subscribed backplane really a problem when most network switches hum along at 20% load? But, although it sounds inconsistent, I still love reading the performance “comparos” in Car & Driver, and I am committed to the belief that the enterprise IT world needs lab tests and performance comparisons.
Is Maximum Performance Relevant?
My opinion on maximum performance might seem clear from the paragraph above, but it’s more nuanced than that. Although maximum performance is not singularly important, it is often an indicator for more relevant metrics. For example, a car that lags behind all others in absolute acceleration or top speed might be similarly unable to deliver satisfying performance driving around town. And, all other elements being equal, the quicker car may have better engineering.
Consider the now-infamous Tolly report comparing the HP c7000 blade system with Cisco’s UCS. Many complained that the test was unfair to Cisco, and I noted that it cherry-picked favorable results. Yet this report did start a discussion on HP’s blade products, oversubscription, Ethernet and FCoE versus Virtual Connect and Flex10, and the merits of blade personality. Although the performance test wasn’t the smack-down that HP seems to have wanted, HP would probably judge the report to be a success.
Microsoft recently demonstrated that their software iSCSI initiator (in combination with Intel’s Xeon 5500 and 10 GbE adapters) can achieve wire-speed throughput and one million IOPS. This was a particularly wise benchmark even though it neither demonstrated a real-world use case nor directly compared the performance of competing protocols. No, the report was newsworthy because it demonstrated a level of performance that defied conventional wisdom. The Microsoft/Intel iSCSI test was analogous to Nissan’s record-setting lap of the Nürburgring Nordschleife in their GT-R: It put the world on notice that they were a serious contender.
So maximum-performance tests can be useful for getting the world talking and challenging the status quo. They can also demonstrate innate technical superiority, though one has to investigate such claims fully to see whether they are being made fairly.
Although maximum performance should always be taken with a grain of salt, comparisons can take many other factors into consideration. The real value of performance comparisons comes when an attempt is made to model real-world usage. Holistic evaluation, taking both objective and subjective metrics into account, can help buyers separate the wheat from the chaff.
I very much respect the spirit behind “real world” benchmarks like SPC Benchmark-1. The creators attempted to reflect actual enterprise workloads for storage systems, including email servers, databases, and OLTP. Although many criticize the exact specifications or application of these tests, I applaud that they are rooted in what end users actually do with storage.
I was similarly impressed by the Data Center Infrastructure Group’s new Midrange Array Buyer’s Guide. Jerome Wendt and company assessed every storage system across a slice of the market and laid out the facts in an easy-to-understand format. My initial examination of the results was reassuring: The Guide passed my “sniff test”, with systems I know to be good near the top. One can argue the merits of each system’s placement, but I am certain that end users will be able to use this document to create “short lists” of solid products to evaluate.
Then there are people like Howard Marks at DeepStorage, Dennis Martin at Demartek, and the folks at ESG Labs. Each is doing a yeoman’s job of trying to generate real-world use cases and comparisons. This is what benchmarking and testing should be all about: Helping people make sense of the confusing array of products on the market. I applaud them for turning hands-on time into suggestions for improvement, guides for usage, and fodder for comparison. No one will run out and buy “that specific device” based on a benchmark, but they might open their eyes and consider “these few.” That sounds like a win to me.