I am spending a few weeks examining the truths and fictions that bind our industry together. Let’s start with one of my favorite old canards: That enterprise storage must be overpriced because bare disk drives are so cheap.
I have seen this straw man argument set up by so many throughout my career that it has become laughably predictable. Every time a new high or low point is set for enterprise storage cost, someone is there pointing out that a bunch of disk drives is vastly cheaper. Let’s call this the Dumb Disk Fallacy: Only a fool would claim that dumb disks are comparable to enterprise storage.
Are we supposed to be surprised that raw materials make up such a small percent of the cost of an integrated system? One can prepare a delicious meal at home using supermarket-bought ingredients for a quarter the cost of a restaurant outing. A single glass of wine at a bar often costs as much as a whole bottle at the store, yet the restaurant industry is holding on even through a recession.
Lack of expertise is only one factor: Even one who cannot whip up a soufflé can certainly boil some spaghetti, warm up some sauce, and pour a glass of Chianti! Convenience is another driver, since many lack the time required to shop and cook. But there are other benefits as well: Dining out is a social activity and restaurant meals allow us to sample unfamiliar cuisine. Clearly, there are many reasons for one to pay far more for a finished product than for the ingredients it contains.
The same is true of enterprise storage. Although they make up a significant proportion of the bulk of a given storage array, hard disk drives are often a small element in the overall cost. One does not price a salad by its lettuce or a lasagne by its pasta, after all. Modern enterprise storage systems are defined by their capabilities, ranging from performance to reliability to flexibility. Indeed, core components like disk drives and processors have long since been commoditized, and many vendors are leveraging much larger-scale commodity subsystems these days. The day will soon come when major vendors will differentiate entire product lines based solely on the software they run.
Now, it is possible to build very cheap storage systems, and the price of these can even approach the raw disk cost. But finished storage systems will never be as cheap as just a bunch of disks (JBOD, for newbies), because of the following:
- Disk arrays include lots of hardware components beyond disk drives – chassis, power supplies, controllers (complete with CPUs, RAM, etc.), and cabling and connectors
- Data must be protected using extra disk capacity for parity, mirrors, snapshots, and spares
- Arrays include software to orchestrate the whole operation; even free software requires development, testing, and integration
- Support and maintenance contracts are required for any production system
- Supporting software is often a requirement, too, for configuration, operation, management, and integration with servers and applications
- The companies selling the array deserve a bit of profit or they’ll go out of business and your support contract won’t be worth much
- Don’t forget utilization: A half-empty disk costs twice as much as a full one, and it is awfully hard to make use of 100% of available capacity without a storage network of some kind
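The arithmetic behind the last few points can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, its parameters, and the example figures are all hypothetical, and the simple multiplicative model is mine rather than any vendor's pricing method. It shows how protection overhead, utilization, and non-disk costs push the price of a stored gigabyte above the price of a bare one.

```python
def cost_per_stored_gb(raw_cost_per_gb, protection_overhead=0.0,
                       utilization=1.0, system_multiplier=1.0):
    """Rough cost of each GB of data actually stored (hypothetical model).

    raw_cost_per_gb     -- price of bare disk capacity per GB
    protection_overhead -- fraction of raw capacity spent on parity,
                           mirrors, snapshots, and spares (0 <= x < 1)
    utilization         -- fraction of usable capacity holding data (0 < x <= 1)
    system_multiplier   -- whole-system price divided by the price of its
                           disks (chassis, controllers, software, support)
    """
    if not 0 <= protection_overhead < 1:
        raise ValueError("protection_overhead must be in [0, 1)")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if system_multiplier < 1:
        raise ValueError("system_multiplier must be at least 1")

    # Fraction of every raw GB purchased that ends up holding real data.
    stored_fraction = (1 - protection_overhead) * utilization
    return raw_cost_per_gb * system_multiplier / stored_fraction


# A half-empty disk costs twice as much per stored GB as a full one.
half_empty = cost_per_stored_gb(1.0, utilization=0.5)  # 2.0

# Made-up figures: mirrored disks, half utilized, in a system that costs
# three times as much as the drives inside it.
mirrored = cost_per_stored_gb(1.0, protection_overhead=0.5,
                              utilization=0.5, system_multiplier=3.0)
```

The factors compound rather than add, which is why even modest overheads in each category move a finished system well away from the bare-disk price.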
So how close to “dumb disk” can a production storage system get? Google’s is probably the closest: They use custom server boards, custom chassis, custom power supplies, custom racks, and low-end disk drives to keep hardware costs at a minimum; the entire thing is composed of commodity components, too; the array software is all written and supported in-house for very low overhead; the capacity is very highly utilized due to the inherent flexibility of the applications they support. So Google has attacked every one of these areas in an effort to drive out costs. But would it really be practical for folks other than Google to invent, construct, and run their own storage system?
Instead of dreaming about building our own storage solution, let’s look at the spectrum of storage available to the modern enterprise IT shop:
- Bare disk drives are widely available with a handful of companies producing just a few different models. The idiosyncrasies of these are interesting to some, but a hard disk drive is useless alone
- Direct-attached storage offerings range from single-drive USB-connected external disks to multi-drive racks, yet these too are useless without a server to drive them. These normally lack any sort of intelligence or advanced features and quite a bit of effort is required to ensure acceptable levels of performance or reliability. But so-called JBOD or DAS storage can be incredibly cheap to purchase.
- True storage arrays range from the most basic home NAS boxes to the most advanced enterprise systems. All are purchased primarily for their features, with price tags merely differentiating between competitive offerings. For this reason, storage developers focus the vast majority of their engineering, sales, and marketing efforts on advanced capabilities rather than per-GB cost.
- Storage as a service is a different realm entirely. Although advances in manageability and utilization and commoditization of hardware allow today’s cloud storage offerings to be sold at very attractive per-GB price points, low cost is merely a welcome side benefit. Engineering and support resources focus almost entirely on enhancing user experience, while hardware is a small component of the cost of delivering enterprise-class storage as a managed service.
I’m not an apologist for overpriced enterprise storage, but I recognize that arrays are more than just a collection of disks. I look forward to the day when we finally dispense with the dumb disk fallacy and focus instead on the real value added by enterprise storage innovation. But the realist in me knows that this straw man will continue raising his head for some time to come. I am sure that developments like EMC’s unified CLARiiON and Symmetrix hardware platform, the spread of software appliances on commodity server hardware, and excellent free storage software like Nexenta and FreeNAS will only add fuel to the fire. Therefore I entreat you, dear reader: Do not succumb and pitch disk that is dumb!
Interesting article, and you make some good points. I don't disagree with what you're saying, but your analogies fall down – the markup in restaurants and on wines and spirits in bars is absolutely HUGE 🙂
Hahaha yeah this is so true. I had a $19 Martini the other day. While it was indeed delicious, it probably contained just $2 of gin, vermouth, and lime. Oops I just gave out my secret Martini recipe!
Anyway, why would I spend $19 for $2 worth of booze? Because the total package was worth it: I was with my friends after all. That’s why we buy EMC, NetApp, and even Apple, isn’t it? The total package must be worth it or folks wouldn’t be buying…
Stephen, yes, you speak a truth. Great post, as usual.
But I suggest your argument could be interpreted as a bit of a straw man.
I haven't heard anyone who understands anything about the topic suggest that enterprise storage should cost the same as raw disk drives. But that isn't the argument that the Nexentas of the world make.
Your analogy to restaurant prices is somewhat apropos. But a better analogy is to an adjacent market, servers. The question people ask, which I don't think can be glibly answered, is why markups from raw material cost are so much higher in the storage industry than they are in servers. I don't think there is any technology justification for this anomaly. I believe a forensic economist would conclude the structure of the industry, not the nature of the technology, fully explains this stark difference between the storage and server industries.
Another problem solved by a storage array is that, when well implemented, it can dramatically reduce the amount of effort needed to manage the storage environment. One of the most compelling arguments for this can be found in an interesting (though little-known) pair of papers written by John Tyrrell, the first of which is here: http://media.netapp.com/documents/Ditch_the_LUN_part_1.pdf. In it he states:
“The hundreds of studies done by the IBM Corporation in the 1980s showed that there was a one-to-one correspondence between the number of islands of storage to manage and the number of space failures, performance bottlenecks, job restarts/reruns, and the number of people to manage the storage.”
Well-managed storage array implementations can significantly reduce the number of management tasks and provide leverage points for policy automation tools. Given the intense focus on operational expenditure that Cloud services bring, the value of an enterprise storage framework (including the arrays, software, and services) that drives down overall expenditure should not be underestimated.
Great post Stephen! There’s a corollary at the component level: “The premium for an enterprise disk drive is more than the sum of its hardware differences.”
Have you ever looked at what % disk drives – or other hardware components – make up of a storage system’s cost? I wonder if it has changed over time, or if it’s one of those fixed constants.
Might be different for each of your above classes of storage.