Although today’s enterprise storage arrays target “the fat middle” of the market, virtualized and distributed storage change everything. What form does enterprise storage take in the new data center? Think top-of-rack flash, bottom-of-rack capacity, and a whole world of new interconnects!
Here’s the Rack Endgame series:
Two Kinds of Storage
Enterprise storage has two jobs to do: long-term capacity and short-term retention. In other words, applications don’t treat all primary storage resources the same way, but we’ve historically used a single solution for both performance and capacity. As I discussed in “The Fat Middle: Today’s Enterprise Storage Array”, this was necessary because we lacked any mechanism to identify and move storage in the heterogeneous data centers of yesterday.
But things are changing. As I talked about in “Virtualized and Distributed Storage: This Time For Sure!”, virtualization and distribution are coming to the enterprise in the form of VMware VSAN, converged solutions from companies like Nutanix, and caching solutions like PernixData. And, as Dave McCrory points out with his “data physics”, storage has its own gravity that bends the shape of both physical infrastructure and application architectures.
The endgame of these shifts is the development of two flavors of enterprise storage, each with its own unique characteristics:
- Capacity needs will be met by new storage systems optimized for scalability above all else. Take a look at X-IO, SwiftStack, and Exablox. Notice any similarity in the job they do? They’re all about capacity, ease of use, and scale.
- Performance needs will be served by specialist flash on high-performance busses. Imagine a “top of rack” flash shelf like EMC DSSD or even in-server memory channel storage from SanDisk/Diablo. That’s a new world of performance!
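To make the split concrete, here’s a minimal sketch in Python of how a software layer might steer volumes between the two flavors. Everything in it is hypothetical: the `Tier` names, the `Volume` fields, and the 1,000-IOPS cutoff are assumptions I’m making for illustration, not anything these vendors actually ship.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PERFORMANCE = "top-of-rack flash"
    CAPACITY = "bottom-of-rack capacity"


@dataclass
class Volume:
    name: str
    size_gb: int
    iops_needed: int  # hypothetical per-volume performance demand


# Assumed cutoff for illustration only; a real policy would be tuned per workload.
HOT_IOPS_THRESHOLD = 1000


def place(volume: Volume) -> Tier:
    """Send performance-hungry volumes to flash and everything else to capacity."""
    if volume.iops_needed >= HOT_IOPS_THRESHOLD:
        return Tier.PERFORMANCE
    return Tier.CAPACITY


def plan(volumes):
    """Group volume names by the tier they would land on."""
    layout = {tier: [] for tier in Tier}
    for volume in volumes:
        layout[place(volume)].append(volume.name)
    return layout
```

The point of the sketch is that placement becomes a software decision made per volume, which is exactly what the “fat middle” array never let us do.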
Future enterprise data center architects will be able to realize dramatic reductions in cost for storage capacity as well as unheard-of performance by focusing on the edges rather than the “fat middle” of the market. And virtual server architecture will make this possible for the first time.
Top of Rack, Bottom of Rack
Walk into a data center in five years and things might look awfully different. Rather than a big, mysterious storage array at the physical and virtual center of the data center, you’ll see uniform racks of servers, each with its own storage. Sure, distributed storage could live within the server itself. But I think there’s another, more likely setup.
If servers are generic components to run software, why should storage be any different? Why not optimize servers for compute and memory and keep storage outside? Although it’s possible to pack disk drives inside converged servers, I’m not sure this is such a great idea. There’s no reason to keep low-performance capacity this close to the CPU, and disks waste space, power, and cooling that could be better used for CPUs and memory.
So why not stick to an external approach for storage by moving capacity to a “bottom-of-rack” pool? We could use inexpensive SAS or Ethernet to connect each server in a rack without any impact on performance. We could even start using simpler object storage rather than sticking with the outdated SCSI “fake disk” approach. But the protocol really doesn’t matter since we’re immediately abstracting everything in software anyway. It could even be FC, but that seems a bit of a waste of good tech with no SAN!
Then there’s performance storage. Advancements in PCI Express technology mean we can place high-throughput/low-latency flash memory just outside the server using an external PCIe bus. Keeping PCIe flash outside the server allows us to maintain something of a plug-and-play architecture for servers, since they could be swapped out without dramatically impacting the rest of the distributed storage network. But data would be “close” enough to offer orders-of-magnitude better performance.
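Here’s a rough Python sketch of why keeping both tiers outside the server helps with that plug-and-play idea. It’s a toy model under my own assumptions: the `Rack` class, its field names, and the `swap_server` method are made up for illustration and don’t correspond to any real product or API.

```python
from dataclasses import dataclass, field


@dataclass
class Rack:
    # Hypothetical model: each shelf maps a volume name to the server mounting it.
    flash_shelf: dict = field(default_factory=dict)    # top of rack, external PCIe
    capacity_pool: dict = field(default_factory=dict)  # bottom of rack, SAS or Ethernet
    servers: set = field(default_factory=set)

    def attach(self, server, volume, hot=False):
        """Mount a volume from the flash shelf (hot) or the capacity pool."""
        self.servers.add(server)
        shelf = self.flash_shelf if hot else self.capacity_pool
        shelf[volume] = server

    def swap_server(self, old, new):
        """Replace a server; volumes stay on their shelves and are re-pointed."""
        self.servers.discard(old)
        self.servers.add(new)
        for shelf in (self.flash_shelf, self.capacity_pool):
            for volume, owner in shelf.items():
                if owner == old:
                    shelf[volume] = new  # no data moves, only the mapping changes
```

Swapping a server touches only the mapping, never the data, which is the whole appeal of storage that sits just outside the box.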
I don’t hate the idea of keeping storage in-server, but this rack-oriented approach is a reasonable alternative. Sure, memory channel flash would be faster, but it could exist inside the top-of-rack array, right? And Nutanix is making hay with internal disk, but I doubt they’d be too upset to move disks just outside the box. It’s all good.
Stephen’s Stance
Top-of-rack flash and bottom-of-rack disk make a ton of sense in a world of virtualized, distributed storage. This approach fits with enterprise paradigms yet delivers real architectural change that could “move the needle” in a way that no centralized shared storage system ever will. SAN and NAS aren’t going away immediately, but this new storage architecture will be an attractive next-generation direction!
Note: These are topics I discuss in my public speaking engagements, including my Truth in Storage seminar with Truth in IT. Check out the schedule for Truth in Storage and come listen to the whole story!
Disclaimer: I work with Diablo Technologies, X-IO, SanDisk, PernixData, Nutanix, Exablox, EMC, and most other companies in enterprise storage with Foskett Services and Tech Field Day. I don’t think I’m biased, but you can draw your own conclusions.
Mario Lenz says
The “disaggregation” concept Intel presented at the OCP summit 2013(?) is heading in a similar direction. However, I think they’re approaching this from the opposite direction. You’re suggesting bringing storage closer to servers; they’re suggesting getting rid of local storage. But you both come to similar solutions imho.
Noam Shendar says
Stephen, your last three posts about the end of traditional SAN/NAS have been thought-provoking. I have been pondering them and would like to offer a few points of support and elaboration. We agree about the pitfalls of convergence and, in particular, that shared storage is needed to keep servers stateless (and hence easily replaceable).
Now, what can slow down the realization of your vision is lack of backward compatibility (i.e., the need for new software to take advantage of the benefits you describe). This is why we have found a new use for virtualization. We created a single managed system under which there are many custom-configured (and elastic) virtual arrays, each optimized for capacity, performance, or any point in between. We call these VPSAs (Virtual Private Storage Arrays), where each VPSA is a completely isolated and self-contained SAN/NAS device, configured and modified in real time. This means one still gets the benefits of shared storage, but no longer of the one-size-fits-all variety.
And we recently announced an iSER interconnect option (in addition to standard Ethernet). I mention this because iSER is cost-effective and can achieve InfiniBand-like performance for the top-of-rack portion of the vision you describe.