It seems like every other startup I talk to is pitching VMware-integrated caching, and a spate of acquisitions and announcements from flash companies is legitimizing the idea. Even VMware has gotten in on the game with vFlash Read Cache (vFRC). But integrating caching with VMware vSphere isn’t nearly as easy or effective as everyone is making it sound!
A Whole Datacenter Of Performance
No one can argue against the performance of a PCIe flash card in a VMware host. A typical PCIe card offers a whole datacenter’s worth of I/O operations per second (IOPS, in storage parlance) at almost no latency. This is not an exaggeration: every server in a mid-sized datacenter, added together, needs only a few hundred thousand IOPS, a workload that nearly any PCIe storage device today can handle with ease. Even the lowly SATA SSDs in my lab machines can sustain 30k-40k IOPS without really working hard!
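To put rough numbers behind that claim, here’s a quick back-of-the-envelope sketch in Python. The server count and per-server demand are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope: aggregate IOPS demand of a mid-sized datacenter
# versus what single flash devices deliver. All figures are assumptions.

servers = 200                # assumed mid-sized datacenter
iops_per_server = 1_500      # assumed average per-server I/O demand

datacenter_demand = servers * iops_per_server
pcie_flash_card = 500_000    # typical spec-sheet figure for a PCIe card
sata_ssd = 35_000            # roughly what my lab SATA SSDs sustain

print(f"Datacenter demand: {datacenter_demand:>9,} IOPS")  # 300,000
print(f"One PCIe card:     {pcie_flash_card:>9,} IOPS")
print(f"One SATA SSD:      {sata_ssd:>9,} IOPS")
```

One card covers the whole building’s demand with room to spare, which is exactly why the hard part is putting that performance to use rather than generating it.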
Flash memory gives us plenty of storage performance; the challenge is how to put this to use in a VMware environment. The easiest and most widespread approach to date is simply to add flash to existing shared storage arrays. Whether swapping SSDs for hard disk drives or truly integrating solid-state storage into the array controller, hybrid and all-flash storage is here to stay. I can’t see anyone ever designing another all-disk storage array for general-purpose use.
But conventional storage protocols and networks sap the performance of solid-state storage, adding an order of magnitude of latency and reducing IOPS commensurately. Thanks to the latency added by Fibre Channel or Ethernet, even the best flash storage arrays struggle to deliver a few hundred thousand IOPS, yet this is child’s play for PCIe cards.
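Little’s Law makes the relationship concrete: sustainable IOPS is roughly the number of outstanding I/Os divided by per-I/O latency. A minimal sketch, with the queue depth and latency figures assumed for illustration:

```python
# Little's Law: achievable IOPS ≈ outstanding I/Os / per-I/O latency.
# Adding fabric latency on top of flash's native latency cuts
# throughput proportionally. All figures are assumptions.

queue_depth = 32             # assumed outstanding I/Os per device

def iops(latency_seconds):
    return queue_depth / latency_seconds

local_flash = 100e-6         # ~100 µs for local PCIe flash (assumed)
fabric_overhead = 900e-6     # ~0.9 ms added by FC/Ethernet plus the
                             # array controller (assumed)

print(f"Local PCIe flash: {iops(local_flash):>10,.0f} IOPS")  # 320,000
print(f"Across the SAN:   {iops(local_flash + fabric_overhead):>10,.0f} IOPS")  # 32,000
```

Ten times the latency means one-tenth the IOPS at the same queue depth, which is the order-of-magnitude gap between array-attached and server-attached flash.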
Some flash vendors floated simplistic solutions: Use a bare PCIe card to store everything! This works from a performance standpoint but is a non-starter for most customers. Non-shared storage breaks nearly every useful feature of vSphere, from HA to vMotion. Administrators are right to worry about reliability and availability, since a server failure takes down everything. And non-shared data isn’t mirrored, breaking a cardinal rule of enterprise storage.
VMware Guest Caching Challenges
The easy answer is to use software to dynamically leverage in-server PCIe flash, but this has proven fiendishly difficult to get right. One of the first companies out of the gate with VMware caching software was ioTurbine, now part of PCIe card monster Fusion-io. Their solution relied on a driver in the VMkernel, something VMware is loath to allow for fear of instability and bloat. Another kernel driver-based solution came from FlashSoft, which is now seeing massive uptake as part of flash giant SanDisk.
The next VMware caching solutions brought to market had to work without a VMkernel-level driver. Most have resorted to drivers in guest virtual machines, either installing one in every guest or using a virtual storage appliance to present storage back to VMware.
Guest drivers function reasonably well but bring all sorts of undesirable baggage:
- Latency as data travels up and down the virtual I/O stack (see the sketch after this list)
- Installation and management headaches
- Concerns about guest OS compatibility
- Contention for resources that could otherwise be used to run productive virtual machines
For many environments, some combination of these concerns makes guest drivers a non-starter.
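That first item, stack latency, is easy to underestimate: every microsecond the guest I/O stack adds to a cache hit is paid on the fast path, eroding exactly the benefit the cache was bought to provide. A rough model, with all latency figures assumed for illustration:

```python
# Rough model of guest-driver overhead on a read cache.
# effective = hit_rate * (flash + stack) + (1 - hit_rate) * array
# All latency figures are assumptions for illustration.

FLASH_US = 100    # assumed local flash read latency (µs)
ARRAY_US = 5_000  # assumed backend array read latency (µs)

def effective_latency_us(hit_rate, stack_overhead_us):
    hit = FLASH_US + stack_overhead_us  # a cache hit pays the stack toll
    miss = ARRAY_US                     # a miss goes to the array anyway
    return hit_rate * hit + (1 - hit_rate) * miss

for overhead in (0, 200, 500):          # added µs per I/O in the guest
    avg = effective_latency_us(0.9, overhead)
    print(f"stack overhead {overhead:>3} µs -> {avg:>5,.0f} µs average read")
```

At a 90% hit rate, a few hundred microseconds of per-I/O overhead nearly doubles the average read latency the cache delivers.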
More Issues With Flash Caching
The same reliability and availability concerns that put the kibosh on single-server PCIe storage also affect smart PCIe caches. After all, if data is written to a single card, even just for caching, it could be lost in the event of a failure. No number of supercapacitors or batteries protects against a flash or firmware failure.
For this reason, most VM caching solutions are read-only. By writing data through to a conventional storage array before acknowledging it (“write-through” in industry-speak), administrators can feel confident that the system will still function in the event of a failure. This also allows VM mobility for simple caches, since there’s no worry about inconsistent data between servers. Implementing a true “write-back” cache requires replicating data between cards and servers, a non-trivial task!
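A minimal sketch of the distinction, with hypothetical class and method names rather than any vendor’s actual API:

```python
# Minimal sketch of write-through vs. write-back caching. The names
# are hypothetical; real products implement this below the hypervisor
# or in a guest driver, not as Python dictionaries.

class WriteThroughCache:
    """Writes hit the array before being acknowledged, so the cache
    only accelerates reads and losing the card never loses data."""
    def __init__(self, array):
        self.array, self.cache = array, {}

    def write(self, block, data):
        self.array[block] = data     # persist to shared storage first
        self.cache[block] = data     # then keep a copy for fast reads

    def read(self, block):
        if block not in self.cache:  # miss: fetch from the array
            self.cache[block] = self.array[block]
        return self.cache[block]


class WriteBackCache(WriteThroughCache):
    """Writes are acknowledged once cached; dirty blocks destage later.
    Until flush() runs, the card holds the only copy of dirty data."""
    def __init__(self, array):
        super().__init__(array)
        self.dirty = set()

    def write(self, block, data):
        self.cache[block] = data     # acknowledge immediately (fast!)
        self.dirty.add(block)        # the array is now stale here

    def flush(self):
        for block in self.dirty:     # destage dirty blocks to the array
            self.array[block] = self.cache[block]
        self.dirty.clear()
```

The gap between those two write() methods is the whole problem: write-back is dramatically faster, but until dirty blocks are destaged (or replicated to another host), a single card failure loses committed writes.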
There are more issues with VMware-integrated caching besides those I’ve mentioned. Many solutions (including VMware’s new vFRC) statically partition flash resources, undermining the inherently dynamic nature of virtual infrastructure. There are also concerns about data alignment and write amplification, which reduce both the performance and the longevity of flash. And the overhead of all that I/O will always contend with guest virtual machines for valuable CPU and I/O resources.
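The alignment concern is simple arithmetic: a write that straddles flash page boundaries forces the device to rewrite every page it touches. A sketch, assuming 4 KiB device pages (actual page sizes vary by device):

```python
# Why alignment matters on flash: a write that straddles page
# boundaries rewrites every page it touches. 4 KiB pages and a
# 512-byte misalignment are assumptions for illustration.

PAGE = 4096  # assumed flash page size in bytes

def pages_touched(offset, length):
    first = offset // PAGE
    last = (offset + length - 1) // PAGE
    return last - first + 1

aligned = pages_touched(0, 4096)       # starts on a page boundary
misaligned = pages_touched(512, 4096)  # same write, shifted 512 bytes

print(f"Aligned 4 KiB write rewrites    {aligned} page(s)")    # 1
print(f"Misaligned 4 KiB write rewrites {misaligned} page(s)")  # 2
```

Doubling the pages rewritten per guest write doubles wear on the flash and roughly halves effective write throughput, purely from a 512-byte offset.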
Stephen’s Stance
Integrating solid state storage as a VMware cache isn’t a trivial task. In fact, it’s become the core challenge for some of the best minds in storage, and few real answers have yet emerged. This will be a primary area of focus for me and others who watch and comment on virtualization and enterprise storage!
Many companies are attacking these issues, and I’ve been pleased to have in-depth conversations with many at my Tech Field Day events. For example, PernixData wowed the VMware world when they unveiled their flash virtualization platform at Storage Field Day 3. Infinio’s NFS Accelerator (introduced at Tech Field Day 9) is a virtual appliance that uses host RAM as a distributed cache for NFS access. Proximal Data will present at Storage Field Day 4 in November. Marvell’s DragonFly moves the cache to the HBA, as they showed at SFD3. And SanDisk’s FlashSoft continues to impress, as seen at their SFD3 presentation.
I welcome companies in this space to leave a brief summary of their VMware caching solutions in the comments below!
Keith Mayer says
Great article, Stephen, on the current considerations and challenges of leveraging PCIe as a direct storage solution for enterprise virtualization platforms. The points you’ve noted in this article have been echoed by several of the organizations I work with, particularly around HA configurations and the complexity of replicated write-back caching. For these reasons, I’ve been seeing organizations move toward evaluating the 12Gbps SAS HBAs and SSDs that are just starting to come on the scene, with some really impressive results (not quite as good as PCIe, of course, but still really good).
Have you had an opportunity to consider 12Gbps SAS as a practical interim solution for enterprise storage while the industry continues to work on evolving enterprise support around PCIe storage? Interested to hear your thoughts …
Dan Pancamo says
What about VMware’s new vFRC? Sure looks like a huge winner over all these third-party solutions.
Daryl Saunders says
Dell Compellent FluidCache for SAN addresses these concerns and is now released for sale for the VMware platform. It uses a virtual appliance with PCI pass-through to control the PCIe SSDs, plus a network controller connected to a super-low-latency 10Gb or 40Gb network that supports RoCE, giving it the ability to provide full read AND write caching with full data integrity.
Read up on it, this is going to be a fun ride.