July 30, 2014

The I/O Blender Part 3: Behold the Power of the Demultiplexer

The I/O Blender Series

Virtualization has disrupted the I/O path, reducing the value of enterprise storage arrays. But all is not lost: An effort is afoot to make things right by increasing communication between hypervisor and array and demultiplexing data before it is stored.

VMware vVol is a necessary step to promote virtual disks to "full citizen" status

“What we have here is a failure to communicate”

The core problem with the I/O blender is that virtual machine disk images aren’t communicated or managed at the same level as “real” LUNs. This isn’t a surprise: They didn’t exist when today’s dominant storage protocols and arrays were designed!

Block storage protocols just aren’t capable of passing sufficient information between the hypervisor and the storage array. In the old days, this was not a problem because the storage array could assume that any I/O on a given port, and any data stored on a given LUN, was from a single server. But this is no longer the case when the hypervisor maintains its own LUN presentation.
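To make the blending concrete, here is a toy sketch of what the array sees when three virtual machines each read sequentially from their own disk image on one shared LUN. The VM names, the extent layout in `VDISK_BASE`, and the `blend` helper are all hypothetical and exist only for illustration; no real hypervisor schedules I/O this simply.

```python
from itertools import zip_longest

# Hypothetical layout: each VM's virtual disk occupies one extent of a shared LUN.
VDISK_BASE = {"vm-a": 0, "vm-b": 1000, "vm-c": 2000}

def blend(streams):
    """Interleave per-VM sequential reads into the single stream the array sees.

    Each request reaches the array as (lun, lba) only. The VM name is dropped,
    mirroring a block protocol that carries no virtual disk identity.
    """
    blended = []
    for batch in zip_longest(*streams.values()):
        for vm, offset in zip(streams.keys(), batch):
            if offset is not None:
                blended.append(("lun0", VDISK_BASE[vm] + offset))
    return blended

# Every VM reads blocks 0, 1, 2 of its own disk: perfectly sequential per VM.
streams = {vm: list(range(3)) for vm in VDISK_BASE}
array_view = blend(streams)
```

In this sketch no two consecutive requests in `array_view` touch adjacent blocks, so a read-ahead cache keyed on the LUN alone has nothing to latch onto, even though each VM on its own is doing the friendliest possible I/O.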

VMware and the major enterprise storage players had great success augmenting this communication channel with VAAI. This API allows the hypervisor to instruct the storage array to lock, zero out, or make a copy of a set of data independent of the LUN. In practice, VAAI helps greatly to alleviate the pressure on storage arrays.

But the array still has insufficient information to cache storage access effectively. And VAAI integration does not make a virtual machine disk a “full citizen” when it comes to data management and manipulation in the array.

Emancipation of Virtual Machine Disks

VMware introduced a new idea at VMworld 2011 that would change everything about the I/O blender. Although it is not an official product, the so-called vVol concept would add a communication channel to allow storage arrays to demultiplex I/O. In the vVol world, virtual machine disks can be treated just the same as conventional LUNs.

Note: vVol doesn’t have an official name yet. Hopefully it won’t be “VMware API Program for I/O Demux (VAPID)” though!

Contrary to the hysteria of last summer, it does not appear that VMware is introducing its own totally new storage protocol. Instead, vVol would be an out-of-band communication channel somewhat like VAAI, while conventional I/O operations would continue to use a protocol like SCSI or NFS.

Can Your Array Handle vVol?

But this does not mean that any old storage array can support vVol. On the contrary, this promises to be a very tricky implementation indeed. The storage array must not only support the vVol API or command set but also have the horsepower to demultiplex I/O and reassemble virtual machine disk images. Then, it must have sufficient capability to manage these disk images effectively internally.

One can imagine two alternatives for implementing vVol inside a storage array:

  1. The array can demultiplex on ingestion, storing virtual machine disks on the backend as plain old LUNs or files
  2. Or the array can maintain a mapping table for multiplexed data, adding a new layer of abstraction

Each array model will likely lend itself to one or the other solution, and storage vendors will have to choose the right mechanism for their device. But neither choice is easy, and both will require additional memory and CPU resources as well as a great deal of programming effort.
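The two alternatives above can be sketched as toy Python classes. This is a rough illustration, not a description of any shipping array or of the vVol interface itself, which has not been published. The class names, the `vdisk` argument (standing in for whatever disk identity the out-of-band channel would supply), and the dictionary-based storage are all assumptions.

```python
from collections import defaultdict

class DemuxOnIngest:
    """Alternative 1: split writes by virtual disk as they arrive."""

    def __init__(self):
        # vdisk id -> {block number within that vdisk: data}
        self.backend = defaultdict(dict)

    def write(self, vdisk, block, data):
        self.backend[vdisk][block] = data

    def read(self, vdisk, block):
        return self.backend[vdisk][block]

    def snapshot(self, vdisk):
        # Each disk already lives on its own, so a copy is a single lookup.
        return dict(self.backend[vdisk])


class MappingTable:
    """Alternative 2: leave data multiplexed and track ownership in a table."""

    def __init__(self):
        self.shared = []  # multiplexed store, appended in arrival order
        self.table = {}   # (vdisk id, block number) -> index into self.shared

    def write(self, vdisk, block, data):
        self.table[(vdisk, block)] = len(self.shared)
        self.shared.append(data)

    def read(self, vdisk, block):
        return self.shared[self.table[(vdisk, block)]]

    def snapshot(self, vdisk):
        # Per-disk operations must walk the extra layer of abstraction.
        return {b: self.shared[i] for (v, b), i in self.table.items() if v == vdisk}
```

Both classes return the same answer for a per-disk read or snapshot; they differ in where the work happens. The first pays on every write to sort data by owner, while the second pays in table memory and lookups on every per-disk operation.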

It may appear that NFS does not need vVol, but this is not the case. vVol is about more than just demultiplexing block storage access: It is about elevating the virtual machine disk image to “full citizen” status. If NFS didn’t need vVol, the entire industry wouldn’t need vVol; they could just adopt NFS!

Stephen’s Stance

It is not clear whether VMware will introduce vVol officially as part of the (presumably) forthcoming vSphere 6, and it will likely be a year or more before we see more than a few production-ready arrays. But addressing the issue of the I/O blender is too important to be ignored, and VMware should be commended for tackling the issue head-on.

  • Dilip Naik

    Interesting. But I see two distinct trends here. One is a cluster of high-end servers, each hosting 10s (if not 100s) of VMs, connected to storage arrays that are “VM aware” and do things like offloaded data copy, zeroing, etc. And more… VMware doing this will help sell EMC arrays – I am not saying it’s bad for the IT datacenter …

    The other is the sub $1000 laptop now running at least 1 or 2 VMs – albeit more for geeks & for corps running XP mode VMs. But with phone vendors discussing running a hypervisor in the phone, can the days of VMs on laptops be far away?

  • Ed Lee

    Great post, Stephen! Looking at the recent storage features released or planned to be released by VMware, they clearly see this as a big problem as well.

    I enjoyed the post so much that I put together a follow on blog (shameless plug):
    http://www.tintri.com/blog/2012/06/virtualization-can-be-kryptonite-for-storage-admins/