October 25, 2014

Virtual Machine Mobility: Of What, and to Where and in What State?

Stepping out of a subway car is an entirely different matter when it's moving!

Mobility of virtual machines is a sticky wicket. As open systems infrastructure becomes increasingly virtualized, administrators and managers wish to use the technology to balance workload, ease migration, and provide better availability. Although technology is improving, actually moving virtual machines is not always a piece of cake. Let’s lay down a baseline of information so we may begin a discussion on the true nature of virtual machine mobility.

Mobility of What?

Let us consider first the question of what exactly is being moved. Systems administrators often focus on “the machine”, which encompasses the operating system and configured state of the virtual machine itself. But the true “mass” of the system is its stored data. Hypervisor vendors have come up with different techniques of moving these two essential elements, reflecting the unique characteristics of each.

  • The virtual machine is an instance of an operating system along with its state and configuration. Mobility of virtual machines requires all of this to be preserved, along with any I/O channels. Live migration of a virtual machine requires that any active network sessions be maintained, along with RAM content, registers, buffers, and many other elements.
  • The virtual machine image (commonly referred to as “storage”) is the static content addressed by a virtual machine. Typically a VMDK or similar virtual disk image, it must be accessible to the virtual machine at all times. Live migration of a virtual machine image is tricky, but perhaps not quite as complex as live migration of a running operating system.

VMware, Microsoft, and others recognize these two distinct elements to be migrated, and have come up with a variety of complementary technologies for each:

  • vMotion is VMware’s virtual machine migration solution, and has continually evolved with each iteration of the hypervisor. DRS leverages vMotion to automate mobility. VMware has also created Storage vMotion and Storage DRS as complements to handle mobility of virtual machine images.
  • Microsoft Hyper-V Live Migration is conceptually similar to vMotion, though newer and less full-featured. With Hyper-V 3.0, Microsoft will introduce Storage Live Migration as a complementary technology akin to Storage vMotion. Most other virtual machine managers also support some form of live migration, though live migration of storage is less common.

Mobility in What State?

One of the key benefits of virtual machine technology is the ability to “run anywhere” on dissimilar hardware. From the very beginning, hypervisors have provided the ability to create a universal virtual machine image that would run on a variety of supported platforms.

This leads to one of the key values of server virtualization in the data center: disaster recovery. The ability to take a virtual machine image and system state and bring it online after a disaster is a true revolution for open systems IT. The benefits of this single use of server virtualization technology easily justify the investment for many businesses.

But this sort of “cold” migration seems passé when compared to the live or “hot” migration possible with technologies like VMware vMotion. Live migration is much more difficult, since active client sessions must be preserved and activity must not be greatly interrupted.

This is the second great question that must be asked when considering virtual machine mobility: In what state will the virtual machine be moved? Will it be a cold, powered down image of the system? A suspended or paused operating system image? Or a full, running machine?

Mobility to Where?

Once we have decided whether we are discussing virtual machine migration or movement of storage resources, we must consider the scope of the movement. The ability to move a virtual machine from one member of the cluster to another has now become fairly common. But what about systems that are not related in a cluster? Or that are spread over great distances?

  • The nice thing about clusters is that they share resources before and after a virtual machine is moved. It is practical to move the running virtual machine, its storage, or both independently and to expect that performance will not dramatically suffer as a result. The cluster can also preserve network connections, and even I/O state, without much impact on clients or other external elements.
  • It is a bit more difficult to move systems within a data center, since one must maintain the I/O connections that might be interrupted. It is fairly trivial to configure an IP network and storage array to allow multiple machines to access the same iSCSI or NFS storage resources. It is a little more difficult to configure Fibre Channel (and, by extension, FCoE) SANs to handle this sort of dynamic movement, but it is not impossible. Although moving a running machine from one network port to another could cause client access to be interrupted, technologies like VXLAN allow these sessions to continue, and improved network switching technology should reduce performance impact.
  • Moving the machine to a different data center is another matter entirely. Stretching a layer-2 Ethernet LAN or Fibre Channel SAN across a metro or greater distance, while possible, will always be problematic. IP routing is flexible, but it takes time for changes to propagate when live machines are moved. And it is difficult to keep storage in sync over long distances due to the amount of time it takes for information to transit. Again, all of these challenges are being addressed in various ways, but they’re still hard!

“Shared-everything” clusters handle most of the mess of virtual machine mobility, regardless of storage protocols and the like. But not every virtual machine is in a cluster, even in the same data center. And not every movement is even within the same data center. So we still have work to do.
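
To put a rough number on the point about keeping storage in sync over distance, here is a back-of-the-envelope sketch in Python. It assumes light travels through optical fiber at roughly 200,000 km per second (about two-thirds of its speed in a vacuum), and it ignores switching, queuing, and array processing time, so real links will be slower. The distances and function names are my own illustrative choices, not figures from any vendor.

```python
# Assumption: signals in optical fiber cover roughly 200 km per millisecond
# (about 200,000 km/s). Real paths are longer than the straight-line distance.
FIBER_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber, ignoring all equipment delay."""
    return 2 * distance_km / FIBER_KM_PER_MS


def max_serialized_sync_writes_per_s(distance_km: float) -> float:
    """Upper bound on writes per second when each write must be acknowledged
    by the remote site before the next one is issued."""
    return 1000.0 / min_round_trip_ms(distance_km)


if __name__ == "__main__":
    # Illustrative distances only (hypothetical sites).
    for label, km in [("metro", 50), ("regional", 500), ("cross-country", 4000)]:
        rtt = min_round_trip_ms(km)
        rate = max_serialized_sync_writes_per_s(km)
        print(f"{label:>13}: {km:>5} km, at least {rtt:.1f} ms round trip, "
              f"at most {rate:.0f} serialized synchronous writes/s")
```

Real arrays keep many writes in flight at once, so aggregate throughput can be far higher than this serialized bound. The per-write delay, however, cannot drop below the round-trip time, which is why synchronous mirroring gets painful as the sites move apart.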

Stephen’s Stance

Moving cold virtual machine images from system to system, or even across great distances, is one of the main selling points of server virtualization. But it becomes much more difficult to manage movement of virtual machines that are still running, especially outside a cluster or across WAN links. When talking about virtual machine mobility, it is important to consider what is being moved, the state it is in, and where it is going.
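
One way to keep those three questions straight is to write them down as a small data structure. The Python sketch below is only an illustration: the type names and the additive "difficulty" score are hypothetical, and the score simply encodes the rough ordering argued in this post (storage image before running machine, cold before live, cluster before data center before WAN). It is not a measurement of anything.

```python
from dataclasses import dataclass
from enum import IntEnum


class What(IntEnum):
    STORAGE_IMAGE = 1      # the VMDK or similar virtual disk image
    VIRTUAL_MACHINE = 2    # the operating system instance and its state


class State(IntEnum):
    COLD = 1               # powered-down image
    SUSPENDED = 2          # paused operating system image
    LIVE = 3               # full, running machine


class Scope(IntEnum):
    WITHIN_CLUSTER = 1
    WITHIN_DATA_CENTER = 2
    BETWEEN_DATA_CENTERS = 3


@dataclass(frozen=True)
class Move:
    what: What
    state: State
    scope: Scope

    def difficulty(self) -> int:
        # Hypothetical score: the sum of the three ordinal ranks.
        # Useful only for comparing moves against each other.
        return int(self.what) + int(self.state) + int(self.scope)


if __name__ == "__main__":
    moves = {
        "cold image restored at a DR site": Move(
            What.STORAGE_IMAGE, State.COLD, Scope.BETWEEN_DATA_CENTERS),
        "live VM moved within a cluster": Move(
            What.VIRTUAL_MACHINE, State.LIVE, Scope.WITHIN_CLUSTER),
        "live VM moved across a WAN": Move(
            What.VIRTUAL_MACHINE, State.LIVE, Scope.BETWEEN_DATA_CENTERS),
    }
    for name, move in sorted(moves.items(), key=lambda kv: kv[1].difficulty()):
        print(f"{move.difficulty()}  {name}")
```

Any conversation about "VM mobility" that does not pin down all three fields is probably comparing very different things.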

Note: This discussion is part of “Building Virtual Infrastructure”, my new seminar series with Truth in IT.