Multipath: Active/Passive, Dual Active, and Active/Active

Although it’s rare in the PC world, multipath I/O is not new in enterprise IT. I’ve been juggling paths to storage and networks as long as I’ve been a systems administrator, and that’s a bit longer than I care to admit. But the proliferation of technologies has made it difficult to understand path management. What’s the difference between “dual active” and “active/active”? Is “active/passive” really that bad?

What is Multipath? And Why?

Single path — The good old days: One device, one path

In the beginning, computers connected to peripherals and other computers through a single bus or channel and life was easy. Although one might mistake the name of the dominant printer connection (parallel) for some kind of multipath system when compared to the modem connection (serial), this was not the case. Only the bits traveled in parallel – the logical connection was a simple single path.

Daisy-chain 1 — Early-90's servers might confuse admins with two SCSI connections to a single device

Then things got complicated. The SCSI protocol allowed for multiple devices in a chain, and even for two different “initiators” (computers or controllers) to interact with these “targets”. Some folks even dual-attached devices to a single computer with multiple controllers.

Why would one device and one computer need more than one connection? It boils down to two factors:

Performance – I/O channels have typically been slower than the computer could handle, so multiple channels might be used to increase the amount of data that can flow in and out.
Reliability – If one connection failed, the other might still be usable, reducing the risk of an outage.

Multiple paths — Late-90's enterprise systems might have four or more paths to a single storage array

Pretty soon, enterprise computer architecture had gotten incredibly complex. I remember connecting a massive HP V-class server to an EMC Symmetrix with eight separate Fibre Channel cables. Each disk “LUN” showed up twice, and we had hundreds of them. We managed all of these virtual storage paths using HP’s PVLinks dynamic multipathing software. We used Veritas DMP and EMC PowerPath to do pretty much the same thing on Solaris and other UNIX systems.

Active/Passive to Active/Active

The earliest path management software provided two incredibly important functions: It figured out which of the SCSI targets it saw were actually different names for the same one, and it allowed the operating system to choose one and fail over to the other in case of an interruption. These were Active/Passive links – no matter how many paths were presented (and Fibre Channel switches sometimes presented eight or more), only one was active at any one time.
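Those two functions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how PVLinks, DMP, or PowerPath actually work: the `Path` class, the `device_id` field (standing in for whatever unique identifier the device reports about itself), and the path names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Path:
    name: str          # hypothetical OS-visible name for one route to a LUN
    device_id: str     # assumed unique ID reported by the device itself
    healthy: bool = True


def group_paths(paths):
    """Function 1: collapse paths that lead to the same underlying device."""
    devices = defaultdict(list)
    for path in paths:
        devices[path.device_id].append(path)
    return dict(devices)


def active_path(device_paths):
    """Function 2: active/passive selection.

    Only the first healthy path carries I/O; the rest sit idle as standbys.
    """
    for path in device_paths:
        if path.healthy:
            return path
    raise RuntimeError("all paths to device have failed")


paths = [
    Path("c1t0d0", "lun-A"),
    Path("c2t0d0", "lun-A"),   # same LUN seen through a second controller
    Path("c1t0d1", "lun-B"),
    Path("c2t0d1", "lun-B"),
]

devices = group_paths(paths)
first_choice = active_path(devices["lun-A"]).name

# Simulate a failed link on the first controller: I/O moves to the standby.
devices["lun-A"][0].healthy = False
after_failover = active_path(devices["lun-A"]).name

print(first_choice, "->", after_failover)   # c1t0d0 -> c2t0d0
```

Four names collapse into two devices, and at any moment exactly one path per device is in use. That is the whole of active/passive: the extra paths buy reliability, not bandwidth.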

Switched Fabric — Modern systems have abstracted and virtual I/O channels, making path management much more important

But the EMC Symmetrix and similar high-end storage systems changed all this. Symmetrix storage was fully virtualized – the presentation of LUNs to servers was entirely disconnected from the actual disks and RAID sets in the array. This meant the Symmetrix could handle I/O requests across different paths and controllers for the same LUN. EMC and the rest responded with Active/Active path management software, allowing I/O to travel in parallel for the first time.
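To make "I/O traveling in parallel" concrete, here is a rough sketch of requests for a single LUN being spread over every available path. Round robin is used only because it is the simplest policy to show; the function name, path names, and request labels are invented for the example and do not come from any vendor's software.

```python
from collections import Counter
from itertools import cycle


def dispatch_active_active(requests, path_names):
    """Spread requests for a single LUN across every available path.

    Round robin is an assumption made for illustration; real path
    managers offer several policies.
    """
    next_path = cycle(path_names)
    return [(request, next(next_path)) for request in requests]


# Hypothetical setup: four paths to one LUN, eight queued I/O requests.
lun_paths = ["path0", "path1", "path2", "path3"]
requests = [f"io-{n}" for n in range(8)]

assignments = dispatch_active_active(requests, lun_paths)
load = Counter(path for _, path in assignments)

print(dict(load))   # {'path0': 2, 'path1': 2, 'path2': 2, 'path3': 2}
```

Compare this with active/passive, where all eight requests would have gone down a single path while the other three sat idle.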

How is Dual Active Different?

Not everything called Active/Active is created equal. In fact, many supposed Active/Active setups really shouldn’t be called that since they don’t use both paths for all data. Instead, I like to call these Dual Active – both paths are active but with different data.

Consider the differences between the following two solutions:

Switched Fabric Active Active — A true active/active setup uses all paths for all data all the time

Switched Fabric Dual Active — A "dual active" setup uses both paths, but each target is directed to one or the other

See the difference? Although both paths are active in each case, they are not used the same way: in the first, every path carries I/O for every target, while in the second each target sticks to the one path it was assigned. Both approaches have merit, and neither is inherently superior, but they deserve different names. Even active/passive has its place, since simplicity is often a virtue.
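The distinction is easy to state as a mapping from targets to the paths allowed to carry their I/O. The sketch below is a toy model under my own assumptions (two LUNs, two paths, and a simple alternating assignment for the dual active case); the function and variable names are hypothetical.

```python
def dual_active(luns, paths):
    """Each LUN is pinned to one path; different LUNs use different paths."""
    return {lun: [paths[i % len(paths)]] for i, lun in enumerate(luns)}


def active_active(luns, paths):
    """Every LUN may send I/O down every path."""
    return {lun: list(paths) for lun in luns}


def paths_in_use(mapping):
    """All paths that carry I/O for at least one LUN."""
    return {path for lun_paths in mapping.values() for path in lun_paths}


luns = ["lun-A", "lun-B"]
paths = ["path0", "path1"]

dual = dual_active(luns, paths)
full = active_active(luns, paths)

print("dual active:  ", dual)
print("active/active:", full)
print("same paths busy overall:", paths_in_use(dual) == paths_in_use(full))
```

Viewed from the cables, the two setups look identical, since both paths are busy in each. Only the per-LUN view shows that a dual active LUN still depends on a single path at a time.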

Dual Active Outside Storage

These same concepts apply outside the field of storage and I/O. Many server clustering systems use the same terminology, right down to the misapplication of “active/active” when “dual active” is more appropriate. It’s easy to miss the significance of this difference, but it can make more of an impact in clustering since CPU workloads are harder to balance.

Let me know what you think. If there is interest, I might dive into path management strategies like round robin!

Comments

Marc Farley says

March 31, 2010 at 4:07 am

Stephen, this is a nice basic explanation of multipathing. I think a lot of people will find it useful.

Bas Raayman says

March 31, 2010 at 11:54 am

I fully agree with Marc, this is a great basic overview. Although I would also like to see an overview of path strategies and behavior since this is obviously the next step in understanding what an MPIO setup does. Things like MRU, LRU, fixed path, round robin and the like would be great. 🙂

penguingrl says

November 9, 2010 at 3:31 pm

Thanks Stephen, very helpful. At the end, though, you left me hanging! You say that Active/Active and Dual Active both have merit, neither is inherently superior. I’d love to know what the merits of each are, and when to choose each.

Phil White says

July 13, 2011 at 1:08 am

Just found out about this site and thought I’d leave a note about parallel-transfer, fault-tolerant storage systems that use a two-dimensional Reed-Solomon (2D-RS) error correction system. If you are interested in knowing more, I’ve published a number of pages on the web using Google Docs. You can reach them at http://www.tinyurl.com/ecctek .

john says

July 27, 2011 at 7:08 am

Excellent overview !!! Thanks

MK says

February 27, 2012 at 11:03 am

Good presentation, thanks

Hermann Huber says

August 15, 2014 at 6:32 am

Thanks a lot! A good overview!