Multipath: Active/Passive, Dual Active, and Active/Active

March 30, 2010 By Stephen 10 Comments

Although it’s rare in the PC world, multipath I/O is not new in enterprise IT. I’ve been juggling paths to storage and networks as long as I’ve been a systems administrator, and that’s a bit longer than I care to admit. But the proliferation of technologies has made it difficult to understand path management. What’s the difference between “dual active” and “active/active”? Is “active/passive” really that bad?

What is Multipath? And Why?

Single path — The good old days: One device, one path

In the beginning, computers connected to peripherals and other computers through a single bus or channel, and life was easy. One might mistake the name of the dominant printer connection (parallel) for some kind of multipath system when compared to the modem connection (serial), but this was not the case. Only the bits traveled in parallel – the logical connection was a simple single path.

Daisy-chain 1 — Early-90's servers might confuse admins with two SCSI connections to a single device

Then things got complicated. The SCSI protocol allowed for multiple devices in a chain, and even for two different “initiators” (computers or controllers) to interact with these “targets”. Some folks even dual-attached devices to a single computer with multiple controllers.

Why would one device and one computer need more than one connection? It boils down to two factors:

Performance – I/O channels have typically been slower than the computer could handle, so multiple channels might be used to increase the amount of data that can flow in and out.
Reliability – If one connection failed, the other might still be usable, reducing the risk of an outage.
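
To put rough numbers on those two factors, here is a back-of-the-envelope sketch in Python. The function names and figures are mine, purely for illustration. It assumes paths fail independently, which is optimistic whenever they share a controller, switch, or power supply, and it treats the bandwidth sum as an upper bound rather than a promise.

```python
import math


def aggregate_bandwidth(path_bandwidths):
    """Upper bound on throughput if every path could be driven at once."""
    return sum(path_bandwidths)


def connection_availability(path_availabilities):
    """Chance that at least one path is up, ASSUMING independent failures."""
    all_paths_down = math.prod(1 - a for a in path_availabilities)
    return 1 - all_paths_down


# Illustrative numbers only: two 100 MB/s paths, each up 99% of the time.
print(aggregate_bandwidth([100, 100]))
print(connection_availability([0.99, 0.99]))
```

Whether you actually see the extra bandwidth depends on whether the path management software can use more than one path at a time.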

Multiple paths — Late-90's enterprise systems might have four or more paths to a single storage array

Pretty soon, enterprise computer architecture had gotten incredibly complex. I remember connecting a massive HP V-class server to an EMC Symmetrix with eight separate Fibre Channel cables. Each disk “LUN” showed up twice, and we had hundreds of them. We managed all of these virtual storage paths using HP’s PVLinks dynamic multipathing software. We used Veritas DMP and EMC PowerPath to do pretty much the same thing on Solaris and other UNIX systems.

Active/Passive to Active/Active

The earliest path management software provided two incredibly important functions: It figured out which of the SCSI targets it saw were actually different names for the same one, and it allowed the operating system to choose one and fail over to the other in case of an interruption. These were Active/Passive links – no matter how many paths were presented (and Fibre Channel switches sometimes presented eight or more), only one was active at any one time.
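
Those two functions can be sketched in a few lines of Python. This is a toy model, not how PVLinks, DMP, or PowerPath actually work: the `Path` class, the `lun_id` field, and the device names are all hypothetical, and I am assuming the target reports some identifier that is the same on every path to a given LUN.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Path:
    name: str             # hypothetical OS device name for this path
    lun_id: str           # identifier the target reports for the LUN (assumed unique)
    healthy: bool = True


class ActivePassiveDevice:
    """One logical device backed by several paths; only one carries I/O."""

    def __init__(self, lun_id, paths):
        self.lun_id = lun_id
        self.paths = list(paths)
        self.active = self.paths[0]   # every other path is a passive standby

    def submit(self, request):
        # Fail over only when the active path is down.
        if not self.active.healthy:
            standby = next((p for p in self.paths if p.healthy), None)
            if standby is None:
                raise IOError(f"no healthy path to {self.lun_id}")
            self.active = standby
        return self.active.name, request


def group_by_lun(paths):
    """Collapse the many device names the OS sees into one device per LUN."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.lun_id].append(path)
    return {lun_id: ActivePassiveDevice(lun_id, members)
            for lun_id, members in groups.items()}
```

Feed `group_by_lun` three paths where two share a `lun_id` and you get two devices back. Mark the active path unhealthy and the next `submit` call quietly moves to the standby.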

Switched Fabric — Modern systems have abstracted and virtual I/O channels, making path management much more important

But the EMC Symmetrix and similar high-end storage systems changed all this. Symmetrix storage was fully virtualized – the presentation of LUNs to servers was entirely disconnected from the actual disks and RAID sets in the array. This meant the Symmetrix could handle I/O requests across different paths and controllers for the same LUN. EMC and the rest responded with Active/Active path management software, allowing I/O to travel in parallel for the first time.
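
Here is a minimal sketch of the active/active idea, again as a toy model in Python. I am assuming a simple round robin policy for spreading requests; real products offer other policies, and nothing here reflects the internals of any vendor's software. The class and path names are made up.

```python
from itertools import cycle


class ActiveActiveDevice:
    """One LUN reachable over several paths, all of which carry I/O."""

    def __init__(self, path_names):
        self.healthy = {name: True for name in path_names}
        self._order = cycle(path_names)   # assumed policy: plain round robin

    def submit(self, request):
        # Try each path at most once per request, skipping failed ones.
        for _ in range(len(self.healthy)):
            name = next(self._order)
            if self.healthy[name]:
                return name, request
        raise IOError("no healthy path")


dev = ActiveActiveDevice(["path-A", "path-B"])
for block in range(4):
    print(dev.submit(block))   # requests alternate between the two paths
```

Compared with active/passive, failover falls out almost for free: a failed path is just skipped in the rotation.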

How is Dual Active Different?

Not everything called Active/Active is created equal. In fact, many supposed Active/Active setups really shouldn’t be called that since they don’t use both paths for all data. Instead, I like to call these Dual Active – both paths are active but with different data.

Consider the differences between the following two solutions:

Switched Fabric Active Active — A true active/active setup uses all paths for all data all the time

Switched Fabric Dual Active — A "dual active" setup uses both paths, but each target is directed to one or the other

See the difference? Although the paths are active in both cases, the two setups are not the same. Both approaches have merit, and neither is inherently superior, but they deserve different names. Even active/passive has its place, since simplicity is often a virtue.
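
One way to see the difference is to write down which paths each LUN is allowed to use. The Python sketch below does that and then adds up the resulting load per path. The function names, LUN names, and I/O figures are invented for illustration, and I am assuming that a LUN's I/O splits evenly across whatever paths it may use.

```python
def dual_active_map(luns, paths):
    """Dual active: each LUN is pinned to exactly one path."""
    return {lun: [paths[i % len(paths)]] for i, lun in enumerate(luns)}


def active_active_map(luns, paths):
    """Active/active: every LUN may use every path."""
    return {lun: list(paths) for lun in luns}


def path_load(mapping, io_per_lun):
    """Sum the I/O on each path, splitting a LUN's I/O evenly over its paths."""
    load = {}
    for lun, usable in mapping.items():
        share = io_per_lun[lun] / len(usable)
        for path in usable:
            load[path] = load.get(path, 0) + share
    return load


luns = ["lun0", "lun1"]
paths = ["path-A", "path-B"]
io_per_lun = {"lun0": 900, "lun1": 100}   # made-up workload: one busy LUN

print(path_load(dual_active_map(luns, paths), io_per_lun))
print(path_load(active_active_map(luns, paths), io_per_lun))
```

With one busy LUN, the dual active mapping leaves one path carrying most of the traffic, while the active/active mapping evens it out. With evenly loaded LUNs the two look much more alike.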

Dual Active Outside Storage

These same concepts apply outside the field of storage and I/O. Many server clustering systems use the same terminology, right down to the misapplication of “active/active” when “dual active” is more appropriate. It’s easy to miss the significance of this difference, but it can make more of an impact in clustering since CPU workloads are harder to balance.

Let me know what you think. If there is interest, I might dive into path management strategies like round robin!

Comments

Marc Farley says

March 31, 2010 at 4:07 am

Stephen, this is a nice basic explanation of multipathing. I think a lot of people will find it useful.

Bas Raayman says

March 31, 2010 at 11:54 am

I fully agree with Marc, this is a great basic overview. Although I would also like to see an overview of path strategies and behavior, since this is obviously the next step in understanding what an MPIO setup does. Things like MRU, LRU, fixed path, round robin and the like would be great. 🙂

penguingrl says

November 9, 2010 at 3:31 pm

Thanks Stephen, very helpful. At the end, though, you left me hanging! You say that Active/Active and Dual Active both have merit, neither is inherently superior. I’d love to know what the merits of each are, and when to choose each.

Phil White says

July 13, 2011 at 1:08 am

Just found out about this site and thought I’d leave a note about parallel-transfer, fault-tolerant storage systems that use a two-dimensional Reed-Solomon (2D-RS) error correction system. If you are interested in knowing more, I’ve published a number of pages on the web using Google Docs. You can reach them at http://www.tinyurl.com/ecctek .

john says

July 27, 2011 at 7:08 am

Excellent overview! Thanks

MK says

February 27, 2012 at 11:03 am

Good presentation, thanks

Hermann Huber says

August 15, 2014 at 6:32 am

Thanks a lot! A good overview!
