Way back in the 1990s, UNIX admins delighted in upgrading from NFSv2 to NFSv3. Then NFSv4 came around and … crickets. Now VMware has become the first major/useful/mainstream application for NFSv4.1, so the floodgates are open! But are they?
A Little History
A long, long time ago, when Sun was the dominant server company and x86 was still a desktop CPU, computers needed to share files with each other over the network. Sun, being the dominant server company, created a networked file system called, surprisingly, Network File System or NFS for short. Like most things, it took a few revisions to take hold, but it eventually did with version 3. NFSv3 became so popular, in fact, that Big Daddy Sun “donated” it to the world as a standard. RFC-1813 never did become a real standard, but it was good enough to dominate network-attached storage for decades.
NFSv3 was created in the dark ages, when Microsoft hadn’t yet noticed the Internet and Windows didn’t even have a native IP stack. Seriously! Redmond was still trying to send files back and forth over NetBIOS before switching to something called CIFS, and if you don’t know what that is you should count yourself lucky and just move on along with this history lesson.
NFSv3 was a serious upgrade and a serious protocol, and UNIX nerds like me loved it. Well, as much as anyone loves a storage protocol. But it was never intended for the heavy use it has taken on in the 20 years (!) since it was introduced. For one thing, NFSv3 is stateless, making it difficult to maintain locks on shared data. It also has a rather naive and promiscuous approach to TCP port usage.
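That "promiscuous approach to TCP port usage" is easy to see for yourself. Here's a hedged sketch (the server name is hypothetical) contrasting NFSv3's scatter of services with NFSv4's single well-known port:

```shell
# NFSv3 relies on the portmapper plus a posse of auxiliary services
# (mountd, lockd, statd), each grabbing its own port. Ask rpcbind:
rpcinfo -p nfs-server.example.com

# Typical output includes entries along these lines:
#   100000  2   tcp    111  portmapper
#   100005  3   tcp  20048  mountd
#   100021  4   tcp  32803  nlockmgr
#   100003  3   tcp   2049  nfs

# NFSv4 consolidates everything onto a single well-known port (2049),
# which is far friendlier to firewalls:
mount -t nfs4 nfs-server.example.com:/export /mnt
```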
But the biggest issue with NFSv3 is that clients will only talk to one server at a time. This is seriously out of touch with today’s Internet-y scale-out world! This is especially true in high-I/O environments. You know, like VMware vSphere. Sadly, vSphere users have adopted NFSv3 like mad owing to the insanity of using block storage protocols that are even older and less flexible.
The storage industry responded with a much-improved protocol: NFSv4. It has a concept of state, making locking easier to implement, and is more modern in the way it handles TCP ports. It also has a really nifty “pseudo filesystem” capability, allowing each client to see just what it needs to see. Plus, NFSv4 was ratified as an Internet standard, so everyone can coexist and get along and hold hands!
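To make the pseudo filesystem idea concrete, here's a hedged sketch of how it looks on a Linux NFSv4 server (paths and server name are hypothetical): `fsid=0` marks the root of the pseudo tree, and clients mount relative to that root rather than the server's real directory layout.

```shell
# Hypothetical /etc/exports on a Linux NFSv4 server:
cat /etc/exports
# /srv/nfs4           *(ro,fsid=0)          <- root of the pseudo filesystem
# /srv/nfs4/projects  *(rw,nohide)          <- visible beneath that root

# A client mounts relative to the pseudo root, not the server-side path,
# so it sees only the tree it has been granted:
mount -t nfs4 server.example.com:/projects /mnt/projects
```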
Or not. See, NFSv4 has been with us since 2003 and hasn’t really been adopted by anyone doing actual stuff. Sure, it exists as a protocol, but end-user enthusiasm has been pretty much nonexistent in the datacenter.
And there’s been a bit of criticism, too:
> NFSv4 is not on our roadmap. It is a ridiculous bloated protocol which they keep adding crap to. In about a decade the people who actually start auditing it are going to see all the mistakes that it hides.
>
> The design process followed by the NFSv4 team members matches the methodology taken by the IPV6 people. (As in, once a mistake is made, and 4 people are running the test code, it is a fact on the ground and cannot be changed again). The result is an unrefined piece of trash.
NFS 4.1 and Parallel NFS!
NFSv4 didn’t address the inherent limits of a one-to-one protocol. No less a luminary than Garth Gibson wrote at length about the problems of NFSv4. So the industry responded by trying to develop 100 novel ways to parallelize I/O, mainly in the form of completely different shared file systems.
One of the better ideas was the parallel NFS (pNFS) concept from Panasas, Garth Gibson’s company. Like SDN over in the networking world, pNFS separates the “what” from the “how” of storage: A metadata server handles information about the files and directories and such separately from the actual data access. In practice, this allows clients to access data over multiple streams at once, yet still preserves backwards compatibility for non-parallel access.
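On a Linux client, you can see whether that metadata/data split is actually happening. A hedged sketch (server name hypothetical): mount with `vers=4.1`, then check whether the client ever negotiated a pNFS layout.

```shell
# Request an NFSv4.1 mount:
mount -t nfs -o vers=4.1 filer.example.com:/export /mnt

# nfsstat -m confirms the negotiated version and mount options:
nfsstat -m

# LAYOUTGET operations in the per-mount stats indicate pNFS is in play;
# a zero count means all I/O is still flowing through the metadata server:
grep LAYOUTGET /proc/self/mountstats
```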
pNFS became part of NFSv4.1, and the entire industry was transformed. Or not. Once again, end users responded with a yawn and, half a decade on, NFSv4.1 has almost no mainstream uptake. Very few storage devices support NFSv4.1 or the pNFS extensions anyway, so it’s not like they’re missing anything.
In fact, Microsoft’s competing SMB 3.0 protocol has much wider adoption even though it’s a relative whippersnapper, being introduced in 2012. No doubt much of the driving force behind SMB 3.0 is the fact that Microsoft controls the client (Hyper-V and Windows Server) as well as the “storage array” (Windows Server again), ensuring compatibility and supportability. No one in the NFS space can claim this kind of end-to-end support.
VMware vSphere 6 to the Rescue?
So now along comes VMware to the rescue, finally giving NFSv4.1 the client demand it has always needed. vSphere loves I/O and vSphere loves NFS, so this is a match made in heaven! Consummating this relationship, however, might take a bit longer…
First, VMware’s support for NFSv4.1 remains pretty iffy. It works with the basics (HA, DRS, vMotion) but not the more advanced features (Storage DRS, SRM, VVOLs). And of course it’s brand spanking new, so caveat user. Storage technologies usually have a bit of teething to do when they’re first launched, so I would expect a few bugs to crop up in short order.
But there are bigger issues. Although vSphere 6 includes NFSv4.1 support, it does not include pNFS! It does support multipathing via session trunking (as Hans eloquently illustrates), but not parallel NFS. And you can’t mix NFSv3 and NFSv4.1 access to the same datastore, so migration is problematic (as Chris discusses).
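For the curious, here's a hedged sketch of what that session trunking looks like from the ESXi side (the IP addresses, share path, and datastore name are all hypothetical): listing multiple server addresses is what buys you multipathing, but there's no pNFS anywhere in the picture.

```shell
# Mount an NFSv4.1 datastore against two server addresses at once:
esxcli storage nfs41 add \
    --hosts=192.168.1.10,192.168.1.11 \
    --share=/vol/datastore1 \
    --volume-name=nfs41-ds1

# Confirm the mount and its addresses:
esxcli storage nfs41 list
```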
Then there’s the issue of array-side support. Even though NFSv4.1 has been around for a few years, storage array vendors have put it on the back burner so they can focus on things customers actually want. EMC claims NFSv4.1 support in the VNX, but who’s to say how “prime time” it is. And most other vendors are way behind in implementing it because people like me have been telling them for years not to bother.
Read more about vSphere 6.0 and NFSv4.1:
- An Overview of NFSv4 by SNIA
- VMware Embraces NFS 4.1, Supports Multipathing and Kerberos Authentication by Chris Wahl
- What’s New in vSphere 6.0: NFS Client by Julian Wood
- vSphere 6 NFS4.1 does not include parallel striping! by Hans De Leenheer
- vSphere 6.0 Storage Features Part 1: NFS v4.1 by Cormac Hogan
Storage is hard, and major changes need time before all the kinks are worked out. I applaud VMware for implementing NFSv4.1 since it checks a few of the empty boxes created by Microsoft’s excellent SMB 3.0. But it’s not time to celebrate yet.
Production use of NFSv4.1 in VMware vSphere environments is likely a year or two off, at least for people with half a brain. Unless your vendor sells a supported package and guarantees stability, it’s best to let it simmer a bit longer before taking the plunge. And enterprise environments are going to want to wait until the high-end vSphere features work on NFSv4.1, too.
Bonus snark! Maybe VMware shouldn’t have bothered with NFSv4.1 and should instead have implemented a modern, scalable, high-performance storage protocol like SMB 3! I’m sure Microsoft would welcome a new client…
Note: I am keenly aware that many people I respect spent years of their lives developing NFS and I didn’t do anything but snark at them. I really do feel bad about that.
Updated 2/4 to clarify the NFSv4.1/pNFS issue and removed the “one datastore” limit.