Greg “EtherealMind” Ferro recently “mused” that it might be a good idea to replace PCI Express (PCIe) inside servers or rack-scale infrastructure with Ethernet. But this seems to be the exact opposite of the direction the industry is headed. Rather than replacing PCIe with Ethernet, companies like Intel seem set on replacing short-range Ethernet (in rack-scale systems) with PCIe!
PCIe vs. Ethernet
Greg points out (rightly) that electrical signals over copper traces on motherboards are currently limited to 15.75 Gbps per lane in PCIe 4.0. With 16 lanes, this brings us to 252 Gbps of throughput on a PCIe 4.0 link. Greg is also correct that current Ethernet switches operating at 25 Gbps can handle this kind of throughput across 10 or so connections. QED, right?
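The arithmetic checks out, by the way. Here's a quick back-of-the-envelope sketch: PCIe 4.0 signals at 16 GT/s per lane with 128b/130b line encoding, which is where the 15.75 Gbps effective figure comes from.

```python
# Back-of-the-envelope check of the PCIe 4.0 throughput figures.
raw_rate_gtps = 16.0             # PCIe 4.0 raw transfer rate per lane (GT/s)
encoding_efficiency = 128 / 130  # 128b/130b line encoding
lanes = 16

per_lane_gbps = raw_rate_gtps * encoding_efficiency
total_gbps = per_lane_gbps * lanes

print(f"Per lane: {per_lane_gbps:.2f} Gbps")        # ~15.75 Gbps
print(f"x16 link: {total_gbps:.1f} Gbps")           # ~252.1 Gbps
print(f"25G Ethernet links to match: {total_gbps / 25:.1f}")  # ~10
```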
Read Greg’s post: Musing: Could We Replace PCIe Bus With Ethernet Switch?
Sorry, Greg! Stuffing an Ethernet switch into a server is exactly the wrong direction for many reasons.
Most pressing is the issue of latency. PCIe latency is measured in hundreds of nanoseconds, while Ethernet interconnects are measured in tens of microseconds. That might not sound like much, but it's a difference of two full orders of magnitude, and it would be a huge step back in real-world use.
Just because you can push the same amount of data across a link (throughput) doesn't mean you can do the same tasks. PCIe is like a fleet of shopping carts filling the aisles at your local Costco, while Ethernet is the stream of SUVs taking those big boxes of cereal and lightbulbs back home. Although they are theoretically carrying the same payload, Explorers and Caravans just weren't designed to navigate inside the store!
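To put some numbers on the shopping-cart analogy, here's a sketch using hypothetical round-trip figures that fall within the ranges above (500 ns for PCIe, 50 µs for Ethernet). For a small synchronous transfer, the time on the wire is trivial; latency is nearly the whole cost.

```python
# Hypothetical round-trip latencies, within the ranges stated above:
PCIE_LATENCY_NS = 500      # "hundreds of nanoseconds"
ETH_LATENCY_NS = 50_000    # "tens of microseconds"

TRANSFER_BYTES = 4096      # one small synchronous read
LINK_GBPS = 252            # x16 PCIe 4.0 effective throughput

# Wire time for the payload itself, in nanoseconds.
wire_ns = TRANSFER_BYTES * 8 / LINK_GBPS   # ~130 ns

pcie_total = PCIE_LATENCY_NS + wire_ns     # ~630 ns per operation
eth_total = ETH_LATENCY_NS + wire_ns       # ~50,130 ns per operation

print(f"PCIe: {pcie_total:.0f} ns/op -> {1e9 / pcie_total:,.0f} sync ops/s")
print(f"Eth:  {eth_total:.0f} ns/op -> {1e9 / eth_total:,.0f} sync ops/s")
print(f"Slowdown: {eth_total / pcie_total:.0f}x")
```

Roughly an 80x slowdown per synchronous operation, even though both links could carry the bytes themselves in about the same time. That's the cart-versus-SUV gap in a nutshell.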
There are many other issues to consider as well. Ethernet NICs and switches are complex, designed to handle the vagaries of topology changes, speed differences, and relatively frequent reconfiguration. An in-server Ethernet variant could be stripped down to the basics and integrated into the chipset just like PCIe, but this would obviate the external connectivity benefits suggested by Greg. So every device would have to be a full-featured Ethernet endpoint, likely with TCP/IP besides!
Rack-Scale Computing, OPCIe, and SiPh
Greg mentions Intel’s work on rack-scale computing and silicon photonics. Good! But then he suggests running Ethernet over this lovely next-generation interface. Bad!
The intent is to run PCIe over all those integrated silicon/optical interconnects and extend it to rack-scale, rather than injecting Ethernet. This has a whole raft of benefits, including better real-world performance (thanks to low latency and little protocol overhead) and easier integration, since PCIe is already in use at all points in a rack-scale infrastructure.
IT folks usually express some serious skepticism when I mention PCIe as an externally-exposed interconnect. But then I point out that this entire system is already in use! Thunderbolt, developed by Intel and popularized by Apple, is simply PCIe (and DisplayPort) over copper cables, and long-range optical Thunderbolt cables are on sale today. Intel's silicon photonics (SiPh) optical PCIe (OPCIe) technology has been sampling for over a year now, and Fujitsu has demonstrated a server using these optical interconnects for peripheral interconnection.
Proponents of rack-scale computing seem poised to adopt OPCIe as an interconnect within the rack in the next year or so. This will encroach on the market for current server-to-server and server-to-storage interconnects like Ethernet, Fibre Channel, and InfiniBand. Enterprise products based on OPCIe are being developed as well, though few if any have yet been announced.
You might like to read my Rack Endgame series:
Note that Fibre Channel, InfiniBand, RapidIO, and many other technologies besides have attempted to do just what Greg is suggesting: Unify internal and external connectivity with a “master” protocol. But none have succeeded. It seems more logical to standardize on a fast, scalable, low-latency bus like PCIe for short-range communication and a ubiquitous network like Ethernet for longer-range use.
Rather than pushing Ethernet into the server, the industry is pushing it out of the rack. Soon, racks will function like blade chassis, with high-speed interconnects for internal communication and Ethernet termination points for communication outside the rack. Probably the closest thing to reality in Greg’s vision is the concept of tunneling Ethernet over PCIe and integrating it into server chipsets. This would function something like FCoE, providing a path for a legacy interconnect (Ethernet) right into the heart of the new converged rack.
Note that the title of this piece is farcical, and based on Greg's title. No, we cannot replace all of Ethernet with PCIe. It's still the king of the campus and larger-radius networks. But it has no place in the world of PCIe!
I’ve done some more research and it seems that at least three different business units inside Intel are actually pursuing Ethernet, OPCIe, and InfiniBand products around this idea.
Personally, I don’t see that PCIe or IB will win. The protocol stacks that drive them are too niche to survive over time. I’ve learned to never bet against Ethernet. Over time it kills all other link layer protocols. Token Ring, ATM, Frame Relay, and Fibre Channel are just a few of the casualties.
I can see niche-ness killing IB, but PCIe is everywhere. It’s in everything. It’s not niche at all! Ethernet and PCIe are the dynamic duo of ubiquitous interconnects these days…
Geoff Arnold says
Help me out here. PCIe performs address allocation by enumeration, doesn’t it? So slot numbering is potentially different every time a new device is added. That may be OK for a host-based PCIe driver in the server OS, but how the hell does it work with ACS-enabled endpoint-to-endpoint transfers?
One reason for looking at Ethernet (or a derivative) rather than PCIe is to get away from the host-centric SPOF….