Stephen Foskett, Pack Rat

Understanding the accumulation of data


How Did Microsoft and Intel Get 1 Million iSCSI IOPS?

March 19, 2010 · By Stephen

How fast can iSCSI get? Ever since Microsoft and Intel declared that the combination of Windows and Nehalem could deliver over a million iSCSI IOPS, I’ve been curious about just how they did it. What black magic could push that many I/Os over a single Ethernet connection? And what was on the other end? Now Intel has revealed all in a whitepaper, and the results are surprising!

What iSCSI Did

Let’s review the test for a moment. In March, Microsoft and Intel demonstrated that the combination of Windows Server 2008 R2 and the Xeon 5500 could saturate a 10 Gb Ethernet link, pushing iSCSI throughput to wire speed. That’s 1,174 MB/s, right around the theoretical maximum of a ten-gigabit link, given a tiny bit of overhead. The pair reunited in January to show that this same combination could deliver an astonishing million I/O operations per second, too.
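
Just to sanity-check that wire-speed figure with some back-of-the-envelope math of my own (the frame-overhead numbers below are standard Ethernet and TCP bookkeeping, not figures from the whitepaper):

```python
# Rough ceiling for TCP payload throughput on a 10 Gb Ethernet link.
LINE_RATE_BPS = 10_000_000_000   # 10 Gb/s raw signaling rate

MTU = 1500                       # standard (non-jumbo) Ethernet payload size
ETH_OVERHEAD = 38                # preamble + header + FCS + inter-frame gap
IP_TCP_HEADERS = 40              # IPv4 (20 B) + TCP (20 B), no options

payload_per_frame = MTU - IP_TCP_HEADERS       # 1460 bytes of useful data
wire_bytes_per_frame = MTU + ETH_OVERHEAD      # 1538 bytes on the wire
efficiency = payload_per_frame / wire_bytes_per_frame

max_mb_per_s = LINE_RATE_BPS / 8 * efficiency / 1_000_000
print(f"{max_mb_per_s:.0f} MB/s")              # 1187 MB/s
```

iSCSI adds its own headers on top of TCP, so the measured 1,174 MB/s lands right about where you would expect.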

Both of these results are remarkable. Sure, many high-end Fibre Channel SANs and storage systems blast out gigabytes of data and millions of I/O operations every second, but these tests are much more focused. Benchmarks are perilous, but the folks at Microsoft and Intel devised a fairly clever and narrow set. Rather than a “mine’s bigger” contest, the pair only needed to prove that iSCSI can play with the pros.

The side effect is a demonstration of the capabilities of Microsoft and Intel components. Microsoft showed off the capabilities of Windows Server 2008 R2, Hyper-V, and their software iSCSI initiator, while Intel can brag about the Xeon 5500 server platform and X520-2 10 Gb Ethernet Server Adapter with their 82599EB controller. Your mileage may vary, but it is possible to construct a true storage monster on an average server budget.

Intel Inside

Let’s start by looking at the configuration of the local end of the tested configuration. I’m a storage guy so I think of it as the initiator, but you might say it’s the server, the client, or the host. Regardless, the system under test (SUT) is what was put under the microscope. The configuration was a common one: A high-end computer packing an Intel Xeon CPU and 82599-based 10 Gb Ethernet adapter. Most data centers have a machine or two just like this one.

Looking closely, we see that the test in question relied on the following key components:

  • Intel’s “Shady Cove” S5520SC workstation-class motherboard
  • The Intel Xeon W5580 CPU (4 cores, 8 MB cache, 3.20 GHz)
  • 24 GB of DDR3 RAM
  • Intel “Niantic” 82599EB 10 Gb Ethernet controller
  • Microsoft Windows Server 2008 R2 x64

This combination would set you back about $7,500: $450 for the motherboard, $1,500 for the CPU, six 2 GB DDR3 SDRAM modules at $80 each, $1,200 for the Intel X520 NIC, and $4,000 for an Enterprise copy of Windows Server 2008 R2. Not cheap, but not an exotic server either.
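
Tallying up those quoted prices (all figures taken straight from the paragraph above):

```python
# Component prices as quoted in the post.
parts = {
    "Intel S5520SC motherboard": 450,
    "Xeon W5580 CPU": 1_500,
    "DDR3 SDRAM (six 2 GB modules at $80)": 6 * 80,
    "Intel X520 NIC": 1_200,
    "Windows Server 2008 R2 Enterprise": 4_000,
}
total = sum(parts.values())
print(f"${total:,}")  # $7,630, or "about $7,500"
```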

Initiate and Optimize

The secret to pushing the tested system to this level of performance lies in optimizations across the server platform, the NIC, and Windows Server itself.

  • The Xeon 5500 processor series includes many enhancements:
    • An integrated memory controller allows for faster RAM access
    • QuickPath interconnect (QPI) replaces the old front-side bus and enhances I/O off the core
    • A new I/O subsystem with PCIe integrated into the CPU
    • MSI-X expands the number of interrupts a PCI device can use
    • New instructions for on-board CRC-32C calculation, speeding up iSCSI digest processing
  • The 82599 Ethernet controller also includes enhanced capabilities:
    • VMDq maps I/O queues to multiple cores and virtual machines, reducing I/O bottlenecks
    • Offload of TCP segmentation and receive-side coalescing
    • Interestingly, it does not appear that VMDc/SR-IOV was employed in the test
  • Microsoft Windows Server 2008 R2 and Hyper-V are ready to use all of these features and more:
    • R2 uses multi-core CPUs more effectively in general
    • Receive-side scaling (RSS) spreads the I/O workload across all four Xeon cores
    • The iSCSI initiator now allows CRC digest offload (using the new Xeon command set)
    • Numerous “NUMA I/O” optimizations in the initiator
    • TCP/IP Nagle can be disabled in the registry
    • Hyper-V VMQ allows the network packets to be copied directly into the guest virtual machine’s memory

Whew! Put all of these optimizations in a blender and Hyper-V virtual machine iSCSI access will be twice as fast as before. No kidding!
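
To make the receive-side scaling idea concrete, here is a toy sketch of the mechanism. Windows actually uses a Toeplitz hash and a configurable indirection table; the CRC32 stand-in below just illustrates how a flow’s 4-tuple deterministically picks a core, so each connection’s packets always land on the same CPU:

```python
import zlib

CORES = [0, 1, 2, 3]  # the four cores of the Xeon W5580

def rss_core(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> int:
    """Pick the core that will service a given TCP flow.

    Real RSS hashes the 4-tuple with a Toeplitz hash and looks the
    result up in an indirection table; CRC32 is just a stand-in here.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return CORES[zlib.crc32(key) % len(CORES)]

# Multiple iSCSI sessions from one initiator fan out across the cores,
# and every packet of a given session keeps hitting the same core.
for port in range(50000, 50010):
    print(port, "-> core", rss_core("10.0.0.1", port, "10.0.1.100", 3260))
```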

Stay On Target

But we knew all of this back in January. We also saw that a Cisco Nexus 5020 switch was used to fan out to 10 software iSCSI targets. But until now there was no mention of exactly which targets were used.

The final footnotes in Intel’s whitepaper reveal that the storage backing the million-IOPS test was none other than StarWind Software’s iSCSI SAN! It is unclear what led Microsoft and Intel to use this particular iSCSI target (the earlier throughput tests ran on NetApp filers), but it does speak to the quality of the product.

It is not clear how many disk drives were used, but I would guess that SSDs or ramdisks might have been employed to pull a million IOPS. Network optimizations are also not mentioned, though jumbo frames would not be a benefit in an IOPS test.
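
Here is my reasoning on the jumbo-frames point, using an assumed 512-byte I/O size (the actual block size isn’t quoted here):

```python
# Why jumbo frames don't matter in a small-block IOPS test:
# each I/O already fits in a single standard frame.
iops = 1_000_000
io_size = 512                     # bytes; an assumed small-block benchmark size

throughput_mb_s = iops * io_size / 1_000_000
print(throughput_mb_s, "MB/s")    # 512.0 MB/s of payload, well under wire speed

# A standard 1500-byte MTU minus TCP/IP headers (40 B) and the iSCSI
# basic header segment (48 B) still leaves plenty of room per I/O:
assert io_size <= 1500 - 40 - 48
```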

StarWind’s software runs on Microsoft Windows and creates a full-featured iSCSI target, complete with data mirroring, automatic failover and failback, replication, snapshots, and thin provisioning. The company prices their iSCSI SAN at $6,000 for two nodes and competes with the likes of DataCore and Open-E. But the StarWind solution seems at a glance to be more full-featured than these other offerings.

Try It Yourself!

I imagine many folks like me might be tempted to try to reproduce these results. More valuable would be a set of best practice guidelines for the deployment of software iSCSI in Windows Server 2008 R2 and Hyper-V environments. Given the relatively modest hardware involved, there should be nothing stopping us!

These test results also prompted me to get in touch with StarWind to try their iSCSI target software. I was pleasantly surprised to learn that they are currently offering free non-production licenses to VMware vExperts, VCPs, and VCIs as well as Microsoft MVPs, MCPs, and MCT Professionals. Many of my readers fall into one (or more) of those buckets, and I applaud the company for this offer. If only more companies realized the value in giving away test licenses to influencers and thought leaders!
