
Stephen Foskett, Pack Rat

Understanding the accumulation of data


The Storage Utilization Waterfall: Raw, Usable, and Used

October 1, 2008 By Stephen

Based on Floral Matryoshka by BrokenSphere/Wikimedia Commons

My February 2003 column for Storage magazine focused on the surprising difficulty of measuring storage utilization. I wrote:
 

“A true measurement of utilization would reflect every layer of usage metrics – from raw disk in a shared array to used storage within files. Raw storage for each new frame of reference is contained within the used storage measured above it, so low utilization is compounded as we move deeper into the stack.”

In that column, I suggested that utilization of any resource was based on just three metrics:

  1. Raw
  2. Usable
  3. Used

But this is confounded by the frame of reference being measured. It’s trivially simple to determine the raw, usable, and used capacity for a storage array, server, or database. But what happens when one tries to measure storage utilization all the way through the stack?

When vendors take up this challenge, the discussion tends to get diverted into a cul-de-sac that presents their products most favorably, as was the case with Chuck Hollis’ comparison of his EMC CLARiiON to HP’s and NetApp’s storage products. Was Chuck wrong? Was HP right? Or was it NetApp that has the best utilization? One thing is certain: we’re getting nowhere if we can’t agree on some basic terminology.

Chris Evans' Storage Waterfall

Credit Storage Architect Chris Evans with seeing the problem for what it was. He noticed the matryoshka effect and put together a “waterfall” diagram, showing how low utilization is compounded as we move down the stack. He also notes that complexity rises as we move to the right, something I never called out.

We were both onto the same thing, though, and my study of storage utilization (published in the April 2003 issue) supported his suggestion that the raw-to-used ratio might be as high as 10:1 on average, meaning only about a tenth of raw capacity is actually used. At the time, I even put together a similar waterfall chart, but it was never published outside the company I worked for (that I know of).
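To see how quickly modest per-layer losses compound into a ratio like that, here is a minimal Python sketch. The layer names and the per-layer fractions are hypothetical, illustrative numbers only; they are not taken from either study or from Chris’ diagram.

```python
from math import prod

# Hypothetical used/raw fraction at each layer of the stack.
# These values are illustrative assumptions, not measured data.
layer_utilization = {
    "raid_set": 0.80,
    "array": 0.70,
    "volume_manager": 0.75,
    "filesystem": 0.60,
    "database": 0.50,
}


def end_to_end_utilization(fractions):
    """Multiply per-layer used/raw fractions to get the overall fraction."""
    return prod(fractions)


overall = end_to_end_utilization(layer_utilization.values())
print(f"end-to-end utilization: {overall:.1%}")
print(f"raw-to-used ratio: {1 / overall:.1f}:1")
```

With these made-up numbers, no single layer looks worse than 50% utilized, yet the product works out to roughly 12.6% end to end, or about 8:1 raw to used.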

So I fully and enthusiastically support Chris’ ideas on this topic! Let’s come up with some standard metrics for the various places that storage can be “raw, usable, and used”:

  1. Disk drive units often have excess space (raw), and this is especially true of enterprise flash units
  2. RAID sets definitely follow this pattern
  3. Storage arrays themselves can have unused usable space (as noted by Marc Farley)
  4. Storage virtualization can add another layer of utilization loss
  5. On the host side, we must consider volume managers which can perform all the functions of an array
  6. Filesystems also have raw, usable, and used space
  7. So do applications that manage storage, such as databases
  8. Add in capacity management technologies like compression and deduplication to really mess things up
  9. Finally, server virtualization can sit above or below these server variables, and virtual machines themselves often have unused space.
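One way to make "raw, usable, and used" comparable across these layers is to record the same three numbers for each one and express every layer's used space as a fraction of the raw capacity at the top of the stack. The Python sketch below is only an illustration under stated assumptions: the `Layer` class, the `waterfall` function, and all capacities are hypothetical, and the strict nesting check ignores thin provisioning, compression, and deduplication, which can make a lower layer appear larger than the space it actually consumes.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    raw: float     # capacity handed to this layer, in TB
    usable: float  # what is left after this layer's own overhead, in TB
    used: float    # what this layer has actually allocated or filled, in TB


def waterfall(layers):
    """Return (name, used as a fraction of the top layer's raw) per layer.

    Raises ValueError if a layer breaks the simple nesting rule assumed
    here: used <= usable <= raw, and raw <= used of the layer above.
    """
    if not layers:
        return []
    top_raw = layers[0].raw
    if top_raw <= 0:
        raise ValueError("top layer must have positive raw capacity")
    rows = []
    parent_used = None
    for layer in layers:
        if not (0 <= layer.used <= layer.usable <= layer.raw):
            raise ValueError(f"{layer.name}: expected used <= usable <= raw")
        if parent_used is not None and layer.raw > parent_used:
            raise ValueError(f"{layer.name}: raw exceeds used space of the layer above")
        rows.append((layer.name, layer.used / top_raw))
        parent_used = layer.used
    return rows


# Hypothetical stack; the numbers are chosen only to illustrate the effect.
stack = [
    Layer("RAID set", raw=100.0, usable=80.0, used=60.0),
    Layer("Volume manager", raw=60.0, usable=58.0, used=40.0),
    Layer("Filesystem", raw=40.0, usable=38.0, used=20.0),
    Layer("Database", raw=20.0, usable=19.0, used=10.0),
]

for name, fraction in waterfall(stack):
    print(f"{name:15s} {fraction:6.1%} of top-level raw")
```

In this invented stack, the database ends up holding 10 TB of data on 100 TB of raw disk, which is the 10:1 raw-to-used ratio discussed above.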

Simply put, there are a lot of places for a few unused bytes to hide. Anyone want to bet that 10:1 is optimistic? And we’re only talking about capacity utilization – there are whole other worlds of power efficiency and performance to consider as well…


Filed Under: Enterprise storage, Virtual Storage Tagged With: data management, deduplication, RAID, storage utilization, storage virtualization, utilization
