
Stephen Foskett, Pack Rat

Understanding the accumulation of data

Storage Utilization Remains at 2001 Levels: Low!

January 13, 2009 By Stephen

I’ve been talking about storage capacity utilization for my entire career, but the storage industry doesn’t seem to be getting anywhere. Every year or so, a new study shows that half of the storage capacity in the data center is unused. And every time, there is a predictable (and poorly thought-out) “networked storage is a waste of time” response.

The good news is that this is no longer a technical problem: Modern virtualized and networked servers ought to have decent utilization of storage capacity, and technology is improving all the time. Consider the compounded impact of modern technology on storage capacity utilization:

  • Shared storage (SAN and NAS) allows different servers to share a common pool of storage, reducing the likelihood that excess capacity will be stranded in isolated “puddles”. Pervasive use of NAS technology and the rise of simple and inexpensive iSCSI SANs mean that every system in the modern data center can use shared storage.
  • Organizational and architectural optimization allows storage to be provisioned from a common pool rather than building “stovepipe systems” with their own resources. Quicker provisioning also helps reduce over-provisioning.
  • Network connectivity allows servers to share resources, including storage, on a peer-to-peer or client-server basis, ultimately resulting in things like cloud computing.
  • Managed and utility services reduce the impact of low utilization, potentially focusing on efficiency or perhaps passing the buck to a service provider.
  • Thin provisioning might help certain systems to keep less storage in reserve.

So why don’t things get better? It’s hard to be sure why people don’t use these pervasive tools to improve storage utilization, but I do have some ideas…

  • Storage utilization might not be a priority. Utilization isn’t often in the critical path of performance or availability, so overtaxed IT departments aren’t going to focus on it.
  • Incentives can be lacking. With the cost of storage constantly falling, the effort required to improve the efficiency of already-allocated storage could just as easily be spent migrating to a newer, cheaper storage platform.
  • Virtualization has perversely harmed the efficiency of allocation. One might think that the ease and flexibility of virtual disks would improve things, but they haven’t. Server and storage virtualization just add another place to hide unused storage.
  • Metrics remain a problem, since everyone gets all balled up trying even to talk about capacity utilization.

I think this last point is something we in the industry really ought to do something about. We say “utilization” but what do we mean? Chris Evans has proposed a set of metrics for the “storage waterfall”, and I mentioned back in October that this all boils down to three key metrics: Raw, usable, and used. The key question is where to apply them!

Way back before the 2001 bubble-burst, I managed professional services for a company called StorageNetworks. At that time, I was quite aggressive in pushing this same idea, even co-writing a whitepaper on the topic titled Measuring and Improving Storage Utilization. My co-author (Jonathan Lunt) and I recently reminisced about that paper, and we both agreed that everything in it still stands today, apart from the high dollar cost per gigabyte.

Each ratio in the storage waterfall can be diagnosed and improved

I suggest that the following key storage utilization ratios (taken directly from this paper) make just as much sense today as they did then:

  • Array Overhead is the percentage of installed storage capacity that is not usable. Dividing Array Usable by Array Raw and subtracting that number from 100% yields the percent of overhead. Overhead here is usually due to the desired level of data protection (e.g. RAID, mirroring) rather than to poor management.
  • Array Utilization is the percentage of usable array capacity that is allocated to hosts. It indicates the efficiency of storage deployment operations.
  • Allocation Efficiency reflects the ratio of storage presented or allocated to hosts to the amount actually seen by them. In many mature environments this ratio is near 100% (i.e. all the storage allocated is being seen), but this ratio can be extremely difficult to determine. It relies on accurate measurements of both Array Used storage and Host Raw.
  • Host Overhead reflects the amount of storage configured for use versus the amount the host can see. Since the Host Raw metric is a function of the storage administration team and the Host Usable a function of the systems administration team, this metric is a useful measurement of how well the two functions are cooperating. Data for this classification is collected from the host.
  • File System Utilization is the amount of available file system space that actually contains data. File system utilization is familiar to most systems administrators. This metric is often shown in simple system commands like “df” on UNIX or “dir” on Windows. Data for this classification is collected from the host.
  • Total Storage Utilization summarizes how well a company manages its storage assets across the entire business. This ratio is the default storage utilization metric used in publications and reflects the actual value an enterprise is deriving from its storage asset. Care is required in calculating this ratio to ensure that it accurately indicates utilization of the storage environment. Since the result of this ratio is often used in business cases and receives wide attention, it must be both logical and defendable.
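
To make the arithmetic concrete, here is a minimal Python sketch of the ratios above. It is an illustration, not code from the paper: the class name `Waterfall`, the field names, and the sample capacities are all hypothetical. Only Array Overhead is spelled out as a formula in the list above. The direction of the Allocation Efficiency ratio (Host Raw over Array Used), the treatment of Host Overhead by analogy with Array Overhead, and the choice of Host Used over Array Raw for Total Storage Utilization are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waterfall:
    """Capacity at each stage of the waterfall, all in the same unit (e.g. TB)."""
    array_raw: float     # installed capacity
    array_usable: float  # capacity left after RAID, mirroring, etc.
    array_used: float    # capacity allocated to hosts
    host_raw: float      # capacity the hosts can actually see
    host_usable: float   # capacity configured for use on the hosts
    host_used: float     # capacity that actually contains data

    def ratios(self):
        """Return each utilization ratio as a fraction between 0 and 1."""
        return {
            # 100% minus (Array Usable / Array Raw), as described in the list
            "array_overhead": 1 - self.array_usable / self.array_raw,
            # share of usable array capacity that is allocated to hosts
            "array_utilization": self.array_used / self.array_usable,
            # ASSUMPTION: Host Raw / Array Used, so 100% means all allocated
            # storage is seen by hosts
            "allocation_efficiency": self.host_raw / self.array_used,
            # ASSUMPTION: defined by analogy with array overhead
            "host_overhead": 1 - self.host_usable / self.host_raw,
            # share of available file system space that contains data
            "file_system_utilization": self.host_used / self.host_usable,
            # ASSUMPTION: end-to-end ratio of data stored to capacity installed
            "total_storage_utilization": self.host_used / self.array_raw,
        }


# Hypothetical environment, in TB
example = Waterfall(array_raw=100, array_usable=80, array_used=60,
                    host_raw=60, host_usable=48, host_used=24)

for name, value in example.ratios().items():
    print(f"{name}: {value:.0%}")
```

Laying the six measurements out this way shows how modest losses at each stage compound: no single ratio in the sample looks alarming on its own, yet only about a quarter of the installed capacity ends up holding data.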

To these, I would add another intermediate and optional set of virtualization metrics and ratios for environments with storage or server virtualization. One could presumably also add a higher-level set of application efficiency ratios.

In the paper, Jon and I also proposed three best practices to improve storage utilization:

  1. Drive Array Utilization (Array Usable to Array Used) to greater than 90% (a storage administration responsibility)
  2. Drive Allocation Efficiency: Bring Host Usable to be as close to Array Used as possible (a joint responsibility)
  3. Drive File System Utilization (Host Usable to Host Used) above 80% (a systems administration responsibility)
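
The three practices above can be turned into a simple check. What follows is a rough sketch under stated assumptions, not a tool from the paper: the function name `check_targets` and its parameters are hypothetical. The 90% and 80% targets come from the list, but the list gives no number for "as close as possible", so the `allocation_tolerance` default of 5% is purely an assumption.

```python
def check_targets(array_usable, array_used, host_usable, host_used,
                  allocation_tolerance=0.05):
    """Compare measured capacities (same unit throughout) with the three targets.

    Returns a dict mapping each practice to a (value, passed) tuple.
    """
    # Practice 1: Array Utilization greater than 90%
    array_utilization = array_used / array_usable

    # Practice 2: Host Usable as close to Array Used as possible.
    # ASSUMPTION: "close" means the relative gap is within allocation_tolerance.
    allocation_gap = abs(array_used - host_usable) / array_used

    # Practice 3: File System Utilization above 80%
    fs_utilization = host_used / host_usable

    return {
        "array_utilization": (array_utilization, array_utilization > 0.90),
        "allocation_gap": (allocation_gap, allocation_gap <= allocation_tolerance),
        "file_system_utilization": (fs_utilization, fs_utilization > 0.80),
    }


# Hypothetical measurements, in TB
results = check_targets(array_usable=80, array_used=76,
                        host_usable=74, host_used=37)

for practice, (value, passed) in results.items():
    status = "ok" if passed else "needs attention"
    print(f"{practice}: {value:.1%} ({status})")
```

Keeping the three checks separate mirrors the split in responsibility: the first belongs to storage administration, the third to systems administration, and the second to both.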

Go read the paper and let me know what you think. Are we still stuck in 2001?

This post can also be found on Gestalt IT: Storage Utilization Remains at 2001 Levels: Low!

Filed Under: Computer History, Enterprise storage, Gestalt IT, Virtual Storage Tagged With: capacity, iSCSI, Jonathan Lunt, metrics, NAS, network storage, SAN, server virtualization, shared storage, storage area network, storage utilization, storage virtualization, StorageNetworks, thin provisioning, whitepaper
