
Stephen Foskett, Pack Rat

Understanding the accumulation of data


Granularity of Thin Provisioning Approaches

January 10, 2011 By Stephen

One of the topics I've often written and spoken about is thin provisioning. This series of 11 articles is an edited version of my thin provisioning presentation from Interop New York 2010. I hope you enjoy it!

Although I consider it the main stumbling block for thin provisioning, communication (or lack thereof) is being addressed with metadata monitoring, WRITE_SAME, the Veritas Thin API, and other ideas. But communication isn’t the only issue.

Let’s talk about page sizes. You’ll often see vendors tossing this “softball” objection at their competitors, claiming that their (smaller) page size makes for more effective thin provisioning. And that’s true, to some extent, but perhaps not the end of the story.

Look at the top block in this stack. The light background box is the page, and the colored boxes represent data. If your storage is written in “pages” of this size, you can’t thin it.

What if we used a smaller page? What if my page is a quarter of that size, as in the second row? I still can’t thin it out, because my data is spread all over the place.

Remember worrying about fragmentation back in the days of DOS and Windows and FAT filesystems? It’s kind of like this.

Because we’re using zero page reclaim, the whole page has to be zero to be reclaimed. If your data is all over the place, if there’s even one bit that’s not zero on a page, we’re not going to reclaim that whole page.
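The all-or-nothing rule can be sketched in a few lines of Python. This is only an illustration of the idea, not any vendor's implementation: the function name `reclaimable_pages` and the tiny 16-byte "volume" are made up for the example.

```python
from typing import List


def reclaimable_pages(volume: bytes, page_size: int) -> List[int]:
    """Return the indexes of pages that contain nothing but zero bytes."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    zero_pages = []
    for index, offset in enumerate(range(0, len(volume), page_size)):
        page = volume[offset:offset + page_size]
        # any() is False only when every byte in the page is 0,
        # so a single set bit keeps the whole page allocated.
        if not any(page):
            zero_pages.append(index)
    return zero_pages


# A hypothetical 16-byte volume with one non-zero bit in the second 4-byte page.
volume = bytearray(16)
volume[5] = 0x01

print(reclaimable_pages(bytes(volume), 4))   # [0, 2, 3]
print(reclaimable_pages(bytes(volume), 16))  # []
```

With 4-byte pages, three of the four pages come back; with one 16-byte page, that single bit pins the whole thing.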

Now let’s return to our illustration. If we use a somewhat smaller page, as in the bottom two rows, we can reclaim some space. If we use a really tiny page, we can even reclaim half the space.
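To put rough numbers on that, here is a small sketch that measures how much of the empty space becomes reclaimable as the page shrinks. The 16-block layout is invented for the example (it is not the layout from the illustration), and `reclaimed_fraction` is a hypothetical helper name.

```python
def reclaimed_fraction(used_blocks, page_blocks):
    """Fraction of the empty blocks that sit in completely empty pages."""
    total_empty = used_blocks.count(False)
    if total_empty == 0:
        return 0.0
    reclaimed = 0
    for start in range(0, len(used_blocks), page_blocks):
        page = used_blocks[start:start + page_blocks]
        # A page is reclaimable only if none of its blocks hold data.
        if not any(page):
            reclaimed += len(page)
    return reclaimed / total_empty


# Hypothetical layout: 16 blocks, True means the block holds data.
layout = [True, False, False, False,
          False, False, True, False,
          False, False, False, False,
          True, True, False, False]

for page_blocks in (16, 8, 4, 2, 1):
    share = reclaimed_fraction(layout, page_blocks)
    print(f"page of {page_blocks:>2} blocks: {share:.0%} of empty space reclaimed")
```

For this particular layout the two largest page sizes reclaim nothing, the middle one reclaims a third of the empty space, and only a one-block page gets all of it back. A different scattering of data would give different numbers, which is the point: the result depends on where the data lands as much as on the page size.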

We’re still not reclaiming all the space, though. At the beginning of this series, I showed the “simplified perfect-world” thin provisioning illustration. In that picture, the half-empty barrel was perfectly reclaimed thanks to this technology. We will never get there unless we are using really minuscule pages. But we can get somewhat close. Maybe we can thin out three-quarters of the empty space.

But some vendors use really big pages. Some folks made fun of Hitachi for using 42-megabyte pages, since a single non-zero bit anywhere in those 42 megabytes means the Hitachi will not thin that page. It also won’t migrate it for automated storage tiering. But others use even bigger pages, up to a gigabyte in size. And 42 MB isn’t that bad in practice.

I know of a company that’s doing four-kilobyte pages. And EMC actually allocates one-gigabyte slices of storage for writing on the CLARiiON, even though their thin size is 8 KB. So is the CLARiiON page size 8 KB or 1 GB? It’s very confusing to me (and probably the customer too)…

The trouble with 4 K or 8 K pages is that they make for an awful lot of pages to keep track of. Consider the analogy of hard disk drive sector sizes. Until recently, a disk addressed with 32-bit sector numbers (as in the traditional MBR partition table) could only get to about 2.2 terabytes, because drives still used 512-byte sectors, and 512 bytes times 2^32 sectors is 2,048 GiB (2 TiB). So 512 bytes makes for greater efficiency in theory, but hurts scalability in practice. So, the disk drive industry is moving to 4 K sectors.
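The sector arithmetic is easy to check. The sketch below assumes a 32-bit sector address, so 2^32 addressable sectors, and simply multiplies by the sector size; the constant and function names are made up for the example.

```python
MAX_SECTORS_32BIT = 2 ** 32  # how many sectors a 32-bit address can name


def max_capacity_bytes(sector_size: int) -> int:
    """Largest capacity reachable with 32-bit sector addressing."""
    return sector_size * MAX_SECTORS_32BIT


for sector_size in (512, 4096):
    capacity = max_capacity_bytes(sector_size)
    print(f"{sector_size:>4}-byte sectors: {capacity / 2**40:.0f} TiB "
          f"({capacity / 10**12:.1f} TB)")
```

With 512-byte sectors the ceiling is 2 TiB, or roughly 2.2 decimal terabytes. Moving to 4 K sectors raises it eightfold, to 16 TiB, without changing the width of the address.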

It’s exactly the same thing with thin provisioning: you’ve got to keep track of all these gazillions and gazillions of pages. From a vendor perspective, you can save a lot of horsepower and make it a lot easier to implement if you have bigger pages. Bigger pages also mean you’re not moving stuff around as much for automated tiering.
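To get a feel for the bookkeeping, here is a rough count of how many pages an array would have to track at the page sizes mentioned in this post. The 100 TiB capacity is an arbitrary figure chosen for illustration, and the count says nothing about how much metadata any real array keeps per page.

```python
KIB, MIB, GIB, TIB = 2**10, 2**20, 2**30, 2**40


def pages_to_track(capacity_bytes: int, page_bytes: int) -> int:
    """Number of pages needed to cover the capacity, rounding up."""
    # Ceiling division, so a partial final page still counts as a page.
    return -(-capacity_bytes // page_bytes)


capacity = 100 * TIB  # hypothetical array size

page_sizes = [("4 KiB", 4 * KIB), ("8 KiB", 8 * KIB),
              ("42 MiB", 42 * MIB), ("1 GiB", GIB)]

for label, size in page_sizes:
    print(f"{label:>6} pages: {pages_to_track(capacity, size):,}")
```

At 4 KiB that is tens of billions of pages; at 42 MiB it is a couple of million, and at 1 GiB it is about a hundred thousand. That difference of several orders of magnitude is the horsepower a vendor saves by choosing a bigger page.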

I’m not going to throw rocks at HDS or anyone else over page sizes. I actually don’t think 42 MB is that bad, because the biggest problem with underutilization is not inside a file system. In my experience, the big problem is storage that’s not used at all.

When I used to do storage assessments, it was very common to find LUNs that were allocated and not used at all; not even touched. Your page size doesn’t matter if a LUN is not even touched: it’s going to be thinned out no matter what. So, regardless of the page size, thin provisioning will probably save more space outside a filesystem than within one, especially if your systems administrators are doing a reasonably good job of storage management. And even if they’re not doing a good job, there’s probably 42 megs of zeros that can be thinned out anyway.

So, I’m not as worried about the size of the pages. Granularity is an architectural decision, and larger pages are not the end of the world. Ask your vendor if they support thin provisioning and what the granularity or page size is, and think about how that’s going to affect you. At the end of the day, it’s probably going to yield about the same result no matter what the page size is.


Filed Under: Computer History, Enterprise storage, Everything, Virtual Storage Tagged With: 4K, CLARiiON, EMC, granularity, HDP, HDS, page size, thin provisioning, virtual provisioning


