Single-parity RAID is under attack. Caching is the hottest trend in storage. The end of the high-performance disk drive is imminent. What happened? Increasing areal bit density has caused disk capacity to grow much faster than disk performance. A presentation at Storage Networking World by Ronald Bianchini of Avere exposed the mathematics of this phenomenon. Of course, hard disk platters are not getting larger – quite the opposite. But the bits are getting smaller, so the effect is the same:
- Capacity increases with the square of linear bit density, like the area of a disc: Pi times radius squared
- Sequential performance increases only linearly with bit density, like the circumference of a disc: Pi times diameter
Therefore, sequential performance grows only linearly with bit density, while capacity grows with its square. Double the linear bit density (and the track density along with it) and you can read twice as many bits in the same amount of time, but the disk now contains four times as much data; the sketch below shows how quickly this compounds. Iterate this a dozen times, a miracle performed regularly by hard disk drive manufacturers, and you have a serious bottleneck to both performance and reliability.
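To make the compounding concrete, here’s a minimal Python sketch. The baseline figures are made up for illustration (they aren’t from Bianchini’s presentation); the point is the scaling, not the specific numbers:

```python
# Illustrative only: start from a hypothetical baseline drive and double the
# linear bit density (and the track density along with it) each generation.
rate_mb_s = 25.0      # sustained transfer rate, MB/s (scales linearly)
capacity_gb = 45.0    # capacity, GB (scales with the square)

for generation in range(1, 5):
    rate_mb_s *= 2        # twice as many bits pass under the head per second
    capacity_gb *= 4      # four times as many bits fit on the platters
    read_all_min = capacity_gb * 1000 / rate_mb_s / 60
    print(f"Gen {generation}: {rate_mb_s:6.0f} MB/s, {capacity_gb:6.0f} GB, "
          f"{read_all_min:.0f} minutes to read every byte")
```

Transfer rate doubles every generation, capacity quadruples, and the time it takes to read the whole drive doubles right along with it.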
Back in 2004, I gave this metric a name: Flush time. It’s a simple calculation that answers one question: how long would it take to read the entire contents of a hard disk drive? Let’s look at some real-world examples (a quick calculator sketch follows the list):
- In 2000, a 45 GB Western Digital 450AA disk could stream data at 25.4 MB/s, requiring 30 minutes to flush every byte out of its UDMA/66 interface. This was a massive but slow drive at the time – enterprise disks were much faster. A 2000 Quantum Atlas 10K II SCSI drive (36 GB and 31 MB/s) could flush in 19 minutes!
- A 2004-era Seagate Barracuda 7200.7 boasted 160 GB and averaged 44.5 MB/s, requiring about an hour for a full flush.
- By 2007, high-performance drives like the Hitachi 15K450 had hit 450 GB and about 100 MB/s in sustained throughput, so a full flush took about an hour and a quarter.
- Today’s enterprise drives can push 200 MB/s and average 160 MB/s across their entire 600 GB of capacity, so a flush still takes about an hour. But large-capacity SATA drives are much more popular for bulk storage. The Samsung Spinpoint F2 EcoGreen drive I use in my Drobo only delivers about 110 MB/s, requiring almost four hours to flush its 1.5 TB of capacity! Think this is unusual? Check out Hitachi’s popular E7K1000, which needs 2.5 hours at 1 TB and 118 MB/s.
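Here’s the flush-time arithmetic as a minimal Python calculator, using the capacities and sustained rates quoted above (rounded figures; real sustained rates fall off toward the inner tracks):

```python
# Flush time: how long it takes to read every byte off a drive at its
# sustained sequential rate. Capacities and rates are the figures quoted above.
drives = [
    ("WD 450AA (2000)",               45,  25.4),
    ("Quantum Atlas 10K II (2000)",   36,  31.0),
    ("Barracuda 7200.7 (2004)",      160,  44.5),
    ("Hitachi 15K450 (2007)",        450, 100.0),
    ("600 GB enterprise (today)",    600, 160.0),
    ("Spinpoint F2 EcoGreen 1.5 TB", 1500, 110.0),
    ("Hitachi E7K1000 1 TB",         1000, 118.0),
]

for name, capacity_gb, rate_mb_s in drives:
    hours = capacity_gb * 1000 / rate_mb_s / 3600   # decimal GB -> MB -> hours
    print(f"{name:30s} {hours:4.1f} hours to flush")
```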
Since (traditional) RAID rebuilds must read or write every block on a drive, they are directly bounded by flush time, and today’s massive disk drives are killing RAID. Flush time is only the minimum required time; most RAID rebuilds take much longer, as the rough sketch below shows. Then there is the issue of media reliability!
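As a rough sketch of that gap, assume a rebuild only gets a fraction of the drive’s bandwidth because the array keeps serving production I/O. The 30% share below is an assumed figure for illustration, not a measurement of any particular array:

```python
# Hypothetical rebuild estimate: flush time stretched by whatever share of the
# drive's bandwidth the rebuild is actually allowed to use.
capacity_gb = 1500       # e.g. a 1.5 TB SATA drive
rate_mb_s = 110          # sustained sequential rate
rebuild_share = 0.30     # assumed fraction of bandwidth devoted to the rebuild

flush_hours = capacity_gb * 1000 / rate_mb_s / 3600
rebuild_hours = flush_hours / rebuild_share
print(f"Minimum (flush) time: {flush_hours:.1f} hours")
print(f"Estimated rebuild:    {rebuild_hours:.1f} hours at a {rebuild_share:.0%} share")
```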
Note: Yes, I know there are alternative RAID schemes that get around this problem. Far from ignoring that point, I’ll be promoting these in future posts! Stay tuned for more on these topics…
deemery says
I’m bumping up against this as I configure a new machine by doing full-disk backup and restores. Frankly, I wasn’t expecting great performance, since I’m dumping drives between a Mac Mini with a 5400 rpm drive and a hardware RAID FireWire 800 drive (Venus DS3R) with older IDE drives. But it still takes a while, and these numbers show things won’t get much better any time soon on my low end 🙂
dave
p.s. I’m considering replacing those Venus enclosures, which have given me good service except for a noisy fan, with either a Drobo (expensive!) or a 2 drive RAID Mirror enclosure (cheaper but more restrictive)
sfoskett says
You’re not alone, Dave! RAID rebuild on my Drobo takes about 8 hours with two 1.5 TB drives and one 1 TB drive. And slow interfaces and random access don’t help – I’m only getting 25-35 MB/s on my Drobo and Iomega ix4, meaning it will take more like 12 hours to copy a single terabyte of data. Raw USB or FireWire enclosures might do somewhat better, but there’s a massive risk of data loss with single-drive units, something I’ll be writing about soon.
My advice is to buy a Drobo. It’s not all that fast, and it is expensive, but it allows you to have a protected place to put your data that you can upgrade as time goes by. Need more space? Pop in another drive and let it do the work. No migration required.
deemery says
One Drobo costs about 2 1/2 times as much as OWC’s RAID mirrored enclosures. A (presumably unlikely) hardware failure (e.g. power supply) on the Drobo would leave me hanging. I’m just not convinced the cost is worth it right now. The ‘sweet spot’ for me for a Drobo would be $200, and that doesn’t look likely.
But thanks much for the Drobo notes, they helped me understand this a lot better than anything else I’ve read.
dave
p.s. just got Snow Leopard server and a 2nd Mac Mini. I already have Leopard server on another Mini, which serves as my ‘inside server’. SL on the new Mini will replace the old G4/933 that’s running Tiger server as my externally facing machine. I want to see if I can get IPNetRouter to work on SL on the Mini; that’s a product I’ve had a lot of good experience with in the past, but it wasn’t working quite right on the G4/Tiger Server for some reason.
johnmartinoz says
I sometimes have fun telling people that, in effect, disk drives are getting slower over time … if you graph IOPS/MB over those disk curves above you get a scary descending line. On a slightly different note, it’s not just full backups that this affects. Remember that in order to protect against “bit rot” most array vendors rely on performing regular RAID scrubs, and these rarely happen at the drive’s maximum rate (the flush time above) in order to prevent too great a performance impact.
As a result, it’s unlikely that, as disk drives get bigger, RAID scrubs will happen in a timely manner, which will increasingly expose systems to a Media Error on Data Rebuild (MEDR) – i.e. if you do have to reconstruct your disk from parity, then one or more of your blocks won’t read correctly and your RAID reconstruct fails. This affects any N+1 redundancy scheme, including RAID-10. Unless you’re using dual parity techniques, you’d better hope you’ve got good backups, which, as disks get bigger, will also get harder and harder to do.
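To put rough numbers on that risk, here’s a back-of-the-envelope Python sketch. It assumes the unrecoverable read error rates typically quoted on drive datasheets (1 in 10^14 bits for desktop SATA, 1 in 10^15 for enterprise drives); these are generic figures, not measurements of any particular drive:

```python
# Rough, hypothetical estimate of hitting an unrecoverable read error (URE)
# while reading every surviving drive to rebuild an N+1 array.
def rebuild_failure_probability(drive_tb, surviving_drives, ure_per_bit):
    bits_read = drive_tb * 1e12 * 8 * surviving_drives
    # Probability that at least one bit read during the rebuild is unrecoverable.
    return 1 - (1 - ure_per_bit) ** bits_read

scenarios = [
    ("4x 1 TB desktop SATA, RAID-5",   1.0, 3, 1e-14),
    ("4x 1.5 TB desktop SATA, RAID-5", 1.5, 3, 1e-14),
    ("4x 600 GB enterprise, RAID-5",   0.6, 3, 1e-15),
]
for label, drive_tb, survivors, ure in scenarios:
    p = rebuild_failure_probability(drive_tb, survivors, ure)
    print(f"{label:32s} ~{p:.0%} chance of a URE during rebuild")
```

Even with optimistic datasheet numbers, a rebuild across a few big SATA drives has a very real chance of tripping over an unreadable sector, which is exactly why dual parity matters.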
If you’re interested in the math for this and how it affects reliability, check out “A Highly Accurate Method for Assessing Reliability of Redundant Arrays of Inexpensive Disks (RAID)” by Jon G. Elerath and Michael Pecht, IEEE Transactions on Computers, Vol. 58, No. 3, March 2009; a copy can be found at http://media.netapp.com/documents/rp-0046.pdf