<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://unfoldingneurons.com/"
	>

<channel>
	<title>Stephen Foskett, Pack Rat &#187; spindles Archives  &#8211; Stephen Foskett, Pack Rat</title>
	<atom:link href="http://blog.fosketts.net/tag/spindles/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.fosketts.net</link>
	<description>Understanding the accumulation of data</description>
	<lastBuildDate>Fri, 10 Feb 2012 17:40:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
	<atom:link rel="hub" href="http://pubsubhubbub.appspot.com" />
	<atom:link rel="hub" href="http://superfeedr.com/hubbub" />
			<item>
		<title>The Four Horsemen of Storage System Performance: The Rule of Spindles</title>
		<link>http://blog.fosketts.net/2010/08/25/4-horsemen-spindles/</link>
		<comments>http://blog.fosketts.net/2010/08/25/4-horsemen-spindles/#comments</comments>
		<pubDate>Wed, 25 Aug 2010 15:09:25 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Computer History]]></category>
		<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Personal]]></category>
		<category><![CDATA[Terabyte home]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[4 horsemen]]></category>
		<category><![CDATA[density]]></category>
		<category><![CDATA[disk performance]]></category>
		<category><![CDATA[form factors]]></category>
		<category><![CDATA[hard disk drives]]></category>
		<category><![CDATA[latency]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[RAID]]></category>
		<category><![CDATA[seek time]]></category>
		<category><![CDATA[spindle speed]]></category>
		<category><![CDATA[spindles]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/?p=3603</guid>
		<description><![CDATA[Why do some data storage solutions perform better than others? Mechanical performance, RAM caching, I/O capacity, and the intelligence of the system all have a part to play. Today we examine the rule of spindles: Adding more disk spindles is generally more effective than using faster spindles.]]></description>
			<content:encoded><![CDATA[<div id="attachment_3604" class="wp-caption aligncenter" style="width: 410px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://blog.fosketts.net/wp-content/uploads/2010/08/Four-Horsemen-400.png" ><img class="size-full wp-image-3604" title="Four Horsemen-400" src="http://blog.fosketts.net/wp-content/uploads/2010/08/Four-Horsemen-400.png" alt="" width="400" height="309" /></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">The Four Horsemen of Storage System Performance: These four ugly gentlemen stand between you and your data.</p></div>
<p>Why do some data storage solutions perform better than others? What tradeoffs are made for economy and how do they affect the system as a whole? These questions can be puzzling, but there are core truths that are difficult to avoid. Mechanical disk drives can only move a certain amount of data. RAM caching can improve performance, but only until it runs out. I/O channels can be overwhelmed with data. And above all, a system must be smart to maximize the potential of these components. These are the four horsemen of storage system performance, and they cannot be denied.</p>
<h3>The Nature of Disks</h3>
<p>Hard disk drives are getting faster all the time, but they are mechanical objects subject to the laws of physics. They spin, their heads move to seek data, they heat up and are sensitive to shock. Storage industry insiders recognize the physicality of hard disk drives in the name we apply to them: Spindles. And there is no way to escape the bounds of a spindle.</p>
<p>The performance of a hard disk drive is constrained by both its physical limitations and how we use it. Physically, a hard disk drive must spin its platters under a moving arm with a read/write head at the tip. This arm slides across the media, creating a two-dimensional map of data across the disk. Hard disk drives spin at a constant speed, so data on the outer tracks passes under the head faster than data near the center, and sequential throughput falls off as the heads move inward.</p>
<p>Hard disk drives are random-access devices, but they cannot access multiple locations at once. Modern command queueing lets the drive controller reorder requests to optimize access, yet I/O operations are still serialized before the drive can act on them. It takes a moment for the head to move (seek time) and the disk to spin (rotational latency) before data can be accessed, so sequential operations are much faster than random ones.</p>
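<p>To put a number on rotational latency, here is a minimal Python sketch. It assumes the usual rule of thumb that, on average, the platter must turn half a revolution before the right sector reaches the head. The function name is made up for illustration.</p>

```python
# Average rotational latency: on average the platter must turn half a
# revolution before the requested sector arrives under the head.
def avg_rotational_latency_ms(rpm: float):
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_revolution / 2.0


for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:6d} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```

<p>Under this assumption a 15,000 rpm drive waits about 2 ms per request on rotation alone, before any seek time is counted.</p>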
<p>Most operating systems lay data out sequentially, beginning at the edge of the disk and moving inward. Although modern file systems try to keep individual files contiguous and optimize placement to keep similar data together, seeking is inevitable. This is the nature of physical hard disk drives.</p>
<h3>Accelerating Disks</h3>
<p><a href="http://blog.fosketts.net/wp-content/uploads/2010/08/Disk-Performance-200.png" ><img style=' float: right; padding: 4px; margin: 0 0 2px 7px;'  class="alignright size-full wp-image-3605" title="Disk Performance-200" src="http://blog.fosketts.net/wp-content/uploads/2010/08/Disk-Performance-200.png" alt="" width="200" height="212" /></a>Disks can be made faster mechanically in one of three ways:</p>
<ol>
<li>Data can be packed closer together, allowing more to be read in a given amount of time. This is a natural outgrowth of storage density improvements, and explains why today&#8217;s slowest hard disks have much better sequential access performance than the fastest enterprise drives of just a few years past.</li>
<li>Spindle speeds can be accelerated. This has a dual benefit of passing more data under the heads in the same amount of time and reducing the time it takes for the proper spot on the disk to pass the heads. Maximum spindle speed has held at 15,000 rpm for about a decade, but the slowest drives have accelerated in recent years, with 5,400 rpm now the minimum and 7,200 rpm now common even for portable devices.</li>
<li>Disk media can be made smaller, reducing the area the heads must travel across. Increasing media density allows for the viability of smaller disks outside volume-sensitive areas like laptops. Many high-speed enterprise drives use small-diameter media even where the outer shell remains at the standard 3.5&#8243; form factor.</li>
</ol>
<p>Each of these methods is limited by practicality, however. Doomsayers have been declaring that density is reaching the limits of technology for decades, and while the industry has always surpassed these expectations, it can only adapt so far. Spindle speeds beyond 15,000 rpm are impractical as well, especially since this limits the density of storage used. And geometry limits how small hard disk media can get: The spindle itself requires some space, leaving less area for data storage. In fact, it appears that today&#8217;s 15,000 rpm 2.5&#8243; hard disk drives are as quick and small as practicality allows. Only increasing density will improve their native performance further.</p>
<h3>Combining Spindles</h3>
<p>Although they are quick, the mechanical limitations of hard disk drives make them the first suspect in cases of poor storage performance. A single modern hard disk drive can easily read and write over 100 MB per second, with the fastest drives pushing twice that much data. But most applications do not make this sort of demand. Instead, they ask the drive to seek a certain piece of data, introducing latency and reducing average performance by orders of magnitude.</p>
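<p>A rough back-of-the-envelope model shows where those orders of magnitude go. The Python sketch below is only an illustration: the 8.5 ms average seek, the 7,200 rpm spindle, and the 4 KiB request size are assumed values, not measurements of any particular drive, and the function name is hypothetical.</p>

```python
def random_throughput_mb_s(seek_ms: float, rpm: float, io_kib: float,
                           sequential_mb_s: float):
    """Estimate throughput when every request pays a seek and a rotation."""
    rotational_ms = 60_000.0 / rpm / 2.0  # half a revolution on average
    # Time to move the data itself once the head is in position.
    transfer_ms = (io_kib / 1024.0) / sequential_mb_s * 1000.0
    service_ms = seek_ms + rotational_ms + transfer_ms
    requests_per_second = 1000.0 / service_ms
    return requests_per_second * io_kib / 1024.0


# Assumed figures: 8.5 ms seek, 7,200 rpm, 4 KiB requests, 100 MB/s sequential.
sequential = 100.0
random_rate = random_throughput_mb_s(8.5, 7200, 4, sequential)
print(f"sequential: {sequential:.1f} MB/s")
print(f"random:     {random_rate:.2f} MB/s")
print(f"slowdown:   {sequential / random_rate:.0f}x")
```

<p>With these assumptions the drive delivers roughly 0.3 MB per second on small random reads, a few hundred times slower than its sequential rate. Almost all of the service time is spent positioning the head rather than transferring data.</p>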
<p>Then there is the I/O blender of multitasking operating systems and virtualization. Just as each application requests data spread across a disk, multitasking operating systems allow multiple applications and process threads to request their own data at once. File system development has lagged behind the advent of multi-core and multi-thread CPUs, leading to frustrating slowdowns while the operating system waits for the hard disk drive. Virtualization magnifies this, allowing multiple operating systems running multiple applications with multiple threads to access storage all at once.</p>
<p>The key innovation in enterprise storage, redundant arrays of independent disks or RAID, was designed to overcome the limits of disk spindles. In <a href="http://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf"  target="_blank">their seminal paper on RAID</a>, Patterson, Gibson, and Katz focus on &#8220;the I/O crisis&#8221; caused by accelerating CPU and memory performance. They suggest five methods of combining spindles (now called RAID levels) to accelerate I/O performance to meet this challenge. Many of today&#8217;s storage system developments are outgrowths of this insight, allowing many more spindles to share the I/O load or optimizing it between different drive types.</p>
<h3>The Rule of Spindles</h3>
<p>This is the rule of spindles: Adding more disk spindles is generally more effective than using faster spindles. Today&#8217;s storage systems often spread I/O across dozens of hard disk drives using concepts of stacked RAID, large sets, subdisk RAID, and wide striping.</p>
<p>Faster spindles can certainly help performance, and this is evident when one examines the varying performance of midrange storage systems. Those that rely on large, slow drives are much slower than the same systems packed with smaller, quicker drives. But the rule of spindles cannot be ignored. Systems that spread data across more spindles, regardless of the capabilities of each individual disk, are bound to be quicker than those that use fewer drives.</p>
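<p>The rule can be sketched with the same kind of arithmetic. This Python example compares a dozen 7,200 rpm drives against four 15,000 rpm drives. The seek times (8.5 ms and 3.5 ms) are assumed for illustration, and the model is deliberately idealized: it assumes random I/O is spread evenly across every spindle and ignores RAID, cache, and controller overhead.</p>

```python
def drive_iops(seek_ms: float, rpm: float):
    """Random IOPS for one drive: one seek plus half a revolution per request."""
    rotational_ms = 60_000.0 / rpm / 2.0
    return 1000.0 / (seek_ms + rotational_ms)


def array_iops(drive_count: int, seek_ms: float, rpm: float):
    """Idealized aggregate: I/O spread evenly, no RAID or controller overhead."""
    return drive_count * drive_iops(seek_ms, rpm)


# Assumed seek times: 8.5 ms for the 7,200 rpm drives, 3.5 ms for the 15,000 rpm drives.
many_slow = array_iops(12, seek_ms=8.5, rpm=7200)
few_fast = array_iops(4, seek_ms=3.5, rpm=15000)
print(f"12 x  7,200 rpm: {many_slow:.0f} IOPS")
print(f" 4 x 15,000 rpm: {few_fast:.0f} IOPS")
```

<p>Each fast drive handles more than twice the random I/O of a slow one in this model, yet the wider set of slow drives still comes out ahead in aggregate. Real systems will land somewhere below these idealized totals.</p>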
<h3>Onward: Cache, I/O, and Smarts</h3>
<p>The horseman of spindles is harsh, but he does not rule the day. There are many ways to overcome his limits and his three brothers often come into play. These are cache, which bypasses the spindle altogether; I/O, which can constrain even the fastest combination of disk and cache; and the intelligence of the whole system, which limits or accelerates all the rest. We will examine these horsemen in the future!</p>
<p><em>I&#8217;ve been meaning to write this up for a long time. Thanks for listening and commenting!</em></p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2010. |
<a href="http://blog.fosketts.net/2010/08/25/4-horsemen-spindles/">The Four Horsemen of Storage System Performance: The Rule of Spindles</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/computerhistory/" title="View all posts in Computer History" rel="category tag">Computer History</a>, <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/everything/personal/" title="View all posts in Personal" rel="category tag">Personal</a>, <a href="http://blog.fosketts.net/category/everything/terabytehome/" title="View all posts in Terabyte home" rel="category tag">Terabyte home</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2010/08/25/4-horsemen-spindles/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<series:name><![CDATA[4 Horsemen]]></series:name>
	</item>
		<item>
		<title>Turning the Page on RAID</title>
		<link>http://blog.fosketts.net/2008/09/14/turning-page-raid/</link>
		<comments>http://blog.fosketts.net/2008/09/14/turning-page-raid/#comments</comments>
		<pubDate>Sun, 14 Sep 2008 07:00:48 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Computer History]]></category>
		<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[AutoRAID]]></category>
		<category><![CDATA[Compellent]]></category>
		<category><![CDATA[Dell]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[EqualLogic]]></category>
		<category><![CDATA[HP]]></category>
		<category><![CDATA[LUN]]></category>
		<category><![CDATA[NetApp]]></category>
		<category><![CDATA[RAID]]></category>
		<category><![CDATA[RAID 4]]></category>
		<category><![CDATA[RAID 6]]></category>
		<category><![CDATA[spindles]]></category>
		<category><![CDATA[storage virtualization]]></category>
		<category><![CDATA[Sun]]></category>
		<category><![CDATA[Sunday series]]></category>
		<category><![CDATA[Symmetrix]]></category>
		<category><![CDATA[WAFL]]></category>
		<category><![CDATA[ZFS]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/2008/09/14/turning-the-page-on-raid/</guid>
		<description><![CDATA[This is part of an ongoing series of longer articles I will be posting every Sunday as part of an experiment in offering more in-depth content. It has been the core technology behind the storage industry since day one, but the sun is setting on traditional RAID technology. After two decades of refinement and fragmentation, we are [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-left: 30px;"><em>This is part of an ongoing </em><a href="http://blog.fosketts.net/tag/Sunday-series/"  target="_self"><em>series of longer articles I will be posting every Sunday</em></a><em> as part of an experiment in offering more in-depth content.</em></p>
<p>It has been the core technology behind the storage industry since day one, but the sun is setting on traditional RAID technology. After two decades of refinement and fragmentation, we are abandoning the core concepts of disk-centric data protection as storage and servers go virtual. Next-generation storage products will feature refined and integrated capabilities based on pools of storage rather than combinations of disk drives, and we will all benefit from improved reliability and performance.</p>
<p><span id="more-613"></span></p>
<p><strong>RAID Classic</strong></p>
<p>Early storage systems were revolutionary, in physically removing storage from the CPU, in enabling sharing of storage between multiple CPUs, and especially in virtualizing disk drives using RAID. When <a rel="nofollow" href="http://www.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-391.pdf"  target="_top">Patterson, Gibson, and Katz proposed the creation of a redundant array of inexpensive disks (RAID)</a> in 1987, they specified <a rel="nofollow" href="http://en.wikipedia.org/wiki/RAID#Standard_levels"  target="_blank">five numbered “levels”</a>. Each level had its own features and benefits, but all centered on the idea that a static set of disk drives would be grouped together and presented to higher-level systems as a single drive. Storage devices, as a rule, mapped host data back to these integral disk sets, sometimes sharing a single RAID group among multiple “LUNs”, but never spreading data more broadly. Storage has remained stuck with small sets of drives ever since.</p>
<p>The core insight of the 1980s remains true: More spindles means better performance. Although additional overhead dulls the impact somewhat, the benefit of spreading data across multiple drives can be tremendous. A typical RAID set offers much better performance than the drives alone, and can handle a mechanical failure as a bonus.</p>
<p>Cracks are appearing in the RAID veneer, however. Double drive failures are much more common than one would expect, leading to the development of hot spare drives and dual-parity <a rel="nofollow" href="http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_6"  target="_blank">RAID 6</a>. If four drives perform well, then forty drives perform much better, leading to the common practice of “stacking” one RAID set on others. Caches and specialized processors were introduced to overcome the performance issues related to parity calculation.</p>
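<p>For readers who have not seen a parity calculation, here is a minimal Python sketch of single parity, the XOR scheme behind RAID 4 and RAID 5. It is not dual-parity RAID 6, which needs a second, more involved calculation. The block contents and the helper name are invented for the example.</p>

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three data "drives" holding one stripe, plus a parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

<p>Every write has to update the parity block as well as the data, which is the overhead that caches and dedicated processors were introduced to hide.</p>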
<p>But traditional RAID cannot overcome today’s most critical storage issues. As drives have become larger, the tiny chance of an <a href="http://www.cs.cmu.edu/~bianca/fast07.pdf"  target="_blank">unrecoverable media error</a> compounds, <a href="http://storagemojo.com/2007/07/19/why-arent-disk-reads-more-reliable/"  target="_blank">becoming a certainty</a>. Even dual-parity will not be able to guarantee data protection on the massive disks predicted for the near future – statistics cannot be denied. The latest disks contain so much data, without commensurate improvements in throughput, that rebuild times have skyrocketed, resulting in hours or days of reduced data protection.</p>
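<p>The statistics are easy to sketch. The Python example below assumes an unrecoverable read error rate of one per 10<sup>14</sup> bits read, a figure commonly quoted on desktop drive spec sheets, and treats each bit as independent. Both are simplifying assumptions, and real drives may behave better or worse.</p>

```python
import math


def p_unrecoverable_error(bytes_read: float, error_rate_per_bit: float = 1e-14):
    """Chance of at least one unrecoverable error while reading bytes_read bytes.

    Assumes independent bit errors: P = 1 - (1 - rate) ** bits.
    log1p and expm1 keep the arithmetic accurate for such a tiny rate.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-error_rate_per_bit))


# Reading 1 TB, then the 12 TB a rebuild might read from a hypothetical large set.
for terabytes in (1, 12):
    p = p_unrecoverable_error(terabytes * 1e12)
    print(f"{terabytes:2d} TB read: {p:.1%} chance of an unrecoverable error")
```

<p>Under these assumptions, reading a single terabyte carries a chance of under ten percent, while reading twelve terabytes end to end, as a rebuild of a large RAID set might, is more likely than not to hit an error.</p>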
<p>RAID is also ill-suited to the demands of virtualized systems, where <a rel="nofollow" href="http://joergsstorageblog.blogspot.com/2008/06/vmware-and-how-it-effects-storage.html"  target="_blank">predictable I/O patterns become fragmented</a>. It cannot provide tiered storage or account for changing requirements over time. It cannot take advantage of the latest high-performance solid state storage technology. It cannot be used in cloud architectures, with massive numbers of small devices clustered together. It interferes with power-saving spin-down ideas. Most RAID implementations cannot even grow or shrink with the addition or removal of a disk. In short, traditional RAID cannot do what we now need storage to do.</p>
<p><strong>RAID is Dead</strong></p>
<p>Although most vendors still use the name, nearly every one has abandoned much of the classic RAID technology. EMC’s Symmetrix pioneered the idea of sub-disk RAID, pairing just a portion of each disk with others to reduce the impact of “hot spots”. <a rel="nofollow" href="http://docs.hp.com/en/B2355-90950/apas04.html"  target="_blank">HP’s AutoRAID</a> added the ability to dynamically move data from one RAID type to another to balance performance. And NetApp paired disk management so closely with their filesystem that they were able to use RAID 4 and the flexibility it brings.</p>
<p>Today, a new generation of devices has even evolved beyond RAID’s concept of coherent disk sets. Compellent, Dell EqualLogic, 3PAR and others focus on blocks of data, moving portions of a LUN between RAID sets, disk drive types, and even inner or outer tracks based on access patterns. With these devices, a single LUN could encompass data on every drive in the storage array. And the latest clustered arrays can spread data across multiple storage nodes to scale performance and protection.</p>
<p>These innovative devices point the way to a future in which virtual storage is serviced and protected very differently than in the past. Perhaps software like Sun’s ZFS serves to illustrate this future best: It unifies storage as a single pool, intelligently protecting it and presenting flexible storage volumes to the operating system. Although Sun calls its data protection scheme “RAID-Z”, it has little in common with its namesake. Like NetApp’s WAFL, the copy-on-write ZFS filesystem is totally integrated with the layout of data on disk, allowing mobility and efficient use of storage. A single pool can include striping, single- or dual-parity, and mirroring, and disks can be added as needed. Importantly, ZFS also checksums every block and verifies the checksum on each read, detecting silent disk errors.</p>
<p><strong>Long Live RAID</strong></p>
<p>The post-RAID future will see these concepts spread across all enterprise storage devices. Disks will be pooled rather than segregated into RAID sets. Tight integration between layout and data protection will allow for much greater flexibility, integrating tiering and differing data protection strategies in a unified whole. Storage virtualization will allow mobility of data within these future storage arrays, and clustering will enable massive scalability.</p>
<p>Two things will likely remain to remind us of Patterson, Gibson, and Katz, however. First, the core principle that multiple drives working as one yield dividends in terms of performance and data protection. And second, that whatever we use should be called RAID, even though the definition of that term has changed beyond recognition in the last two decades.</p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2008. |
<a href="http://blog.fosketts.net/2008/09/14/turning-page-raid/">Turning the Page on RAID</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/computerhistory/" title="View all posts in Computer History" rel="category tag">Computer History</a>, <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2008/09/14/turning-page-raid/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

