<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://unfoldingneurons.com/"
	>

<channel>
	<title>Stephen Foskett, Pack Rat &#187; Metadata Archives  &#8211; Stephen Foskett, Pack Rat</title>
	<atom:link href="http://blog.fosketts.net/tag/metadata/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.fosketts.net</link>
	<description>Understanding the accumulation of data</description>
	<lastBuildDate>Fri, 10 Feb 2012 17:40:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
<atom:link rel="hub" href="http://pubsubhubbub.appspot.com" />
	<atom:link rel="hub" href="http://superfeedr.com/hubbub" />
			<item>
		<title>We Need a Storage Revolution</title>
		<link>http://blog.fosketts.net/2011/04/30/storage-revolution/</link>
		<comments>http://blog.fosketts.net/2011/04/30/storage-revolution/#comments</comments>
		<pubDate>Sat, 30 Apr 2011 16:00:00 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Apple]]></category>
		<category><![CDATA[Computer History]]></category>
		<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[BigTable]]></category>
		<category><![CDATA[CAS]]></category>
		<category><![CDATA[data management]]></category>
		<category><![CDATA[Fibre Channel]]></category>
		<category><![CDATA[Files-11]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HFS]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[Microsoft Office]]></category>
		<category><![CDATA[NAS]]></category>
		<category><![CDATA[nas storage]]></category>
		<category><![CDATA[network attached storage]]></category>
		<category><![CDATA[network storage]]></category>
		<category><![CDATA[RAID]]></category>
		<category><![CDATA[SCSI]]></category>
		<category><![CDATA[SharePoint]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[storage management]]></category>
		<category><![CDATA[Sunday series]]></category>
		<category><![CDATA[updated]]></category>
		<category><![CDATA[VMS]]></category>
		<category><![CDATA[volume manager]]></category>
		<category><![CDATA[XAM]]></category>
		<category><![CDATA[ZFS]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/2008/10/26/we-need-a-storage-revolution/</guid>
		<description><![CDATA[Storage protocols continue to mimic direct attached storage, with the concepts of block and file at its core. No amount of virtualization, and no new protocol, will fix this - we need a storage revolution.]]></description>
			<content:encoded><![CDATA[<div id="attachment_789" class="wp-caption aligncenter" style="width: 226px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://blog.fosketts.net/wp-content/uploads/2008/09/revolution-array.png" ><img class="size-medium wp-image-789 " title="Revolution Array" src="http://blog.fosketts.net/wp-content/uploads/2008/09/revolution-array-216x300.png" alt="" width="216" height="300" /></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">I think this sentiment is just as valid today as when I posted it in 2008!</p></div>
<p>Although many discussions in the storage industry focus on the relative merits of one protocol or another, the conversation occasionally turns to the core issue at hand: We continue to patch together a system based on outdated concepts. Most storage protocols continue to mimic direct attached storage, and most of our so-called networks act as point to point channels. An ultra-modern virtualized storage infrastructure with all the latest bells and whistles still holds the concepts of block and file at its core. Whenever the storage industry has tried to bring about real storage management, it has been stymied by a lack of context for data.</p>
<p>No amount of virtualization, and no new protocol, will fix this. Put simply, we need a storage revolution.</p>
<h3>Channels, Blocks, and Files</h3>
<p>Most innovation in the 1980s and early 1990s focused on moving storage out of the server. <a rel="nofollow" href="http://en.wikipedia.org/wiki/SCSI"  target="_blank">SCSI</a> allowed disk to exist in a separate cabinet, <a rel="nofollow" href="http://en.wikipedia.org/wiki/RAID"  target="_blank">RAID</a> allowed multiple physical disks to become a single virtual one, and these were mixed to become the prototype storage array. Although SCSI allowed one-to-many connectivity, it was never a true peer-to-peer network, even once it was mixed with network concepts in the form of <a rel="nofollow" href="http://en.wikipedia.org/wiki/Fibre_Channel"  target="_blank">Fibre Channel</a>.</p>
<p>Even today, SAN storage is focused on providing faster, more flexible, and feature-packed direct-attached storage. A modern virtual SAN hides a complex arrangement of caching, data protection, tiered storage, replication, and deduplication, masquerading the lot as a simple, lowly disk drive. It is sad but true that all of our work as an industry has been dedicated to recreating what we started with.</p>
<p>Networked file-based storage is no better. Although NAS devices have all the advanced features of their SAN cousins, they must present a simple file tree to the host to retain compatibility. File virtualization merely presents a larger homogeneous tree.</p>
<p>Inside the server, too, features and complexity are hidden to retain a familiar file system format. Volume managers can do anything a virtualization device can, but must present their output as a simple (though virtual) disk drive. File systems, too, have added features but still present a familiar tree of mount points, inodes, and files. Even ZFS, possibly <a href="http://blog.fosketts.net/2008/02/27/zfs-super-file-system/"  target="_self">the most advanced</a> combination of volume management and file system technology yet, must present a simple tree of storage to applications.</p>
<h3>The Metadata Roadblock</h3>
<p>This outdated paradigm, of disks and file trees, is ill-suited to today&#8217;s storage challenges. Data must be categorized so actions can be taken to preserve or destroy it based on policies. Data must be searchable so users and applications can find what they want. Data must be flexible so it can be used in new ways. Our antiquated notions are not capable of meeting these challenges.</p>
<p>One simple problem is that we lack context for our data. Most file systems merely assign to a file a name, location, owner, and security attributes. The most advanced can contain extended metadata, but this is rarely seen in practice since many applications cannot agree on how to use this data. Microsoft&#8217;s Office suite can store and share extended file attributes, for example, but these live inside the file rather than in the file system. The promise of expanded Office attributes is only realized in conjunction with a content management system like SharePoint which lies above the lowly file system.</p>
<p>What if the storage system could keep this data instead? What if it could logically group files according to project or client, mining keywords and authors, and maintaining revisions? These concepts are not new, having been implemented in content management systems for years, and certain elements appeared in file systems, like <a rel="nofollow" href="http://en.wikipedia.org/wiki/Hierarchical_File_System"  target="_blank">Apple&#8217;s HFS</a> and <a rel="nofollow" href="http://en.wikipedia.org/wiki/Files-11"  target="_blank">VMS&#8217; Files-11</a>, for decades.</p>
<h3>Cut Down the Tree</h3>
<p>File metadata would allow advanced features, but truly taking advantage of them requires a more fundamental shift in the way applications access files. Rather than sticking to a traditional hierarchy of directories in a tree (which was, after all, simply a primitive metadata system), we should remove the tree altogether. Allow files to become data objects, identified by arbitrary attributes and managed according to an overarching policy.</p>
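<p>To make the idea concrete, here is a minimal sketch of a flat store in which objects are located by attribute rather than by path, and a policy is applied to whatever matches. Everything here (the <code>ObjectStore</code> and <code>DataObject</code> names, the attribute keys, the archive policy) is hypothetical and only illustrates the concept; it is not any product&#8217;s API.</p>

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A blob of data described only by attributes, with no path or directory."""
    payload: bytes
    attributes: dict = field(default_factory=dict)


class ObjectStore:
    """Hypothetical flat store: objects are found by attribute, not by location."""

    def __init__(self):
        self._objects = []

    def put(self, payload, **attributes):
        obj = DataObject(payload, dict(attributes))
        self._objects.append(obj)
        return obj

    def find(self, **wanted):
        """Return every object whose attributes match all of the wanted pairs."""
        return [
            obj for obj in self._objects
            if all(obj.attributes.get(key) == value for key, value in wanted.items())
        ]

    def apply_policy(self, action, **wanted):
        """Run a policy action on every matching object; return how many matched."""
        matched = self.find(**wanted)
        for obj in matched:
            action(obj)
        return len(matched)


store = ObjectStore()
store.put(b"Q3 forecast", client="Acme", project="budget", author="stephen")
store.put(b"Meeting notes", client="Acme", project="migration", author="pat")
store.put(b"Invoice", client="Globex", project="budget", author="stephen")

# The same objects can be grouped along any attribute, with no tree to reorganize.
acme_docs = store.find(client="Acme")
budget_by_stephen = store.find(project="budget", author="stephen")


def mark_for_archive(obj):
    """Example policy action (an assumption): tag the object for archival."""
    obj.attributes["retention"] = "archive"


archived = store.apply_policy(mark_for_archive, client="Globex")
```

<p>A directory tree can express only one of these groupings at a time; here the client view and the project view are simply two queries over the same objects.</p>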
<p>This future vision is decidedly different from our current notion of storage, but is not so far off. Many organizations now rely on central data warehouses based on SQL-language relational databases. As many storage managers have grumbled, databases tend to ignore storage management concepts entirely, managing their own content independently.</p>
<p>But not all applications need a database back-end, so another initiative seeks to provide generic object storage for wider use. Called content-addressable storage or <a rel="nofollow" href="http://en.wikipedia.org/wiki/Content-addressable_storage"  target="_blank">CAS</a>, these devices have traditionally been used only for archival purposes, since that was their first market application. As vendors break free of proprietary interfaces in favor of open ones like XAM, CAS could transform storage itself by eliminating both file and block storage at once.</p>
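<p>The core idea of content addressing can be sketched in a few lines: the address of an object is derived from its content, so there is no file name or block number to manage. This is an illustration only. The use of SHA-256 and the class name <code>ContentAddressableStore</code> are my assumptions; real CAS devices and the XAM interface define their own addressing schemes.</p>

```python
import hashlib


class ContentAddressableStore:
    """Toy CAS: an object's address is the SHA-256 digest of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        """Store the data and return its content-derived address."""
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data  # identical content lands on the same key
        return address

    def get(self, address):
        """Fetch by address and verify the content still matches it."""
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data

    def __len__(self):
        return len(self._blobs)


cas = ContentAddressableStore()
first = cas.put(b"quarterly report, final version")
second = cas.put(b"quarterly report, final version")  # same bytes, same address
third = cas.put(b"quarterly report, revised")         # different bytes, new address
```

<p>Two properties fall out of this design: storing the same content twice consumes space once, and any change to the content produces a different address, which is part of why the approach first found a home in archiving.</p>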
<p>Similar concepts are already at work in the so-called Web 2.0 world. Non-traditional databases like Google BigTable, Amazon S3, and Hadoop allow massive scalability for object storage. API-sharing initiatives with many Web 2.0 companies can be seen as similar prototypical object storage frameworks. Any of these could be leveraged to provide a new world of data storage, and many are gaining traction even now.</p>
<h3>Stephen&#8217;s Stance</h3>
<p>Although traditional block storage is here to stay for disk drives, and tree-type file systems are likely to remain the foundation of operating system storage, new object-based concepts could change the world in fundamental ways. As applications become &#8220;web aware&#8221;, they also become object aware, increasing the likelihood of such a storage revolution. For the majority of applications, this new world would be a welcome one indeed.</p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2011. |
<a href="http://blog.fosketts.net/2011/04/30/storage-revolution/">We Need a Storage Revolution</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/apple/" title="View all posts in Apple" rel="category tag">Apple</a>, <a href="http://blog.fosketts.net/category/everything/computerhistory/" title="View all posts in Computer History" rel="category tag">Computer History</a>, <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2011/04/30/storage-revolution/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Monitoring Filesystem Metadata For Thin Provisioning</title>
		<link>http://blog.fosketts.net/2011/01/03/monitoring-filesystem-metadata-thin-provisioning/</link>
		<comments>http://blog.fosketts.net/2011/01/03/monitoring-filesystem-metadata-thin-provisioning/#comments</comments>
		<pubDate>Mon, 03 Jan 2011 17:58:26 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Computer History]]></category>
		<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Everything]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[data robotics]]></category>
		<category><![CDATA[Drobo]]></category>
		<category><![CDATA[FAT]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[HFS]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[metadata monitoring]]></category>
		<category><![CDATA[NTFS]]></category>
		<category><![CDATA[thin provisioning]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/?p=4628</guid>
		<description><![CDATA[I began by introducing the core problem: Storage isn't getting any cheaper due to storage utilization and provisioning problems. Thin provisioning isn't all it's cracked up to be, since the telephone game makes de-allocation a challenge. So now let's talk about how to make thin provisioning actually work.]]></description>
			<content:encoded><![CDATA[<p><a href="http://static.fosketts.net/wp-content/uploads/2010/12/Slide01.jpg"><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-medium wp-image-4606" title="Slide01" src="http://static.fosketts.net/wp-content/uploads/2010/12/Slide01-300x225.jpg" alt="" width="300" height="225" /></a>

One of the topics I've often written and spoken about is thin provisioning. This series of 11 articles is an edited version of <a href="http://www.slideshare.net/sfoskett/state-of-the-art-thin-provisioning" target="_blank">my thin provisioning presentation from Interop New York 2010</a>. I hope you enjoy it!</p>
<p>I began by introducing the core problem: <a href="http://blog.fosketts.net/2010/12/27/thin-provisioning-storage-cheaper/"  target="_blank">Storage isn&#8217;t getting any cheaper</a> due to <a href="http://blog.fosketts.net/2010/12/27/thin-provisioning-attacking-storage-utilization/"  target="_blank">storage utilization and provisioning problems</a>. Thin provisioning isn&#8217;t all it&#8217;s cracked up to be, since <a href="http://blog.fosketts.net/2010/12/30/thin-provisioning-playing-telephone-game/" >the telephone game</a> makes <a href="http://blog.fosketts.net/2010/12/29/deallocating-core-issue-thin-provisioning/" >de-allocation a challenge</a>. So now let&#8217;s talk about how to make thin provisioning actually work.</p>
<p><a href="http://static.fosketts.net/wp-content/uploads/2010/12/Slide11.jpg" ><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-medium wp-image-4596" title="Slide11" src="http://static.fosketts.net/wp-content/uploads/2010/12/Slide11-300x225.jpg" alt="" width="300" height="225" /></a></p>
<p>There are 100 different ways of solving the de-allocation problem, some of which have gained some prominence. They all boil down to two options:</p>
<ol>
<li>Make the <strong>server</strong> super-smart and have it communicate better</li>
<li>Make the <strong>storage</strong> super-smart and have it make educated guesses</li>
</ol>
<p>There are only a few ways that the server-side option can be implemented, and we&#8217;ll get to those. But first, let&#8217;s take a look at a sort of hybrid approach that relies on known server usage patterns: Metadata monitoring.</p>
<p><a href="http://static.fosketts.net/wp-content/uploads/2010/12/Slide12.jpg" ><img style=' display: block; margin-right: auto; margin-left: auto;'  class="aligncenter size-medium wp-image-4595" title="Slide12" src="http://static.fosketts.net/wp-content/uploads/2010/12/Slide12-300x225.jpg" alt="" width="300" height="225" /></a></p>
<p>It&#8217;s really hard for the storage to understand what the server is doing. The best example that I know of is <a href="http://blog.fosketts.net/series/drobo/"  target="_blank">the Drobo sitting under my desk</a>.</p>
<p>I love this little black box. When I got it, I configured it as eight terabytes and I put a 160-gig disk in it. That&#8217;s thin provisioning. And over time, I&#8217;m swapping out the disks and I&#8217;m doing all my stuff, and it still looks like eight terabytes. Add data, delete it, swap disks, and it always just works.</p>
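<p>Thin provisioning itself is simple to sketch: the volume advertises a large size but consumes physical space only for blocks that have actually been written. The code below is a toy model, not how the Drobo is implemented. The <code>ThinVolume</code> class is hypothetical, and the block counts are scaled-down stand-ins for the eight terabytes and 160 gigabytes mentioned above.</p>

```python
class ThinVolume:
    """Toy thin-provisioned volume: advertised size exceeds physical capacity."""

    def __init__(self, advertised_blocks, physical_blocks):
        self.advertised_blocks = advertised_blocks
        self.physical_blocks = physical_blocks
        self._mapping = {}  # logical block number -> data actually stored

    def write(self, logical_block, data):
        if logical_block not in range(self.advertised_blocks):
            raise IndexError("write beyond advertised capacity")
        is_new = logical_block not in self._mapping
        if is_new and len(self._mapping) == self.physical_blocks:
            raise RuntimeError("physical capacity exhausted; add disks")
        self._mapping[logical_block] = data  # space is allocated on first write

    def read(self, logical_block):
        # Blocks that were never written read back as zeros and cost nothing.
        return self._mapping.get(logical_block, b"\x00")

    def add_capacity(self, extra_blocks):
        """Grow the physical pool; the advertised size does not change."""
        self.physical_blocks += extra_blocks

    @property
    def allocated_blocks(self):
        return len(self._mapping)


volume = ThinVolume(advertised_blocks=8000, physical_blocks=160)
volume.write(0, b"first")
volume.write(7999, b"last")
allocated_after_two_writes = volume.allocated_blocks

# Fill the remaining physical blocks, then show that growth needs more disks.
for block in range(1, 159):
    volume.write(block, b"data")
try:
    volume.write(5000, b"one too many")
    ran_out = False
except RuntimeError:
    ran_out = True

volume.add_capacity(160)  # swap in a bigger disk
volume.write(5000, b"fits now")
```

<p>Notice that nothing in this model ever gives space back. Once a block has been written it stays allocated, which is exactly the de-allocation problem this series is about.</p>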
<p>Not a lot of people know how the Drobo works, though. One of the things that people have complained about is that it only supports certain file systems and partition schemes. The reason for this is a &#8220;magical&#8221; thing it&#8217;s doing that relates very, very closely to the topic of this discussion. The Drobo is the first thin provisioning box that I know of that directly monitors the file system.</p>
<p>What the Drobo does is this: It knows where the supported filesystems (HFS+, NTFS, EXT3, and FAT) keep the record of what&#8217;s been deleted. So the Drobo watches that spot, and when you delete something, it reclaims that space. No enterprise storage system can do this, and yet this little box under my desk does it all day long.</p>
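<p>Here is a rough sketch of what metadata monitoring means in principle. The storage snoops on writes to one known metadata location and frees any block the filesystem now marks as unused. This is emphatically not the Drobo&#8217;s actual implementation: the one-byte-per-block bitmap at block 0 is an invented toy format, and the <code>MetadataMonitoringVolume</code> class is hypothetical. Real filesystems each keep this record in their own on-disk structures, which is why only a few can be supported.</p>

```python
BITMAP_BLOCK = 0  # assumed, fixed location of the toy filesystem's allocation bitmap


class MetadataMonitoringVolume:
    """Toy thin volume that snoops on one known metadata block."""

    def __init__(self):
        self._mapping = {}  # logical block number -> data actually stored

    def write(self, logical_block, data):
        self._mapping[logical_block] = data
        if logical_block == BITMAP_BLOCK:
            # The filesystem just updated its record of what is in use.
            self._reclaim(data)

    def _reclaim(self, bitmap):
        """Drop backing storage for blocks the filesystem now marks free.

        In this toy format, byte i of the bitmap is 1 if block i is in use
        and 0 if it is free.
        """
        for block, in_use in enumerate(bitmap):
            if block != BITMAP_BLOCK and not in_use:
                self._mapping.pop(block, None)

    @property
    def allocated_blocks(self):
        return sorted(self._mapping)


volume = MetadataMonitoringVolume()

# The "filesystem" writes two files and records them in its bitmap.
volume.write(1, b"file A")
volume.write(2, b"file B")
volume.write(BITMAP_BLOCK, bytes([1, 1, 1]))
before_delete = volume.allocated_blocks

# Deleting file B only touches metadata: the bitmap now marks block 2 free.
volume.write(BITMAP_BLOCK, bytes([1, 1, 0]))
after_delete = volume.allocated_blocks
```

<p>The server never told the storage that anything was deleted. The storage worked it out by understanding the filesystem&#8217;s own bookkeeping, and that dependence on a known format is also the approach&#8217;s weakness.</p>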
<p>This is basically the super, ultimate smarts of storage. But, of course, it&#8217;s very limited. It faces a real challenge in an enterprise setting because there is much more variety. We have all these layers of virtualization and weird file systems and things like that to worry about. We just can&#8217;t expect a product like this to accommodate everybody, so we just can&#8217;t expect this kind of smarts to be put everywhere.</p>
<p>Instead, we have a variety of semaphores sent from the server to the storage array that attempt to solve the telephone game. That&#8217;s what we&#8217;re talking about next.</p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2011. |
<a href="http://blog.fosketts.net/2011/01/03/monitoring-filesystem-metadata-thin-provisioning/">Monitoring Filesystem Metadata For Thin Provisioning</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/computerhistory/" title="View all posts in Computer History" rel="category tag">Computer History</a>, <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/everything/" title="View all posts in Everything" rel="category tag">Everything</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2011/01/03/monitoring-filesystem-metadata-thin-provisioning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<series:name><![CDATA[State of the Art Thin Provisioning]]></series:name>
	</item>
	</channel>
</rss>

