<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://unfoldingneurons.com/"
	>

<channel>
	<title>Stephen Foskett, Pack Rat &#187; CAS Archives  &#8211; Stephen Foskett, Pack Rat</title>
	<atom:link href="http://blog.fosketts.net/tag/cas/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.fosketts.net</link>
	<description>Understanding the accumulation of data</description>
	<lastBuildDate>Fri, 10 Feb 2012 17:40:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
<atom:link rel="hub" href="http://pubsubhubbub.appspot.com" />
	<atom:link rel="hub" href="http://superfeedr.com/hubbub" />
			<item>
		<title>We Need a Storage Revolution</title>
		<link>http://blog.fosketts.net/2011/04/30/storage-revolution/</link>
		<comments>http://blog.fosketts.net/2011/04/30/storage-revolution/#comments</comments>
		<pubDate>Sat, 30 Apr 2011 16:00:00 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Apple]]></category>
		<category><![CDATA[Computer History]]></category>
		<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[BigTable]]></category>
		<category><![CDATA[CAS]]></category>
		<category><![CDATA[data management]]></category>
		<category><![CDATA[Fibre Channel]]></category>
		<category><![CDATA[Files-11]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HFS]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[Microsoft Office]]></category>
		<category><![CDATA[NAS]]></category>
		<category><![CDATA[nas storage]]></category>
		<category><![CDATA[network attached storage]]></category>
		<category><![CDATA[network storage]]></category>
		<category><![CDATA[RAID]]></category>
		<category><![CDATA[SCSI]]></category>
		<category><![CDATA[SharePoint]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[storage management]]></category>
		<category><![CDATA[Sunday series]]></category>
		<category><![CDATA[updated]]></category>
		<category><![CDATA[VMS]]></category>
		<category><![CDATA[volume manager]]></category>
		<category><![CDATA[XAM]]></category>
		<category><![CDATA[ZFS]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/2008/10/26/we-need-a-storage-revolution/</guid>
		<description><![CDATA[Storage protocols continue to mimic direct attached storage, with the concepts of block and file at its core. No amount of virtualization, and no new protocol, will fix this - we need a storage revolution.]]></description>
			<content:encoded><![CDATA[<div id="attachment_789" class="wp-caption aligncenter" style="width: 226px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://blog.fosketts.net/wp-content/uploads/2008/09/revolution-array.png" ><img class="size-medium wp-image-789 " title="Revolution Array" src="http://blog.fosketts.net/wp-content/uploads/2008/09/revolution-array-216x300.png" alt="" width="216" height="300" /></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">I think this sentiment is just as valid today as when I posted it in 2008!</p></div>
<p>Although many discussions in the storage industry focus on the relative merits of one protocol or another, the conversation occasionally turns to the core issue at hand: We continue to patch together a system based on outdated concepts. Most storage protocols continue to mimic direct-attached storage, and most of our so-called networks act as point-to-point channels. An ultra-modern virtualized storage infrastructure with all the latest bells and whistles still holds the concepts of block and file at its core. Whenever the storage industry has tried to bring about real storage management, it has been stymied by a lack of context for data.</p>
<p>No amount of virtualization, and no new protocol, will fix this. Put simply, we need a storage revolution.</p>
<h3>Channels, Blocks, and Files</h3>
<p>Most innovation in the 1980s and early 1990s focused on moving storage out of the server. <a rel="nofollow" href="http://en.wikipedia.org/wiki/SCSI"  target="_blank">SCSI</a> allowed disk to exist in a separate cabinet, <a rel="nofollow" href="http://en.wikipedia.org/wiki/RAID"  target="_blank">RAID</a> allowed multiple physical disks to become a single virtual one, and these were mixed to become the prototype storage array. Although SCSI allowed one-to-many connectivity, it was never a true peer-to-peer network, even once it was mixed with network concepts in the form of <a rel="nofollow" href="http://en.wikipedia.org/wiki/Fibre_Channel"  target="_blank">Fibre Channel</a>.</p>
<p>Even today, SAN storage is focused on providing faster, more flexible, and feature-packed direct-attached storage. A modern virtual SAN hides a complex arrangement of caching, data protection, tiered storage, replication, and deduplication, masquerading the lot as a simple, lowly disk drive. It is sad but true that all of our work as an industry has been dedicated to recreating what we started with.</p>
<p>Networked file-based storage is no better. Although NAS devices have all the advanced features of their SAN cousins, they must present a simple file tree to the host to retain compatibility. File virtualization merely presents a larger homogeneous tree.</p>
<p>Inside the server, too, features and complexity are hidden to retain a familiar file system format. Volume managers can do anything a virtualization device can, but must present their output as a simple (though virtual) disk drive. File systems, too, have added features but still present a familiar tree of mount points, inodes, and files. Even ZFS, possibly <a href="http://blog.fosketts.net/2008/02/27/zfs-super-file-system/"  target="_self">the most advanced</a> combination of volume management and file system technology yet, must present a simple tree of storage to applications.</p>
<h3>The Metadata Roadblock</h3>
<p>This outdated paradigm, of disks and file trees, is ill-suited to today&#8217;s storage challenges. Data must be categorized so actions can be taken to preserve or destroy it based on policies. Data must be searchable so users and applications can find what they want. Data must be flexible so it can be used in new ways. Our antiquated notions are not capable of meeting these challenges.</p>
<p>One simple problem is that we lack context for our data. Most file systems merely assign to a file a name, location, owner, and security attributes. The most advanced can contain extended metadata, but this is rarely seen in practice since many applications cannot agree on how to use this data. Microsoft&#8217;s Office suite can store and share extended file attributes, for example, but these live inside the file rather than in the file system. The promise of expanded Office attributes is only realized in conjunction with a content management system like SharePoint which lies above the lowly file system.</p>
<p>What if the storage system could keep this data instead? What if it could logically group files according to project or client, mining keywords and authors, and maintaining revisions? These concepts are not new, having been implemented in content management systems for years, and certain elements appeared in file systems, like <a rel="nofollow" href="http://en.wikipedia.org/wiki/Hierarchical_File_System"  target="_blank">Apple&#8217;s HFS</a> and <a rel="nofollow" href="http://en.wikipedia.org/wiki/Files-11"  target="_blank">VMS&#8217; Files-11</a>, for decades.</p>
<h3>Cut Down the Tree</h3>
<p>File metadata would allow advanced features, but truly taking advantage of them requires a more fundamental shift in the way applications access files. Rather than sticking to a traditional hierarchy of directories in a tree (which was, after all, simply a primitive metadata system), we should remove the tree altogether. Allow files to become data objects, identified by arbitrary attributes and managed according to an overarching policy.</p>
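<p>To make the idea concrete, here is a minimal sketch of a flat store in which objects are found by attribute rather than by path. Everything in it is hypothetical and for illustration only: the <code>AttributeStore</code> class, its method names, and the sample attributes (<code>client</code>, <code>project</code>, <code>author</code>, <code>retain_years</code>) are my assumptions, not any product&#8217;s API.</p>

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A stored object: an opaque payload plus free-form attributes."""
    payload: bytes
    attributes: dict = field(default_factory=dict)


class AttributeStore:
    """Hypothetical flat store: no directories, only objects and attributes."""

    def __init__(self):
        self._objects = []

    def put(self, payload, **attributes):
        obj = DataObject(payload, dict(attributes))
        self._objects.append(obj)
        return obj

    def find(self, **criteria):
        """Return every object whose attributes match all given criteria."""
        return [
            obj for obj in self._objects
            if all(obj.attributes.get(k) == v for k, v in criteria.items())
        ]

    def apply_policy(self, retain_years, **criteria):
        """Tag matching objects with a retention period (a stand-in for policy)."""
        matched = self.find(**criteria)
        for obj in matched:
            obj.attributes["retain_years"] = retain_years
        return len(matched)


store = AttributeStore()
store.put(b"Q3 proposal", client="Acme", project="migration", author="stephen")
store.put(b"Q3 invoice", client="Acme", project="billing", author="pat")
store.put(b"Meeting notes", client="Initech", project="migration", author="stephen")

acme_docs = store.find(client="Acme")                 # query by client, not by path
tagged = store.apply_policy(7, project="billing")     # policy applied by attribute
```

<p>The same object can be reached through any combination of attributes, which is exactly what a single directory path cannot offer.</p>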
<p>This future vision is decidedly different from our current notion of storage, but is not so far off. Many organizations now rely on central data warehouses based on SQL-language relational databases. As many storage managers have grumbled, databases tend to ignore storage management concepts entirely, managing their own content independently.</p>
<p>But not all applications need a database back-end, so another initiative seeks to provide generic object storage for wider use. Called content-addressable storage or <a rel="nofollow" href="http://en.wikipedia.org/wiki/Content-addressable_storage"  target="_blank">CAS</a>, these devices have traditionally been used only for archival purposes, since that was their first market application. As vendors break free of proprietary interfaces in favor of open ones like XAM, CAS could transform storage itself by eliminating both file and block storage at once.</p>
<p>Similar concepts are already at work in the so-called Web 2.0 world. Non-traditional databases like Google BigTable, Amazon S3, and Hadoop allow massive scalability for object storage. API-sharing initiatives with many Web 2.0 companies can be seen as similar prototypical object storage frameworks. Any of these could be leveraged to provide a new world of data storage, and many are gaining traction even now.</p>
<h3>Stephen&#8217;s Stance</h3>
<p>Although traditional block storage is here to stay for disk drives, and tree-type file systems are likely to remain the foundation of operating system storage, new object-based concepts could change the world in fundamental ways. As applications become &#8220;web aware&#8221;, they also become object aware, increasing the likelihood of such a storage revolution. For the majority of applications, this new world would be a welcome one indeed.</p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2011. |
<a href="http://blog.fosketts.net/2011/04/30/storage-revolution/">We Need a Storage Revolution</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/apple/" title="View all posts in Apple" rel="category tag">Apple</a>, <a href="http://blog.fosketts.net/category/everything/computerhistory/" title="View all posts in Computer History" rel="category tag">Computer History</a>, <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2011/04/30/storage-revolution/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Caringo Bulks Up CAStor For Cloud Services</title>
		<link>http://blog.fosketts.net/2010/10/26/caringo-castor-cloud-storage/</link>
		<comments>http://blog.fosketts.net/2010/10/26/caringo-castor-cloud-storage/#comments</comments>
		<pubDate>Tue, 26 Oct 2010 15:02:37 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Everything]]></category>
		<category><![CDATA[Gestalt IT]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[Caringo]]></category>
		<category><![CDATA[CAS]]></category>
		<category><![CDATA[CAStor]]></category>
		<category><![CDATA[Centera]]></category>
		<category><![CDATA[cloud storage]]></category>
		<category><![CDATA[Dell]]></category>
		<category><![CDATA[Dell DX]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[multi-tenancy]]></category>
		<category><![CDATA[Paul Carpentier]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/?p=3956</guid>
		<description><![CDATA[Now that the hype of "cloud everything" is subsiding, organizations are getting down to work deploying cloud storage to do actual useful tasks. The march from CAS to cloud to object storage has seen high-profile high-end flare-ups (think EMC Centera and Atmos) but the bulk of work is done by more pedestrian (think lower-cost) hardware and software. Through it all, Paul Carpentier has been at the forefront. Now his company, Caringo, is back in the news, delivering much-needed storage service features like multi-tenancy, named objects, dynamic caching, and web services.]]></description>
			<content:encoded><![CDATA[<div id="attachment_3957" class="wp-caption aligncenter" style="width: 190px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://static.fosketts.net/wp-content/uploads/2010/10/logo_caringo.png" ><img class="size-full wp-image-3957" title="logo_caringo" src="http://static.fosketts.net/wp-content/uploads/2010/10/logo_caringo.png" alt="" width="180" height="58" /></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">You may not know Caringo, but you have probably heard of cloud storage, EMC Centera, and Dell DX. Read on to learn the link!</p></div>
<p>Now that the hype of &#8220;cloud everything&#8221; is subsiding, organizations are getting down to work deploying cloud storage to do actual useful tasks. The march from CAS to cloud to object storage has seen high-profile high-end flare-ups (think EMC Centera and Atmos) but the bulk of work is done by more pedestrian (think lower-cost) hardware and software. Through it all, Paul Carpentier has been at the forefront. Now his company, <a href="http://caringo.com/"  target="_blank">Caringo</a>, is back in the news, <a href="http://caringo.com/news/caringo_extends_lead_in_cloud_storage.html"  target="_blank">delivering</a> much-needed storage service features like multi-tenancy, named objects, and dynamic caching.</p>
<blockquote><p>For essential background, check out my article, <a href="http://blog.fosketts.net/2010/10/26/cas-cloud-revolutionary-storage/" >From CAS to Cloud: Revolutionary Storage</a></p></blockquote>
<h3>The Back-Story of Caringo</h3>
<p>The &#8220;Caringo&#8221; company name refers to its three founders, CTO Paul Carpentier, President Jonathan Ring, and CEO Mark Goros. Carpentier is the man behind CAS pioneer FilePool, which EMC acquired and markets as Centera. The three formed Caringo and launched the CAStor product in 2006 as a software alternative to Centera.</p>
<p>Although you may not have heard of Caringo, you may have encountered their product in the form of the <a rel="nofollow" href="http://www.dell.com/us/en/enterprise/storage/dell-dx/pd.aspx?refid=dell-dx&amp;cs=555&amp;s=biz"  target="_blank">Dell DX object storage system</a>. Many were puzzled when Dell, known for its EMC-powered storage offerings, embraced Caringo for object storage, but those familiar with the products weren&#8217;t surprised. Caringo&#8217;s approach is much more in line with Dell&#8217;s image of affordability, simplicity, and commodity products, and Dell&#8217;s relationship with EMC is <a href="http://www.theregister.co.uk/2010/10/24/dell_emc/"  target="_blank">increasingly shaky</a> due to <a href="http://blog.fosketts.net/2010/08/16/dell-3par-enterprise-storage/"  target="_blank">Dell&#8217;s recent acquisition strategy</a>.</p>
<p>Caringo&#8217;s CAStor is a software product that transforms commodity servers into a scale-out object repository. It is all-inclusive, with compliance, tiering, spin-down, and replication part of the total package. Like most CAS and cloud storage solutions, CAStor uses a simple HTTP interface for client access, with &#8220;gateways&#8221; available for NAS along with some native support from applications.</p>
<h3>What&#8217;s New in CAStor 5?</h3>
<p>Caringo has set a course for the service provider market, adding essential features like multi-tenancy and flexible permissions to version 5 of CAStor. Although still pitched as an object store, CAStor 5 is close enough to be thought of as a cloud storage platform.</p>
<p>The ability to support and segregate multiple &#8220;tenants&#8221; is a holy grail for service provider storage systems and a key ingredient of cloud storage solutions. CAStor 5 can be segmented into multiple domains, each with its own security and authentication and each subdivided into &#8220;buckets&#8221; for different applications. This would be useful both for a public service provider and an internal-only solution, since segmenting applications is relevant in the enterprise as well.</p>
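<p>As a rough illustration of the domain-and-bucket layout described above, the sketch below models tenants as isolated namespaces with their own credentials. It is not CAStor&#8217;s API: the <code>TenantStore</code> class, the shared-secret check, and the sample domain names are all assumptions made for the example.</p>

```python
import hmac


class TenantStore:
    """Hypothetical model: domain -> bucket -> object name -> bytes."""

    def __init__(self):
        self._domains = {}

    def create_domain(self, domain, secret):
        # Each domain carries its own credential and its own set of buckets.
        self._domains[domain] = {"secret": secret, "buckets": {}}

    def _authorize(self, domain, secret):
        entry = self._domains.get(domain)
        # compare_digest avoids leaking match position through timing.
        if entry is None or not hmac.compare_digest(entry["secret"], secret):
            raise PermissionError(f"access to domain {domain!r} denied")
        return entry["buckets"]

    def put(self, domain, secret, bucket, name, data):
        buckets = self._authorize(domain, secret)
        buckets.setdefault(bucket, {})[name] = data

    def get(self, domain, secret, bucket, name):
        buckets = self._authorize(domain, secret)
        return buckets[bucket][name]


store = TenantStore()
store.create_domain("acme.example", secret="acme-key")
store.create_domain("initech.example", secret="initech-key")

# Same bucket and object name in two domains: the namespaces do not collide.
store.put("acme.example", "acme-key", "archive", "report.pdf", b"acme data")
store.put("initech.example", "initech-key", "archive", "report.pdf", b"initech data")

try:
    store.get("acme.example", "initech-key", "archive", "report.pdf")
    cross_tenant_blocked = False
except PermissionError:
    cross_tenant_blocked = True
```

<p>The point of the sketch is that isolation is a property of the namespace itself, so the same model serves a public provider and an internal deployment alike.</p>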
<p>CAStor 5 no longer clings to system-assigned names for objects, allowing users to assign their own names for public consumption. This is a huge advancement for CAS, and was one of the key differentiators of cloud solutions which often directly serve content to web clients. Another &#8220;ripped from the cloud&#8221; feature is dynamic caching, allowing high performance access to popular content, again useful for direct client access.</p>
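<p>Caringo&#8217;s announcement does not say how the caching works, so the following is only a sketch of one common approach: a small least-recently-used cache in front of a slower object store. The <code>CachingFront</code> class, the capacity of two, and the object names are hypothetical.</p>

```python
from collections import OrderedDict


class CachingFront:
    """Hypothetical LRU cache in front of any name -> bytes mapping."""

    def __init__(self, backend, capacity):
        self._backend = backend
        self._capacity = capacity
        self._cache = OrderedDict()
        self.backend_reads = 0          # counts trips to the slow store

    def get(self, name):
        if name in self._cache:
            self._cache.move_to_end(name)       # mark as most recently used
            return self._cache[name]
        self.backend_reads += 1
        data = self._backend[name]
        self._cache[name] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)     # evict the least recently used
        return data


backend = {"logo.png": b"L", "intro.mp4": b"I", "old.zip": b"O"}
front = CachingFront(backend, capacity=2)

# A popular object ("logo.png") is requested repeatedly between other reads.
for name in ["logo.png", "logo.png", "intro.mp4", "logo.png", "old.zip", "logo.png"]:
    front.get(name)
```

<p>Six requests reach the backing store only three times here, because the popular object stays in the cache while less popular ones are evicted.</p>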
<h3>Stephen&#8217;s Stance</h3>
<p>Caringo seems reluctant to wear the &#8220;cloud storage&#8221; mantle, but their product has been steadily moving in that direction. CAStor 5, with its multi-tenancy, segmented security and authentication, named objects, and caching, looks an awful lot like Amazon S3 and the rest. But the hype around &#8220;cloud storage&#8221; is dying away. Businesses looking for functionality rather than marketing labels will find a lot to like in CAStor.</p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2010. |
<a href="http://blog.fosketts.net/2010/10/26/caringo-castor-cloud-storage/">Caringo Bulks Up CAStor For Cloud Services</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/everything/" title="View all posts in Everything" rel="category tag">Everything</a>, <a href="http://blog.fosketts.net/category/gestaltit/" title="View all posts in Gestalt IT" rel="category tag">Gestalt IT</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2010/10/26/caringo-castor-cloud-storage/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>From CAS to Cloud: Revolutionary Storage</title>
		<link>http://blog.fosketts.net/2010/10/26/cas-cloud-revolutionary-storage/</link>
		<comments>http://blog.fosketts.net/2010/10/26/cas-cloud-revolutionary-storage/#comments</comments>
		<pubDate>Tue, 26 Oct 2010 14:27:38 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Computer History]]></category>
		<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Gestalt IT]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[Amazon S3]]></category>
		<category><![CDATA[Asigra]]></category>
		<category><![CDATA[Atmos]]></category>
		<category><![CDATA[Caringo]]></category>
		<category><![CDATA[CAS]]></category>
		<category><![CDATA[Centera]]></category>
		<category><![CDATA[Cirtas]]></category>
		<category><![CDATA[cloud storage]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[MaxiScale]]></category>
		<category><![CDATA[Mezeo]]></category>
		<category><![CDATA[Nasuni]]></category>
		<category><![CDATA[Nirvanix]]></category>
		<category><![CDATA[Paul Carpentier]]></category>
		<category><![CDATA[S3]]></category>
		<category><![CDATA[Seven10]]></category>
		<category><![CDATA[StorageNetworks]]></category>
		<category><![CDATA[StorSimple]]></category>
		<category><![CDATA[Twin Strata]]></category>
		<category><![CDATA[XAM]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/?p=3959</guid>
		<description><![CDATA[Change is not a word normally associated with storage, and revolution is practically unheard of. Today's modern enterprise storage systems and networks employ massive resources to do one simple thing: Emulate the basic hard disk drives used over three decades ago. But cracks are appearing in our mausoleum of fake disks: Application developers are discovering the value of object storage, and storage systems are appearing to support this need.]]></description>
			<content:encoded><![CDATA[<div id="attachment_3961" class="wp-caption aligncenter" style="width: 310px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://static.fosketts.net/wp-content/uploads/2010/10/22793093_634de61ca7_z.jpg" ><img class="size-medium wp-image-3961" title="22793093_634de61ca7_z" src="http://static.fosketts.net/wp-content/uploads/2010/10/22793093_634de61ca7_z-300x225.jpg" alt="" width="300" height="225" /></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">We need to move beyond fake disks and deploy application-centric storage</p></div>
<p>Change is not a word normally associated with storage, and revolution is practically unheard of. Today&#8217;s modern enterprise storage systems and networks employ massive resources to do one simple thing: Emulate the basic hard disk drives used over three decades ago. But cracks are appearing in our mausoleum of fake disks: Application developers are discovering the value of object storage, and storage systems are appearing to support this need.</p>
<blockquote><p>I also wrote about this two years ago, proclaiming that <a href="http://blog.fosketts.net/2008/09/28/we-need-storage-revolution/" >We Need a Storage Revolution</a> and forecasting <a href="http://blog.fosketts.net/2010/04/29/techie-business-schism/" >The Techie/Business Schism</a></p></blockquote>
<h3>The CAS Revolution</h3>
<p><a href="http://www.caringo.com/index.html"  target="_blank">Caringo</a> founder and CTO, Paul Carpentier, rose to prominence around 2000 at FilePool, one of the prime movers in the content-addressable storage (CAS) space. I recall a light going off in my head as Paul introduced me to FilePool&#8217;s CAS technology back then, imagining the possibilities of the concept. Files would be stored based on &#8220;what they were&#8221; rather than &#8220;where they were&#8221; and could be organized according to application needs rather than the conventional &#8220;extent of blocks&#8221; or tree hierarchy.</p>
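<p>The &#8220;what, not where&#8221; idea can be sketched in a few lines: the address of an object is derived from its content. This is a minimal illustration and not a description of FilePool or Centera internals. The <code>ContentAddressedStore</code> class is hypothetical, and the use of SHA-256 as the digest is my assumption.</p>

```python
import hashlib


class ContentAddressedStore:
    """Hypothetical CAS: the key for each blob is a digest of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, content: bytes) -> str:
        address = hashlib.sha256(content).hexdigest()
        # Identical content always maps to the same address, so a second
        # copy simply overwrites the first with the same bytes.
        self._blobs[address] = content
        return address

    def get(self, address: str) -> bytes:
        content = self._blobs[address]
        # The address doubles as an integrity check on retrieval.
        if hashlib.sha256(content).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return content


cas = ContentAddressedStore()
addr_a = cas.put(b"annual report 2001")
addr_b = cas.put(b"annual report 2001")   # same bytes, same address
addr_c = cas.put(b"annual report 2002")   # different bytes, different address
```

<p>Two properties fall out of this design: duplicate content is stored once, and any change to the bytes yields a new address, which is part of why the approach suited archiving.</p>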
<p>CAS discarded decades of filesystem and block storage baggage, introducing a new method for storing and retrieving data that better matched the burgeoning web and enterprise applications of today. I had seen the failure of the first wave of storage service providers from inside StorageNetworks, and it was this desire for a real storage revolution that led me to dive into cloud storage at Nirvanix almost a decade later. Although I am now on my own, I remain convinced that the future belongs to storage systems that look nothing like today&#8217;s SAN and NAS.</p>
<p>Shortly after that 2001 meeting, EMC acquired FilePool and launched it as the Centera product line. But CAS systems quickly ran into a serious roadblock: Conventional applications cannot read and write to unconventional storage systems like Centera. EMC pushed key software vendors (especially in the archiving space) to create special Centera interfaces, and the industry bogged down developing the XAM standard. Other companies, like <a href="http://www.seventenstorage.com/"  target="_blank">Seven Ten Storage Software</a>, jumped in to help with the translation from proprietary CAS interfaces, but the transition from legacy files and blocks to object storage has been long and slow.</p>
<h3>Cloud Storage: Another Dimension</h3>
<p>Meanwhile, in an alternate dimension, web developers realized they had a serious problem. They were developing applications that scaled massively, spanning servers and exhausting conventional filesystems. Conventional systems just wouldn&#8217;t cut the mustard.</p>
<p>Since they were soaking in web applications, these developers applied the lessons of web services to storage: Why not just make an HTTP connection and ask for an object by a unique ID rather than walk a filesystem tree? Why not encapsulate the &#8220;state&#8221; of this request in the request itself rather than make a lasting connection and association between the client and server?</p>
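<p>Here is a small sketch of what &#8220;state in the request itself&#8221; means in practice. Each request carries the method, the object ID, and a signature, so the server needs no session or lasting connection to answer it. The object IDs, the shared secret, and the signing format are all hypothetical, and this is not the signing scheme of S3 or any other real service.</p>

```python
import hashlib
import hmac

OBJECTS = {"7f3a": b"photo bytes"}    # hypothetical object IDs and contents
SECRET = b"demo-shared-secret"         # hypothetical signing key


def sign(method, object_id):
    """Sign the parts of the request that identify what is being asked for."""
    message = f"{method} /{object_id}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def handle(request):
    """Serve one request using only what the request itself carries."""
    expected = sign(request["method"], request["object_id"])
    if not hmac.compare_digest(expected, request["signature"]):
        return 403, b""
    body = OBJECTS.get(request["object_id"])
    if body is None:
        return 404, b""
    return 200, body


request = {"method": "GET", "object_id": "7f3a", "signature": sign("GET", "7f3a")}
status, body = handle(request)

# A request with a bad signature is rejected without consulting any session.
forged = dict(request, signature="0" * 64)
forged_status, _ = handle(forged)
```

<p>Because <code>handle</code> keeps nothing between calls, any server holding the objects and the key could answer any request, which is what lets this style of storage scale out across machines.</p>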
<p>Thus was born cloud storage, and it was bookseller Amazon who opened the floodgates with their 2006 introduction of a &#8220;Simple Storage Service&#8221; or S3. They allowed anyone to store and retrieve objects from their massive web services infrastructure. S3 and similar services from Rackspace, Nirvanix, and others, are special-purpose web servers, and their simple interfaces are wonderfully attractive to web developers. For example, this WordPress-based blog uses cloud storage to serve images to your browser!</p>
<h3>Similarities in CAS and Cloud</h3>
<p>Although developed from vastly differing starting points, CAS and cloud storage are essentially similar: Both reject conventional blocks and files in favor of object storage; both organize data with metadata databases; both multiply and scale out. There is one other major similarity between CAS and cloud storage: Both are attractive to service providers.</p>
<p>Imagine you operate a business that stores data for customers. You would want a flexible infrastructure that would scale with demand and segment each &#8220;tenant&#8221; from others for security and performance. As we learned at StorageNetworks, conventional SAN and NAS systems just weren&#8217;t meant to work in this kind of environment. Whether operating an internal service or a public cloud, service providers require something entirely different.</p>
<p>Cloud storage was designed from the start with service providers in mind, embedding per-object and per-&#8220;bucket&#8221; security, scalability, and abstraction between hardware and clients. Although quite complex to design, cloud storage is amazingly simple to use, provided an application can interface with it.</p>
<p>CAS wasn&#8217;t designed like this. Systems like EMC&#8217;s Centera were created for the needs of applications like enterprise archiving, where secure storage of content and extreme scalability are critical as well. But early CAS systems didn&#8217;t need simple web-style interfaces or extreme hardware abstraction. These were enterprise systems, after all.</p>
<h3>The CAS/Cloud Collision</h3>
<p>CAS wasn&#8217;t exactly successful. Although object storage found a niche in enterprise archiving, the enterprise storage world has mostly continued with blocks and files. The major storage vendors all have some kind of object storage, but most are repurposed NAS rather than dedicated CAS like the Centera.</p>
<p>Although much skepticism has been raised about cloud storage in the enterprise, its impact on application development cannot be denied. Indeed, the majority of developers are now focused on programming platforms that abstract both compute and storage from conventional operating systems. The next generation of applications will run in &#8220;platform as a service&#8221; environments first, and cloud storage is a key component.</p>
<p>Storage vendors are rapidly moving to <a href="http://blog.fosketts.net/2009/07/01/cloudstuff-stuff-cloud/"  target="_blank">rework their conventional systems for cloud use</a>. Although block and file systems from 3PAR, NetApp, Isilon, Symantec, HDS, HP, and others are useful in cloud environments, unconventional CAS becomes more valuable here. This is where EMC, Mezeo, and Caringo (with Dell) shine. It is also why HDS bought Parascale and NetApp bought Bycast, and it hints at <a href="http://blog.fosketts.net/2010/10/14/overland-acquires-maxiscale/"  target="_blank">what Overland could do with MaxiScale</a>. In the meantime, <a href="http://www.thebiggertruth.com/2010/05/head-in-the-clouds-the-great-value-question/"  target="_blank">&#8220;gateway&#8221; products</a> from <a href="http://www.nasuni.com/"  target="_blank">Nasuni</a>, <a href="http://www.cirtas.com/"  target="_blank">Cirtas</a>, <a href="http://www.storsimple.com/"  target="_blank">StorSimple</a>, <a href="http://www.twinstrata.com/"  target="_blank">Twin Strata</a>, and <a href="http://asigra.com/"  target="_blank">Asigra</a> are awfully interesting.</p>
<h3>Stephen&#8217;s Stance</h3>
<p><a href="http://blog.fosketts.net/2010/04/29/techie-business-schism/"  target="_blank">The storage revolution is coming</a>, whether we in the industry are ready or not. Developers are voting with their feet, targeting cloud storage and application platforms rather than conventional filesystems. Although the market for cloud storage products is slow to develop, the cloud storage concept will eventually dominate the landscape.</p>
<p>It seems most likely that this revolution will decimate the storage industry as we know it today. Unable to push high-margin storage arrays into the ballooning cloud space, product vendors will see their market share eroded by service providers with no use for these expensive systems. Monolithic file and block will soldier on in the new legacy applications, but <a href="http://blog.fosketts.net/2010/05/10/emc-post-infrastructure-future/"  target="_blank">the action will inevitably slip away</a>.</p>
<p>The likely winners will be those who can leverage commodity hardware for scale-out cloud storage use. The proliferation of cloud platforms will settle down, with a few gaining traction and the rest discarded. Then we will see companies like HP, Dell, and Oracle rise to lead the storage sales charts with massive volume shipments to service providers.</p>
<p><em>Disclosure: I used to work for StorageNetworks (which is now defunct) and Nirvanix.</em></p>
<p><em>Image credit: Barcelona Graffiti by </em><a rel="nofollow" href="http://www.flickr.com/photos/aeioux/" ><em>Aeioux</em></a></p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2010. |
<a href="http://blog.fosketts.net/2010/10/26/cas-cloud-revolutionary-storage/">From CAS to Cloud: Revolutionary Storage</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/computerhistory/" title="View all posts in Computer History" rel="category tag">Computer History</a>, <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/gestaltit/" title="View all posts in Gestalt IT" rel="category tag">Gestalt IT</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2010/10/26/cas-cloud-revolutionary-storage/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>EMC Atmos Versus VMware VDC-OS: Will The Real Cloud Strategy Please Stand Up?</title>
		<link>http://blog.fosketts.net/2008/11/10/emc-atmos-vmware-vdc-os-cloud-strategy/</link>
		<comments>http://blog.fosketts.net/2008/11/10/emc-atmos-vmware-vdc-os-cloud-strategy/#comments</comments>
		<pubDate>Mon, 10 Nov 2008 16:03:42 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[Atmos]]></category>
		<category><![CDATA[Azure]]></category>
		<category><![CDATA[CAS]]></category>
		<category><![CDATA[Centera]]></category>
		<category><![CDATA[Chuck Hollis]]></category>
		<category><![CDATA[CIFS]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[cloud storage]]></category>
		<category><![CDATA[Cloud vServices]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[COS]]></category>
		<category><![CDATA[deduplication]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[HCAP]]></category>
		<category><![CDATA[Hitachi]]></category>
		<category><![CDATA[Maui]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[NAS]]></category>
		<category><![CDATA[nas storage]]></category>
		<category><![CDATA[network attached storage]]></category>
		<category><![CDATA[network storage]]></category>
		<category><![CDATA[NFS]]></category>
		<category><![CDATA[Nirvanix]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[SOAP]]></category>
		<category><![CDATA[Steve Todd]]></category>
		<category><![CDATA[VDC-OS]]></category>
		<category><![CDATA[VMware]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/?p=1075</guid>
		<description><![CDATA[As I guessed on Friday, EMC has officially announced their Maui Atmos software layer today, calling it the &#8220;industry&#8217;s first COS (cloud-optimized storage) offering&#8221;, &#8220;a new era for IT&#8221;, and &#8220;a new category of storage.&#8221; So the new era for IT is a cloud with globally-distributed object stores with policy management? Great! But I thought [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.fosketts.net/2008/11/07/emc-maui/"  target="_blank">As I guessed on Friday</a>, EMC has officially announced their <span style="text-decoration: line-through;">Maui</span> Atmos software layer today, <a href="http://www.emc.com/products/category/subcategory/cloud-optimized-storage.htm?CMP=ILC-carHP&amp;panel=harnessing+cloud+computin"  target="_blank">calling</a> it the &#8220;industry&#8217;s first COS (cloud-optimized storage) offering&#8221;, &#8220;a new era for IT&#8221;, and &#8220;a new category of storage.&#8221; So the new era for IT is a cloud with globally-distributed object stores with policy management?</p>
<p>Great! But I thought the new era for IT was a cloud with choice, mobility, and application support, as <a href="http://www.vmware.com/technology/virtual-datacenter-os/cloud-vservices/"  target="_blank">trumpeted</a> by EMC&#8217;s VMware subsidiary! Wasn&#8217;t Cloud vServices from VDC-OS supposed to be the <a href="http://blog.fosketts.net/2008/09/16/vmware-virtual-datacenter-operating-system-vdc-os/"  target="_blank">prototype cloud strategy</a> for the datacenter?</p>
<p>What we have here is <strong>a simple clash of marketing</strong> amusingly taking place at (nearly) the same company. VMware figured out how to extend their server virtualization products outside the confines of the data center, and laid that technology out as a strategy with the trendy &#8220;cloud&#8221; name. Meanwhile, mother EMC is working on next-generation content storage software and decides to roll that out as a strategy and also jumps on the &#8220;cloud&#8221; meme. What&#8217;s an IT manager to do?<span id="more-1075"></span></p>
<h3 class="post-subhead">Defining Atmos</h3>
<p>As predicted, EMC&#8217;s Atmos (code-name Maui) is a <a href="http://www.theregister.co.uk/2008/11/10/emc_launches_maui_as_atmos/"  target="_blank">distributed software layer</a> to handle the storage and management of data objects across geographically-dispersed storage devices. EMC&#8217;s Chuck Hollis <a href="http://chucksblog.emc.com/chucks_blog/2008/11/emc-atmos-maui-is-here.html"  target="_blank">demonstrates Atmos</a> with a simple, practical example, perhaps making it sound too much like Akamai but generally getting the point across. You have a data object, write it to Atmos through REST/SOAP or CIFS/NFS, assign some metadata, and the software takes care of data placement for you. It&#8217;ll add local copies, replicate for availability and performance, compress or deduplicate, manage versions, and provide all sorts of other goodies (if you ask it to).</p>
<p>But EMC already has a capable object storage platform, the Centera. We&#8217;ve just got used to the content-addressable storage (CAS) label for object storage (even though this name misses the point of object storage, in my opinion) and now EMC wants us to learn a new label for a somewhat-similar device? Steve Todd, EMC&#8217;s object guy extraordinaire, <a rel="nofollow" href="http://stevetodd.typepad.com/my_weblog/2008/11/atmos-cloud-optimized-storage.html"  target="_blank">lays it out</a>:</p>
<blockquote><p>SAN Value = Centralized, secure multi-tenancy for blocks.</p>
<p>NAS Value = Centralized, secure multi-tenancy for files.</p>
<p>CAS Value = Centralized, secure multi-tenancy for objects (content + metadata).</p>
<p>COS Value = <em>Globalized</em>, secure multi-tenancy for content with <em>rich policies</em>.</p>
</blockquote>
<p>Ok, so <strong>the defining capabilities of Atmos are its global scale and rich policies</strong>. And the fact that &#8220;objects&#8221; has become &#8220;content&#8221;, presumably since Atmos can handle traditional NAS (CIFS/NFS) chores as well.</p>
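<p>To make the &#8220;object plus metadata plus policy&#8221; idea concrete, here is a toy sketch in Python. It is not the Atmos API: the policy table, site names, and the <code>place_object</code> function are all hypothetical, invented only to show how placement and compression could be driven by an object&#8217;s metadata rather than by the path it was written to.</p>
<pre>
```python
import zlib
from dataclasses import dataclass, field

# Hypothetical policy table keyed by a metadata tag. None of these names
# come from EMC; they only illustrate policy-driven placement.
POLICIES = {
    "critical": {"replicas": 3, "compress": False},
    "archive": {"replicas": 1, "compress": True},
}
DEFAULT_POLICY = {"replicas": 2, "compress": False}
SITES = ["boston", "london", "tokyo"]  # assumed site names


@dataclass
class StoredObject:
    object_id: str
    data: bytes
    metadata: dict
    sites: list = field(default_factory=list)
    compressed: bool = False


def place_object(object_id, data, metadata):
    """Choose sites and options from the object's metadata, not from a path."""
    policy = POLICIES.get(metadata.get("class"), DEFAULT_POLICY)
    obj = StoredObject(object_id, data, dict(metadata))
    # Replicate to the first N sites named by the policy.
    obj.sites = SITES[: policy["replicas"]]
    if policy["compress"]:
        obj.data = zlib.compress(data)
        obj.compressed = True
    return obj
```
</pre>
<p>In this sketch, an object tagged <code>{"class": "archive"}</code> lands on one site and is compressed, while one tagged <code>{"class": "critical"}</code> is copied to all three sites uncompressed. The application only supplies the tag; everything else follows from the policy.</p>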
<h3 class="post-subhead">Prayers Answered?</h3>
<p>It sounds like EMC is answering <a href="http://blog.fosketts.net/2008/09/28/we-need-storage-revolution/"  target="_blank">my prayers for a storage revolution</a>, delivering a highly-capable object storage platform that transcends the old limits of blocks, directories, and files. Steve Todd points out that Atmos handles five policy categories out of the box:</p>
<ul>
<li>Replication</li>
<li>Compression</li>
<li>Spin-down</li>
<li>Object de-dup</li>
<li>Versioning</li>
</ul>
<p>So we write some data to Atmos, using either traditional NAS or <a rel="nofollow" href="http://en.wikipedia.org/wiki/Web_2.0"  target="_blank">webby dubby</a> protocols like <a rel="nofollow" href="http://en.wikipedia.org/wiki/SOAP_(protocol)"  target="_blank">SOAP</a>, and can then apply policies in any of these five categories to that data. One can also extend Atmos to accept other policies, but the absence (out of the box) of concepts like encryption, secure deletion, retention, and access control is surprising.</p>
<p>I am quite puzzled about how practical these policy capabilities will be in the real world. How exactly would an application say &#8220;I want you to compress that file I wrote over NFS just now?&#8221; Hitachi&#8217;s HCAP platform, for example, also has policy capabilities and a NAS front end, and although archiving applications can communicate their policy needs, <strong>I don&#8217;t see lots of current general-purpose applications using it</strong>.</p>
<h3 class="post-subhead">Strategic Storage?</h3>
<p>This brings me to my puzzlement: The default Atmos policies are all general-purpose, production computing ideas, not the special-purpose, archiving and retention needs served by Centera, HCAP, and the rest. So <strong>the Atmos is clearly intended to be a production data storage system</strong>, not an archiving system to compete with Centera.</p>
<p>Since mainstream business applications currently don&#8217;t have any capability to specify policies like these when writing files, and since NAS protocols lack any means to communicate them even if the apps want to, we can conclude that <strong>EMC expects that Atmos users will write special applications to take advantage of it</strong>.</p>
<p>EMC certainly doesn&#8217;t expect that the NAS-capable Atmos will simply replace today&#8217;s distributed NAS solutions. <strong>NAS is a sideshow for Atmos</strong>. The real action will be in the REST/SOAP webby dubby applications that will be written with the platform in mind and will take full advantage of these capabilities.</p>
<p>If this is true, and I <a rel="nofollow" href="http://storagebod.typepad.com/storagebods_blog/2008/11/i-like-a-party-with-a-atmosphere.html"  target="_blank">and others</a> suspect that it is, then <strong>Atmos really isn&#8217;t a game-changing platform unless you change your game</strong>. If you write new applications to store data with SOAP, Atmos is a nice in-house alternative to Amazon S3 or Nirvanix, and offers a very compelling set of data management capabilities. And if you want to set up shop to compete with those service providers, Atmos is a dream come true with <a rel="nofollow" href="http://storagezilla.typepad.com/storagezilla/2008/11/building-emc-atmos.html"  target="_blank">built-in multi-tenancy</a>.</p>
<h3 class="post-subhead">Datacenter Strategy</h3>
<p>So EMC alone has two seemingly competitive datacenter strategies. And then there&#8217;s Microsoft, which announced its <a href="http://dcsblog.burtongroup.com/data_center_strategies/2008/10/waiting-for-the-other-shoe-to-drop.html"  target="_blank">Azure cloud platform</a> recently, and Amazon and the other cloud providers.</p>
<p>So let&#8217;s say you&#8217;re a CIO for a large corporation. Which of the following strategies is most compelling?</p>
<ol>
<li>Use <strong>VMware VDC-OS</strong> to add capabilities and <strong>Cloud vServices</strong> to extend your current virtual infrastructure geographically</li>
<li>Recompile and tweak your Windows applications to leverage <strong>Microsoft Azure</strong></li>
<li>Develop new applications to take advantage of the impressive storage capabilities of an in-house <strong>EMC Atmos</strong> system</li>
<li>Point your new applications at a third-party cloud provider like Amazon or Nirvanix</li>
</ol>
<p>IT people are practical. Although we love new technology, we tend to be cautious. We also hate massive software development efforts, and only sanction them when they&#8217;re absolutely necessary. Given these personality traits, I&#8217;d say VDC-OS, perhaps with Cloud vServices, still stands out as the most likely and practical scenario for the majority of applications and businesses.</p>
<p>This is not to say that EMC Atmos will be a flop. I&#8217;m impressed by the technology, and expect that Atmos will find buyers, just as Centera did. And Atmos might even replace Centera once EMC adds retention policies to it and scales it down as well as up and out. But Atmos will not redefine the datacenter. We&#8217;re stuck with blocks and files, and VMware&#8217;s practical strategy is a winner in that world.</p>
<p><strong>Update:</strong> <a href="http://www.storagerap.com/2008/11/atmos-dead-or-not-dead-innovative-or-repetitive.html"  target="_blank">Marc Farley compares Atmos to WAFS</a>, with ominous implications, and echoes my recent question on what is and is not innovative.</p>
<p><strong>Update 2:</strong> Chuck Hollis, Storagezilla, and <a rel="nofollow" href="http://lensblog.typepad.com/ebiz/2008/11/emc-announces-atmos.html"  target="_blank">Len Devanna</a> have all come right out and said that this is only intended for certain customers with massive distributed storage needs, and is not intended as a new datacenter strategy. Even the &#8220;cloudfella&#8221; says &#8220;ciao&#8221;:</p>
<p>
<object width="425" height="344" data="http://www.youtube.com/v/eaqklyv3yrg&amp;hl=en&amp;fs=1" type="application/x-shockwave-flash"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/eaqklyv3yrg&amp;hl=en&amp;fs=1" /><param name="allowfullscreen" value="true" /></object>
</p>
<p><strong>Update 3:</strong> More great information, including <a rel="nofollow" href="http://virtualgeek.typepad.com/virtual_geek/2008/11/whats-the-relat.html"  target="_blank">a reply regarding VDC-OS and Atmos</a> from the one and only Chad Sakac, more great detail about <a rel="nofollow" href="http://stevetodd.typepad.com/my_weblog/2008/11/atmos-policy-under-the-hood.html"  target="_blank">the inner workings of Atmos</a> from Steve Todd, and <a href="http://flickerdown.com/?p=268"  target="_blank">even more info</a> from Dave Graham. Finally, although I think that Cloudfellas video is cute, I wouldn&#8217;t categorize it as viral. But <a rel="nofollow" href="http://lensblog.typepad.com/ebiz/2008/11/beware-flaming-appliances-from-the-sky.html"  target="_blank">those Mozy ads</a> are awesome!</p>
<blockquote><p>See my posts on <a href="http://gestaltit.com/author/stephen/"  target="_blank">Gestalt IT</a> for similar <a href="http://gestaltit.com"  target="_blank">enterprise IT infrastructure commentary</a></p>
</blockquote>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2008. |
<a href="http://blog.fosketts.net/2008/11/10/emc-atmos-vmware-vdc-os-cloud-strategy/">EMC Atmos Versus VMware VDC-OS: Will The Real Cloud Strategy Please Stand Up?</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2008/11/10/emc-atmos-vmware-vdc-os-cloud-strategy/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Deduplication Coming to Primary Storage</title>
		<link>http://blog.fosketts.net/2008/09/16/deduplication-primary-storage/</link>
		<comments>http://blog.fosketts.net/2008/09/16/deduplication-primary-storage/#comments</comments>
		<pubDate>Tue, 16 Sep 2008 19:28:37 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Computer History]]></category>
		<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Features]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[Atari]]></category>
		<category><![CDATA[Byte]]></category>
		<category><![CDATA[capacity optimization]]></category>
		<category><![CDATA[CAS]]></category>
		<category><![CDATA[Centera]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[data deduplication]]></category>
		<category><![CDATA[Data Domain]]></category>
		<category><![CDATA[deduplication]]></category>
		<category><![CDATA[DR-DOS]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[FilePool]]></category>
		<category><![CDATA[greenBytes]]></category>
		<category><![CDATA[Huffman coding]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[NetApp]]></category>
		<category><![CDATA[Riverbed]]></category>
		<category><![CDATA[single-instance storage]]></category>
		<category><![CDATA[Stacker]]></category>
		<category><![CDATA[VMware]]></category>
		<category><![CDATA[VTL]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/?p=626</guid>
		<description><![CDATA[Although deduplication of storage is nothing new, with Data Domain and others making hay with the technique for years, it has never been ready for prime time - that is, reducing active primary storage for applications like email and databases. Instead, deduplication has been relegated to second- or third-tier status, deduplicating archives and backup data. But change is in the air, and deduplication vendors are starting to bustle towards the bright lights of primary storage.]]></description>
			<content:encoded><![CDATA[<p style="padding-left: 30px;"><em>This is a follow-up to my story, <a href="http://blog.fosketts.net/2008/03/12/de-duplication-goes-mainstream/"  target="_self">De-Duplication Goes Mainstream</a></em></p>
<p>Although deduplication of storage is nothing new, with Data Domain and others making hay with the technique for years, it has never been ready for prime time &#8211; that is, reducing active primary storage for applications like email and databases. Instead, deduplication has been relegated to second- or third-tier status, deduplicating archives and backup data. But change is in the air, and deduplication vendors are starting to bustle towards the bright lights of primary storage.</p>
<h3>Stone Knives and Bear Skins</h3>
<p>We have all been here before, of course. Back at the dawn of the personal computer era, data compression was a hot topic of conversation. I recall being so impressed by an article in <a rel="nofollow" href="http://en.wikipedia.org/wiki/Byte_(magazine)"  target="_blank">Byte</a> (1986:5, p99) outlining <a rel="nofollow" href="http://en.wikipedia.org/wiki/Huffman_coding"  target="_blank">Huffman coding</a> that I tried cooking up an implementation in Atari BASIC. Lossless compression has a magical pull to the geek in many of us &#8211; redundant data just <em>wants</em> to be eliminated!</p>
<div id="attachment_630" class="wp-caption alignright" style="width: 254px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><a href="http://blog.fosketts.net/wp-content/uploads/2008/09/sc0003b3d4.png" ><img class="size-full wp-image-630 " title="Stacker" src="http://blog.fosketts.net/wp-content/uploads/2008/09/sc0003b3d4.png" alt="Stacker dominated the disk compression world - until Microsoft introduced DOS 6.0" width="244" height="254" /></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Stacker dominated the disk compression world - until Microsoft introduced DOS 6.0</p></div>
<p>Companies soon applied <a href="http://www.zisman.ca/Articles/1993/DOS6.html"  target="_blank">compression to primary storage</a>, especially the limited storage in personal computers. <a rel="nofollow" href="http://en.wikipedia.org/wiki/Stac_Electronics#Microsoft_lawsuit"  target="_blank">Stacker</a> was a hit after 1990, until Microsoft built a workalike, called DoubleSpace, into DOS 6.0 in 1993, leading to a historic lawsuit. I personally used the ADDSTOR disk compression built into DR-DOS 6.0 to stretch two more years out of the 20 MB MFM hard drive in my AT&amp;T PC6300 at <a href="http://wpi.edu"  target="_blank">WPI</a>.</p>
<p>But something funny happened in the late 1990s: Compression began to lose its luster. Compressing data always takes quite a bit of CPU power, but this was offset somewhat by the truncated data transfers and more-efficient file system layout afforded in early PCs. But as disks got larger and faster, using precious CPU time to save space seemed less and less compelling. Today, although nearly every operating system includes built-in compression of files, folders, or perhaps disks, these features are rarely used. And compression was never popular in the performance-sensitive enterprise space.</p>
<h3><strong>Deduplication Has a Nice Ring</strong></h3>
<p>Although traditional fine-grained compression has not been very successful in the enterprise, its lanky cousin, single-instance storage, has long found niche jobs. Applications from databases to email systems to file servers have long been able to recognize requests to store the exact same file or record, and to store just a single instance in that case. Even file systems can do single-instance storage through the use of links, though this is initiated by the user rather than in an automated fashion.</p>
<p>In the late 1990s, FilePool began developing a <a rel="nofollow" href="http://en.wikipedia.org/wiki/Content-addressable_storage"  target="_blank">content-addressable storage</a> device; EMC acquired the company in 2001. This device, later known as the Centera, was one of a number of archiving-oriented storage platforms introduced this decade. At the same time, <a rel="nofollow" href="http://en.wikipedia.org/wiki/Virtual_tape_library"  target="_blank">virtual tape libraries</a> made the jump from the mainframe to open systems. Both devices, being outside the critical path of performance but offering massive capacity, were well-suited to implement advanced <a rel="nofollow" href="http://en.wikipedia.org/wiki/Capacity_optimization"  target="_blank">capacity optimization</a> technologies that combined the concepts of compression with single-instance storage. Thus was created the modern world of data deduplication.</p>
<p>What we think of as deduplication is neither fish nor fowl: It assesses larger &#8220;chunks&#8221; of data than compression technologies, delivering greater capacity savings and potentially reducing performance impact, but is more flexible than single-instancing, recognizing the similarities within files or objects.</p>
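<p>A small Python sketch may help show the difference between single-instancing and chunk-level deduplication. It is only an illustration under simplifying assumptions: the 4 KB fixed chunk size and the function names are mine, and real products typically use more sophisticated (often variable-size) chunking and indexing.</p>
<pre>
```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


def single_instance_bytes(files):
    """Bytes stored when only byte-identical whole files are collapsed."""
    unique = {hashlib.sha256(f).hexdigest(): len(f) for f in files}
    return sum(unique.values())


def deduplicated_bytes(files, chunk_size=CHUNK_SIZE):
    """Bytes stored when every distinct chunk is kept only once."""
    store = {}
    for f in files:
        for i in range(0, len(f), chunk_size):
            chunk = f[i:i + chunk_size]
            # Identical chunks hash to the same key, so they are stored once.
            store.setdefault(hashlib.sha256(chunk).hexdigest(), len(chunk))
    return sum(store.values())


# Two files that share their first three chunks and differ only in the last.
shared = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE + b"C" * CHUNK_SIZE
file_one = shared + b"D" * CHUNK_SIZE
file_two = shared + b"E" * CHUNK_SIZE

print(single_instance_bytes([file_one, file_two]))  # 32768
print(deduplicated_bytes([file_one, file_two]))     # 20480
```
</pre>
<p>Because the two files are not byte-identical, single-instancing stores both in full (eight chunks). Chunk-level deduplication recognizes the three shared chunks and stores only five.</p>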
<p>But it is still maddeningly difficult to scale deduplication while maintaining performance. Rather than fight to maintain reasonable write throughput, most deduplication products have switched to post-processing, deferring their work to quieter times.</p>
<h3><strong>It&#8217;s Not Just for Breakfast</strong></h3>
<p>Regardless of their methods or underlying technology, however, no deduplication vendor has stepped up to support challenging low-latency or high-throughput production applications. <a href="http://blog.fosketts.net/2008/03/12/de-duplication-goes-mainstream/"  target="_self">NetApp was the first to raise the issue of support for production applications</a>, but although they tout the technology for VMware, they haven&#8217;t exactly been shouting from the rooftops to get their A-SIS deduplication technology deployed in other high-I/O applications. And I haven&#8217;t seen Hifn&#8217;s card yet.</p>
<p>Yesterday, I mentioned that greenBytes was adding deduplication to their ZFS-based storage array for primary data. And now <a href="http://www.theregister.co.uk/2008/09/16/deduplicating_primary_storage/"  target="_blank">Riverbed has fired another shot</a> across the bow, repurposing their (deduplicating) WAN accelerator product for primary (file) storage. They might be able to pull it off, too, since they have a long list of customers who are already enjoying the technology in production. It&#8217;s not a stretch to suggest that Riverbed&#8217;s appliances can scale to handle production data loads. Although it&#8217;s file-only, I can imagine quite a few scenarios where this tech could really yield benefits. Could we come full circle, with deduplication finally reaching the enterprise storage world?</p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2008. |
<a href="http://blog.fosketts.net/2008/09/16/deduplication-primary-storage/">Deduplication Coming to Primary Storage</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/computerhistory/" title="View all posts in Computer History" rel="category tag">Computer History</a>, <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/features/" title="View all posts in Features" rel="category tag">Features</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2008/09/16/deduplication-primary-storage/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

