<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://unfoldingneurons.com/"
	>

<channel>
	<title>Stephen Foskett, Pack Rat &#187; data growth Archives  &#8211; Stephen Foskett, Pack Rat</title>
	<atom:link href="http://blog.fosketts.net/tag/data-growth/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.fosketts.net</link>
	<description>Understanding the accumulation of data</description>
	<lastBuildDate>Fri, 10 Feb 2012 17:40:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
<atom:link rel="hub" href="http://pubsubhubbub.appspot.com" />
	<atom:link rel="hub" href="http://superfeedr.com/hubbub" />
			<item>
		<title>Symantec&#8217;s Thin API: The Plot Thickens</title>
		<link>http://blog.fosketts.net/2008/10/24/symantec-thin-api/</link>
		<comments>http://blog.fosketts.net/2008/10/24/symantec-thin-api/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 14:16:49 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[Virtual Storage]]></category>
		<category><![CDATA[3PAR]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[data growth]]></category>
		<category><![CDATA[data management]]></category>
		<category><![CDATA[file system]]></category>
		<category><![CDATA[HDS]]></category>
		<category><![CDATA[HP]]></category>
		<category><![CDATA[I/O deduplication]]></category>
		<category><![CDATA[SmartMove]]></category>
		<category><![CDATA[Storage Foundation]]></category>
		<category><![CDATA[storage management]]></category>
		<category><![CDATA[Symantec]]></category>
		<category><![CDATA[thin provisioning]]></category>
		<category><![CDATA[Thin Reclamation]]></category>
		<category><![CDATA[utilization]]></category>
		<category><![CDATA[Veritas Storage Foundation]]></category>
		<category><![CDATA[VMware]]></category>
		<category><![CDATA[volume manager]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/?p=933</guid>
		<description><![CDATA[Last week, I lauded Symantec for introducing an API in Storage Foundation which will interact with the thin storage capabilities of supported arrays. Since then, I&#8217;ve learned more about this capability, and I am writing this update to share that knowledge. As I noted last week, the press release was a bit hard to follow and [...]]]></description>
			<content:encoded><![CDATA[<p>Last week, <a href="http://blog.fosketts.net/2008/10/16/symantecs-thin-api-step-direction/"  target="_self">I lauded Symantec for introducing an API in Storage Foundation</a> which will interact with the thin storage capabilities of supported arrays. Since then, I&#8217;ve learned more about this capability, and I am writing this update to share that knowledge. As I noted last week, the press release was a bit hard to follow and comprehend (and <a href="http://www.theregister.co.uk/2008/10/20/3par_symantec_help/"  target="_blank">not just for me</a>), and one of my initial assumptions about the API turned out to be wrong. I also received a few comments from interested folks pointing out some more pros and cons of this technology.</p>
<p>First, let&#8217;s clarify just which products and capabilities Symantec is offering here:</p>
<ul>
<li>Veritas Storage Foundation version 5.0MP3 for <strong>Unix/Linux</strong> includes <strong>SmartMove</strong> and the <strong>Thin Reclamation API</strong></li>
<li>Veritas Storage Foundation for <strong>Windows</strong> 5.0 only includes <strong>SmartMove</strong> at this point, but it will be updated to include Thin Reclamation at some point in the coming year</li>
</ul>
<p>Although there is no real information on Symantec&#8217;s web site about all this yet, Symantec&#8217;s director of Storage Management and High Availability, Sean Derrington, assures me that their software is available now. Although no compatible arrays are in end-user hands, 3PAR will update their T-Class firmware to support the API shortly, and HDS and HP are on the way as well.<span id="more-933"></span></p>
<h3 class="post-subhead">Thin Aware Software</h3>
<p>Next, contrary to what I inferred from the announcement, <strong>there is no native thin provisioning capability</strong> in the file system or volume manager. So the first item in my list is right out. However, the volume manager is now &#8220;thin aware&#8221;, which means that it will communicate up to the file system and down to the array to coordinate more effective use of space.</p>
<p>When the volume manager is used with <strong>Veritas File System (VxFS)</strong> on UNIX or <strong>NTFS</strong> on Windows Server 2003 or 2008, it will automatically keep track of deleted files and will pass this information down the stack to the array. This is a major piece of functionality to add, especially to NTFS: it amounts to &#8220;hole punching&#8221; (<a href="http://blogs.netapp.com/shadeofblue/2008/10/hole-punching-f.html"  target="_blank">like NetApp</a>) to maximize thin provisioning.</p>
<p>The Storage Foundation tools have also been updated to properly report on thin provisioned volumes. For example, the following output shows three disk devices where encl1 supports thin reclamation and encl0 does not.</p>
<pre># vxdisk list
DEVICE        TYPE   DISK          GROUP         STATUS
encl0_0       auto   encl0_0       mydg          online thin
encl1_0       auto   encl1_0       mydg          online thinrclm
encl1_1       auto   encl1_1       mydg          online thinrclm</pre>
<h3 class="post-subhead">Thin Reclamation API</h3>
<p>The Veritas Thin Reclamation API allows the Storage Foundation volume manager and file systems to communicate with <strong>thin-capable arrays</strong> when data is deleted on thin-ified LUNs, maintaining their thin-ness as you go. When a file is deleted, the file system will tell the volume manager that the space is no longer needed. When the server administrator runs the &#8220;vxdisk reclaim&#8221; or &#8220;fsadm -R&#8221; commands, the volume manager will tell the array (using SCSI commands) that any vacated disk blocks can now be reclaimed. Symantec expects folks to set up a cron job to reclaim space, or perhaps just run it when they see the need.</p>
<p>This is brilliant stuff, and ought to make thin provisioning shine in terms of array utilization. In an environment of thin-enabled Veritas volumes and supported storage arrays, the amount of space used on an array will be awfully close to the amount of space used in the file systems. This is a massive win <strong>- a capacity gain on the order of 50%-70%</strong> in an average environment!</p>
<blockquote><p>For more on this topic, see my recent post on <a href="http://blog.fosketts.net/2008/10/01/storage-utilization-waterfall-raw-usable/"  target="_self">storage utilization</a></p>
</blockquote>
<p>If the storage array fully supports Symantec&#8217;s API, the tools will also report physically allocated storage behind thin and thinrclm (thin reclamation) devices.</p>
<pre># vxdisk -o thin list
DANAME        DISK SIZE(Mb)        PHYS_ALLOC(Mb)       DISK GROUP    TYPE
encl0_0       2000                 50                   mydg          thin
encl1_0       200                  50                   mydg          thinrclm
encl1_1       500                  500                  mydg          thinrclm</pre>
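<p>To make those numbers easier to read, here is a minimal sketch in Python that turns a listing like this one into a &#8220;percent physically allocated&#8221; figure per device. This is purely illustrative and not a Symantec tool: the function names are hypothetical, the sample text is simply the listing from this post, and the assumption that every data row has exactly five whitespace-separated fields may not hold for real output.</p>
<pre># Sample text copied from the listing in this post; a real script would
# capture the command output instead (the column layout is an assumption).
SAMPLE = """\
DANAME        DISK SIZE(Mb)        PHYS_ALLOC(Mb)       DISK GROUP    TYPE
encl0_0       2000                 50                   mydg          thin
encl1_0       200                  50                   mydg          thinrclm
encl1_1       500                  500                  mydg          thinrclm
"""


def parse_thin_list(text):
    """Return one dict per device row, skipping the header line."""
    devices = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or unexpected lines
        name, size_mb, alloc_mb, group, dev_type = fields
        devices.append({
            "name": name,
            "size_mb": int(size_mb),
            "alloc_mb": int(alloc_mb),
            "group": group,
            "type": dev_type,
        })
    return devices


def allocated_percent(device):
    """Share of the advertised size that is physically allocated."""
    return 100.0 * device["alloc_mb"] / device["size_mb"]


for dev in parse_thin_list(SAMPLE):
    print(dev["name"], dev["type"], round(allocated_percent(dev), 1))</pre>
<p>For the three sample devices this prints 2.5, 25.0, and 100.0 percent respectively, which makes it obvious at a glance that encl1_1 has no thin-ness left to give.</p>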
<h3 class="post-subhead">SmartMove</h3>
<p>SmartMove is Symantec&#8217;s new capability for online migration from &#8220;thick&#8221; to thin LUNs. It is included in Storage Foundation for Unix/Linux and Windows and works with <strong>any thin storage array</strong>, not just those that support the API. This is basically a tweak to the old storage migration support we have all known and relied on in Veritas Storage Foundation for over a decade, except that it&#8217;s <strong>smart enough to not request blocks that it won&#8217;t use</strong>. One could theoretically &#8220;SmartMove&#8221; a volume regularly to reclaim space without using the API at all, but those commands are sure a lot simpler.</p>
<p>Note that <strong>SmartMove speeds up migration too, even for thick volumes</strong>! When you use a SmartMove-enabled version of Storage Foundation to move a volume, it will only send over the wire the blocks that the file system is actually using. This reminds me a little of VMware&#8217;s new I/O deduplication capability talked about at VMworld, but it&#8217;s focused only on migrations, not other I/O situations.</p>
<blockquote><p>For more on this topic, see my recent post on <a href="http://blog.fosketts.net/2008/09/19/what-vmware-vdc-os-vstorage/"  target="_self">VMware vStorage</a></p>
</blockquote>
<h3 class="post-subhead">The Plot Thickens</h3>
<p>So I was wrong about one item, but the other two remain true. Is Symantec&#8217;s new capability a winner? I give it a silver medal &#8211; it&#8217;s good stuff, but some issues remain.</p>
<ol>
<li>My primary concern remains &#8211; <strong>thin provisioning does nothing to address the lack of storage management</strong> that is so prevalent. It enables greater utilization of capacity, but does nothing to control how that capacity is used. This isn&#8217;t a beef with Symantec&#8217;s Veritas Storage Foundation or 3PAR or HDS or EMC or anyone in the thin industry, really. Instead, it is a wake-up call to all of the storage organizations out there who have <a href="http://blog.fosketts.net/2007/07/24/sailing-the-titanic-why-we-need-ilm-and-then-some/" >filesystems full of uncontrolled junk</a>!</li>
<li>My second concern is the <strong>lack of capacity management</strong>. Thin provisioning is a lie, promising more capacity than is available. This might be acceptable in certain controlled circumstances like operating system or application volumes, but telling end users that they have plenty of available space is <a href="http://blog.fosketts.net/2007/08/16/a-seat-at-the-table/" >a recipe for disaster</a>. Storage use is like air &#8211; it expands to fill all available volume. Without capacity management, your thin volumes will be &#8220;overdrawn&#8221; and your storage &#8220;account&#8221; will be bankrupt.</li>
<li>Then there is the issue of proprietary APIs versus standards. Let me say right away that <strong>I always support standards over proprietary technology</strong>. But, at the same time, given the choice between nothing and something, I&#8217;ll take the proprietary API. Thin provisioning is a good idea with poor implementation. This API helps to make it useful in the real world, and having a market leader like Symantec behind it makes it all the more relevant. I certainly hope the entire storage industry will come up with a standard thin API, and when that happens I hope Symantec will support it. Until then, at least we have something.</li>
</ol>
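<p>The &#8220;overdrawn account&#8221; worry in the second item can be made concrete with a little arithmetic. The following is a minimal sketch in Python; the pool size and per-volume figures are invented for illustration and describe no real array, and the function names are hypothetical.</p>
<pre>def oversubscription_ratio(provisioned_gb, physical_gb):
    """Capacity promised to hosts divided by capacity that really exists."""
    return sum(provisioned_gb) / physical_gb


def headroom_gb(used_gb, physical_gb):
    """Physical space left before the pool is 'overdrawn'."""
    return physical_gb - sum(used_gb)


# Hypothetical pool: 10 TB physical, four thin volumes of 5 TB each.
physical = 10000
provisioned = [5000, 5000, 5000, 5000]
used = [1500, 2200, 900, 3100]

print("oversubscription:", oversubscription_ratio(provisioned, physical))
print("headroom (GB):", headroom_gb(used, physical))</pre>
<p>With these made-up numbers the pool has promised twice what it owns (a ratio of 2.0) and has 2300 GB of real space left. Capacity management means someone is watching that second number before it hits zero.</p>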
<p>I will be writing more about thin provisioning in the coming weeks. Until then, I continue to applaud Symantec, 3PAR, HDS, and HP for their work in making this technology somewhat more practical. Now how about VMware, Microsoft, Sun, and the Linux guys <a rel="nofollow" href="http://storagebod.typepad.com/storagebods_blog/2008/10/thin-provisioning---saviour-of-the-universe.html"  target="_blank">getting some thin technology going</a>, too?</p>
<blockquote><p>See my posts on <a href="http://gestaltit.com/author/stephen/"  target="_blank">Gestalt IT</a> for similar <a href="http://gestaltit.com"  target="_blank">enterprise IT infrastructure commentary</a></p>
</blockquote>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2008. |
<a href="http://blog.fosketts.net/2008/10/24/symantec-thin-api/">Symantec&#8217;s Thin API: The Plot Thickens</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>, <a href="http://blog.fosketts.net/category/everything/virtualstorage/" title="View all posts in Virtual Storage" rel="category tag">Virtual Storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2008/10/24/symantec-thin-api/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Sailing the Titanic (Why We Need ILM and Then Some!)</title>
		<link>http://blog.fosketts.net/2007/07/24/sailing-the-titanic-why-we-need-ilm-and-then-some/</link>
		<comments>http://blog.fosketts.net/2007/07/24/sailing-the-titanic-why-we-need-ilm-and-then-some/#comments</comments>
		<pubDate>Tue, 24 Jul 2007 14:40:26 +0000</pubDate>
		<dc:creator>Stephen</dc:creator>
				<category><![CDATA[Enterprise storage]]></category>
		<category><![CDATA[blogketing]]></category>
		<category><![CDATA[data classification]]></category>
		<category><![CDATA[data growth]]></category>
		<category><![CDATA[green storage]]></category>
		<category><![CDATA[ILM]]></category>
		<category><![CDATA[SRM]]></category>
		<category><![CDATA[tiered storage]]></category>

		<guid isPermaLink="false">http://blog.fosketts.net/2007/07/24/sailing-the-titanic-why-we-need-ilm-and-then-some/</guid>
		<description><![CDATA[Without getting into the debate on blogketing (I&#8217;ll save that for another post), I was pretty impressed by Chuck Hollis&#8217; recent post on ILM. I think he&#8217;s made a good discussion of the wherefores of ILM, and maybe counteracted a bit of the prevailing anti-ILM argument. I&#8217;ve been in the trenches on storage content (aka [...]]]></description>
			<content:encoded><![CDATA[<p>Without getting into the debate on blogketing (I&#8217;ll save that for <a href="http://blog.fosketts.net/2007/07/23/blogketing-revisited/" >another post</a>), I was pretty impressed by Chuck Hollis&#8217; <a rel="nofollow" href="http://chucksblog.typepad.com/chucks_blog/2007/07/so-where-is-ilm.html" target="_blank" >recent post</a> on ILM. I think he&#8217;s offered a good discussion of the wherefores of ILM, and maybe counteracted a bit of the prevailing <a href="http://www.drunkendata.com/?p=1231" target="_blank" >anti-ILM argument</a>.</p>
<p>I&#8217;ve been in the trenches on storage content (aka data) for a long time. I, too, have reverted to the old &#8220;gigs of MP3s and porn&#8221; argument from time to time. But I&#8217;ve done enough filesystem assessments at real companies to realize that that&#8217;s not really the norm. In fact, I&#8217;ve rarely found much porn, music, video, or jokes on full-up corporate file servers. And I&#8217;ve analyzed enough storage environments to know that, while file servers are big, they&#8217;re not normally the majority user of storage in large data centers.</p>
<p>On the contrary, most enterprise storage is taken up by business applications, though not necessarily critical data. Email, backup, and certainly user file servers are big space users. But give me a few Oracle instances, source code repositories, or image processing servers, and watch those applications shrink in significance.</p>
<p>No matter what the application, though, the real issue with storage growth (and ILM) is the (in)ability of IT managers to do anything about it. Let&#8217;s say we had permission to delete really inappropriate data, which is <em>not</em> a sure thing. Would we IT folks even be able to recognize it? How would we locate it? Can we even view user files without violating user trust, company privacy policies, or even laws? Many countries (yes, not all data is in the USA) regulate access to data even inside a company.</p>
<p>Now let&#8217;s move into grayer areas of &#8220;unnecessary&#8221; corporate data. Many storage administrators can&#8217;t even name the applications that take up all that space, let alone understand the intricacies of the data under management.  To make a timely (and tired) Harry Potter analogy, IT are the <a rel="nofollow" href="http://en.wikipedia.org/wiki/House_elf" target="_blank" >house-elves</a> of the business &#8211; powerful but subservient, with little input into what happens above and around them.  I&#8217;ve talked to business people who don&#8217;t want IT to have any input, relegating them to order takers and laborers.</p>
<p>This is a dangerous slide, however.  Lots of people have the capability to take IT orders and keep the lights on,  a realization that leads to outsourcing.  IT pros must prove their worth to the business in order to remain relevant and irreplaceable!</p>
<p>ILM is one way to do that.  To get back to Chuck&#8217;s post, we need to take the reins and try to understand data better.  We need to pick certain applications that lend themselves to automated data classification and tiered storage and try to get them under control.  Email is a great candidate, and that&#8217;s why email archiving applications have taken off recently.  File servers are coming along, too, especially with file virtualization in the ascendancy.</p>
<p>I&#8217;m particularly excited about what a smart IT manager I know called the &#8220;second wave&#8221; of SRM tools.  Rather than just collecting stock metadata (age, name, owner, etc), the latest filesystem scanning tools look inside files, trying to better classify them.  Let&#8217;s say 1/4 of your file server is made up of Microsoft Word, Excel, and PowerPoint documents.  What can you do about that unless you can identify which are critical and which are not?  Each business will have its own criteria, and you need a flexible tool to scan them all and report back to you before you can &#8220;ILM&#8221; them.  That&#8217;s what lots of software vendors are currently working on, and though we&#8217;re at an early stage still, the results are promising.</p>
<p>Sadly, though, we in IT may soon find that we just can&#8217;t delete anything.  Even totally banned content like porn could be critical to a legal case against an employee,  and it won&#8217;t be long before we are expected to keep everything that shows up on our servers for a very long time.  Most companies have policies for hardcopy document retention, and many are currently diving into the world of data policy as well.  The default policy may be &#8220;keep until we decide what to do with it&#8221;, and this could cause the current trend of storage growth to accelerate!</p>
<p>If we can&#8217;t delete data, we will be forced to sail the Titanic rather than sink it.  Small companies can benefit most from the falling price of storage, since the entire storage footprint for a little shop is often under a terabyte.  But larger organizations will find that they need to start tiering their storage, and quickly, in order to keep prices under control.</p>
<p>And then there&#8217;s green storage.   Again, Mr. Toigo makes the very valid point that the problem is in the business, not in the hardware we use.  But if we can&#8217;t do anything about data growth for the time being, we had better start tackling the technical challenges we face.  I&#8217;ve talked to many IT folks who are very worried about data center space, as well as the terrifying trio of heat, power, and cooling.  For them, green technologies are no laughing matter!  If you can&#8217;t get any more power, you have to lower your per-GB requirement and quickly.</p>
<p>It&#8217;s easy to say &#8220;understand your data and delete some&#8221;, but hard for IT pros to  actually do it.  Until we can tackle the strategic issue of data growth, we&#8217;ll have to continue fighting the tactical problems of storage.</p>
<hr />
<p><small>© sfoskett for <a href="http://blog.fosketts.net">Stephen Foskett, Pack Rat</a>, 2007. |
<a href="http://blog.fosketts.net/2007/07/24/sailing-the-titanic-why-we-need-ilm-and-then-some/">Sailing the Titanic (Why We Need ILM and Then Some!)</a>
<br/>
This post was categorized as <a href="http://blog.fosketts.net/category/everything/enterprisestorage/" title="View all posts in Enterprise storage" rel="category tag">Enterprise storage</a>. Each of my categories has its own feed if you'd like to filter out or focus on posts like this.<br/>
</small></p>]]></content:encoded>
			<wfw:commentRss>http://blog.fosketts.net/2007/07/24/sailing-the-titanic-why-we-need-ilm-and-then-some/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

