February 12, 2012

Microsoft Adds Data Deduplication to NTFS in Windows 8

Windows 8 server editions will include a filter driver for NTFS for data deduplication

The next version of Microsoft Windows Server includes integrated data deduplication technology. Microsoft is positioning this as a boon for server virtualization and claims it has very little performance impact. But how exactly does Microsoft’s deduplication technology work?

How Does Dropbox Store Data?

On my Mac, Dropbox clearly uses a 4 MB "chunk" size for deduplication

Dropbox recently clarified (via their blog and privacy policy) that they “de-duplicate” user files. This has been known for quite a while, and is obvious to anyone who’s had a large file “upload” instantly. But how exactly does Dropbox store files? Are they really de-duplicated or just single-instanced? I set out to discover the answer.
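To make the distinction concrete, here is a minimal sketch of chunk-level deduplication, assuming fixed-size chunks (4 MB by default, matching the observation above) and SHA-256 as the chunk fingerprint. The `ChunkStore` class, its method names, and the choice of hash are hypothetical illustrations, not a description of how Dropbox actually stores data.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, the chunk size observed above (assumed fixed)


class ChunkStore:
    """Hypothetical chunk-level store: each unique chunk is kept only once."""

    def __init__(self, chunk_size: int = CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}  # chunk hash -> chunk bytes
        self.files = {}   # file name -> ordered list of chunk hashes

    def put(self, name: str, data: bytes) -> int:
        """Store a file and return how many bytes were newly stored."""
        new_bytes = 0
        hashes = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks never seen before consume space (or upload time).
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            hashes.append(digest)
        self.files[name] = hashes
        return new_bytes

    def get(self, name: str) -> bytes:
        """Reassemble a file from its chunk list."""
        return b"".join(self.chunks[h] for h in self.files[name])
```

With a store like this, a second file that differs from an existing one in a single chunk adds only that chunk. A single-instance store, by contrast, would fingerprint the whole file, so any difference at all would force the entire file to be stored again.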

The Three Requirements To Overcome Inertia

Once something is in place, it's hard to get it to move again

In Philosophiæ Naturalis, Sir Isaac Newton defined inertia. Although he was referring to physical objects, the power of inertia affects companies, markets, and relationships in the same manner. Humans are creatures of habit, and change is challenging. When faced with a choice of continuing along the same road or branching off in a new direction, most will choose familiarity.

IBM’s Storwize V7000: 100% SVC; 0% Storwize

Green = SVC 5; Pink = SVC 6.1. No Storwize.

Today, IBM alerted the world that they had not fallen asleep at the wheel by kicking out an awfully impressive midrange storage array, the Storwize V7000. This seems like an excellent device, filled with proven engineering borrowed from the successful SAN Volume Controller (SVC) line of storage virtualization products. But closer examination (and IBM’s own Tony Pearson) reveals that it contains exactly nothing from their Storwize acquisition apart from the name.

Is Deduplication Ready for Prime Time?

Deduplication is here for backup, but it is not yet ready for prime time in primary storage applications

Deduplication Coming to Primary Storage

Stacker dominated the disk compression world - until Microsoft introduced DOS 6.0

Although deduplication of storage is nothing new, with Data Domain and others making hay with the technique for years, it has never been ready for prime time – reduction of active primary storage applications like email and databases. Instead, deduplication has been relegated to second- or third-tier status, deduplicating archives and backup data. But change is in the air, and deduplication vendors are starting to bustle towards the bright lights of primary storage.

greenBytes Embraces and Extends ZFS

I’ve long hollered that ZFS is a real storage revolution in the making, but recognized that it still had a way to go before replacing UFS, HFS+, and most volume managers. Well, a little Rhode Island company called greenBytes comes out of stealth today to announce that they’re doing just that – taking the solid [...]

Jargon Watch: EMC 3D = Data Deduplication

Watching the announcements coming out of EMC World today, one bit of jargon stuck out at me: The EMC bloggers are starting to refer to “data deduplication” as “3D”. I had never heard this terminology before yesterday, but the EMCers are all using it, so it must be a popular term inside that company. So [...]