Thinking About Storage In a New Way, From Cloud to Flash, with Dropbox and Fusion-io

July 23, 2013 By Stephen

I’ve been a storage revolutionary for quite a while, looking for new approaches to data storage rather than technologies that perpetuate the same old ones. That’s why I’m excited about the implications of two very different APIs, one announced by Dropbox at DBX and the other by Fusion-io today at OSCON.

We need smarter, more integrated storage.

Dropbox App Storage

On the consumer side, there’s Dropbox, which announced its Sync and Datastore APIs at the DBX conference this month. These APIs are interesting in and of themselves, but more so when one takes a big-picture look at them: Dropbox is challenging the whole concept of mobile device storage, suggesting that what sits on the device matters mostly as a temporary cache of a cloud datastore.

It would be foolish to argue that distributed sync and key/value datastores aren’t taking over the mobile world, but these things are devilishly difficult to get right. Witness Apple’s continuing stumbles with iCloud, for example. But Dropbox has done a fantastic job of making sync work in the real world, and now they’re opening those systems to developers.

http://vimeo.com/70089044

But Dropbox is also explicitly challenging the whole notion of file-based storage. This is another huge, and overdue, revelation. Why bother with files when what today’s apps really need is key/value lookups and blob storage? Watch the keynote above and consider what they’re saying: It’s a new world, and this is what storage ought to look like.
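To make the "temporary cache of a cloud datastore" idea concrete, here is a minimal sketch in Python. It is not the Dropbox Sync or Datastore API: the `LocalCache` class, its methods, and the plain dict standing in for the cloud service are all hypothetical names invented for illustration. It only shows the shape of the model, in which apps read and write keyed records locally and a sync step reconciles them with the cloud.

```python
class LocalCache:
    """Hypothetical on-device cache of a remote key/value datastore."""

    def __init__(self, remote):
        self.remote = remote   # stands in for the cloud datastore
        self.records = {}      # local copy, usable while offline
        self.pending = []      # changes not yet pushed to the cloud

    def put(self, key, fields):
        # Writes land locally first, so the app keeps working offline.
        self.records[key] = dict(fields)
        self.pending.append((key, dict(fields)))

    def get(self, key):
        return self.records.get(key)

    def sync(self):
        # Push queued changes, then refresh the local copy from the cloud.
        for key, fields in self.pending:
            self.remote[key] = fields
        self.pending.clear()
        self.records = {k: dict(v) for k, v in self.remote.items()}


cloud = {}                     # a plain dict plays the cloud service
phone = LocalCache(cloud)
tablet = LocalCache(cloud)

phone.put("task:1", {"title": "Buy milk", "done": False})
phone.sync()                   # phone comes online and pushes its change
tablet.sync()                  # tablet picks the record up on its next sync
```

Note what the sketch leaves out: it has no conflict handling at all. If the phone and tablet both edit "task:1" while offline, the last one to sync silently wins, and that is exactly the part that is devilishly difficult to get right.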

Fusion-io NVM Access

I had spent a week digesting the implications of Dropbox’s mobility-enhancing storage strategy when Fusion-io contacted me about their announcements at Open Source Convention (OSCON) 2013. After a discussion with Brent Compton, Sr. Director of Product Management, I am convinced that this seemingly unrelated announcement actually has a lot in common with it!

Fusion-io is announcing three contributions to open source:

A key-value interface to flash, NVMKV – This is an API spec and library, along with source code, enabling applications to store data directly in non-volatile memory (e.g. Fusion-io flash cards) using key/value pairs rather than conventional file or block I/O.
A modification of the Linux VM subsystem enabling better use of demand paging from non-volatile memory. Paging was basically given up for dead among Linux server guys, but makes a whole lot more sense in an NVM world!
API specs for the Fusion-io flash translation layer enabling atomic access to non-volatile memory, including vectored atomic writes. This has already been implemented in MariaDB 5.5.31 and Percona Server 5.5.31-30.
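For readers who have not thought about demand paging in a while, the second item is easier to picture with a small example. The sketch below uses Python’s standard `mmap` module to map an ordinary temporary file into memory. It shows the general operating system mechanism only, not Fusion-io’s VM modification, and the file here is just a stand-in for data living on flash.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE

# Create a file 64 pages long; imagine it sitting on a flash device.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    f.write(b"\0" * (PAGE * 64))

with open(path, "r+b") as f:
    mem = mmap.mmap(f.fileno(), 0)      # map the whole file into memory
    # Touching one offset makes the OS page in only the page that holds it;
    # the rest of the file is never read.
    mem[PAGE * 10:PAGE * 10 + 5] = b"hello"
    in_memory = bytes(mem[PAGE * 10:PAGE * 10 + 5])
    mem.flush()                         # write the dirty page back to the file
    mem.close()

# Read the same bytes back through ordinary file I/O.
with open(path, "rb") as f:
    f.seek(PAGE * 10)
    on_disk = f.read(5)

os.remove(path)
```

The application treats the file as if it were memory and lets the kernel fetch pages when they are touched. With spinning disks each of those page faults was painfully slow, which is why paging fell out of favor; with low-latency NVM behind the mapping the trade-off looks very different.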

Let’s take a moment to consider the Fusion-io announcement in a larger context: They’re not talking about conventional storage paradigms anymore. Instead, Fusion-io is pushing specific enhancements to real-world applications to use storage in a new way. This is the Internet datacenter flipside of the Dropbox announcement!

Here’s an example: MySQL databases (with InnoDB) use a doublewrite technique to ensure consistency: They write each page twice, first to a doublewrite buffer and then to its final location, with an fsync() to make sure the first copy is safely on disk before proceeding. This is necessary because block storage can result in partial writes: Perhaps the system crashes after one or two 4K blocks were written, losing the remaining blocks and leaving the data in an inconsistent state. But Fusion-io’s atomic write API pushes responsibility for data consistency to the storage device: Issue an atomic write and the ioMemory card will ensure the data is written completely or not at all. So MySQL can skip the doublewrite and just commit data directly.
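The difference is easy to simulate. The following Python sketch models a device as a byte array and a database page as four 4K blocks, and compares a plain write, a doublewrite, and an atomic write when power fails partway through. Everything in it is an assumption made for illustration: the page layout with matching version stamps at head and tail, the function names, and the `Crash` exception are hypothetical, and none of it is InnoDB’s or Fusion-io’s actual code. The fsync() between the two copies is implied rather than modeled.

```python
BLOCK = 4096
PAGE = 4 * BLOCK          # assumption: one database page spans four 4K blocks
HOME, SPARE = 0, PAGE     # the page's real location and the doublewrite area


class Crash(Exception):
    """Simulated power loss."""


def make_page(version, fill):
    # The same version stamp at head and tail lets us detect a torn page.
    stamp = version.to_bytes(4, "big")
    return stamp + fill * (PAGE - 8) + stamp


def is_valid(page):
    return page[:4] == page[-4:]


def block_write(dev, offset, data, crash_after=None):
    # Block storage: each 4K block lands separately, so a crash can tear a page.
    for n, i in enumerate(range(0, len(data), BLOCK)):
        if crash_after is not None and n >= crash_after:
            raise Crash
        dev[offset + i:offset + i + BLOCK] = data[i:i + BLOCK]


def doublewrite(dev, data, crash_after=None):
    block_write(dev, SPARE, data)              # 1. full copy to the side buffer
    block_write(dev, HOME, data, crash_after)  # 2. then the real location


def recover(dev):
    # After a crash, repair a torn home page from the intact spare copy.
    if not is_valid(dev[HOME:HOME + PAGE]) and is_valid(dev[SPARE:SPARE + PAGE]):
        dev[HOME:HOME + PAGE] = dev[SPARE:SPARE + PAGE]


def atomic_write(dev, data, crash=False):
    # Hypothetical atomic device: the whole page lands, or none of it does.
    if crash:
        raise Crash
    dev[HOME:HOME + PAGE] = data


def survive(write, *args, **kwargs):
    try:
        write(*args, **kwargs)
    except Crash:
        pass


old, new = make_page(1, b"A"), make_page(2, b"B")


def fresh_device():
    dev = bytearray(2 * PAGE)
    dev[HOME:HOME + PAGE] = old
    return dev


plain = fresh_device()
survive(block_write, plain, HOME, new, crash_after=2)
plain_ok = is_valid(plain[HOME:HOME + PAGE])   # torn: half new, half old

doubled = fresh_device()
survive(doublewrite, doubled, new, crash_after=2)
recover(doubled)
doubled_page = bytes(doubled[HOME:HOME + PAGE])  # repaired from the spare copy

atomic = fresh_device()
survive(atomic_write, atomic, new, crash=True)
atomic_page = bytes(atomic[HOME:HOME + PAGE])    # still the old page, intact
```

The doublewrite run ends up consistent, but it paid for that by writing every page twice. The atomic run is consistent after writing once, which is the whole appeal.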

Atomic writes are practical for non-volatile memory devices because this is how they already function internally: Flash cannot be overwritten in place, so the controller writes new data to a fresh location and only then switches its mapping over. It’s a simple matter to expose this to the application and enable atomic writes. This has never been possible before because spinning disks just don’t work this way!

Fusion-io can even do vectored I/O, enabling multiple disjoint buffers to be updated in one atomic operation. Vectored I/O is also known as scatter-gather, and doing it atomically is another new paradigm for data storage, since disks simply could not do this.
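Here is a rough Python sketch of what "vectored and atomic" means in practice. The `FakeAtomicDevice` class and its `atomic_writev` method are hypothetical names for a simulated device and are not the Fusion-io API. The point is only the semantics: several writes to disjoint offsets are staged together and become visible all at once or not at all.

```python
class Crash(Exception):
    """Simulated power loss."""


class FakeAtomicDevice:
    """Simulated device offering an all-or-nothing vectored write."""

    def __init__(self, size):
        self.data = bytearray(size)

    def atomic_writev(self, vectors, crash=False):
        # vectors is a list of (offset, bytes) pairs at disjoint locations.
        staged = bytearray(self.data)          # work on a private copy
        for offset, buf in vectors:
            if offset < 0 or offset + len(buf) > len(staged):
                raise ValueError("write falls outside the device")
            staged[offset:offset + len(buf)] = buf
        if crash:
            raise Crash                        # power lost before the commit
        self.data = staged                     # commit every buffer together


dev = FakeAtomicDevice(64)
# Two account balances stored far apart on the device.
dev.atomic_writev([(0, b"0100"), (32, b"0000")])

# Move 40 from the first account to the second, but lose power mid-way.
try:
    dev.atomic_writev([(0, b"0060"), (32, b"0040")], crash=True)
except Crash:
    pass
after_crash = (bytes(dev.data[0:4]), bytes(dev.data[32:36]))

# Retry without the crash: both balances change together.
dev.atomic_writev([(0, b"0060"), (32, b"0040")])
after_commit = (bytes(dev.data[0:4]), bytes(dev.data[32:36]))
```

For comparison, POSIX writev() gathers several memory buffers into a single write to one contiguous range, and it promises nothing about what survives a crash. Updating disjoint locations as one unit is the part the application previously had to build for itself, typically with a journal.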

Then there’s the key/value API. This has exciting implications for databases, sure, but it goes well beyond that. Just like mobile applications, today’s “big data” systems (yeah, I said it) need key/value access, not just block storage. This has long been handled by intermediary applications (databases), and those will likely continue. But instead of managing data on block storage, databases can commit a shard to an NVM key/value interface like the one Fusion-io just offered.
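A small Python sketch shows how much bookkeeping disappears. Both classes below are hypothetical and neither reflects the real NVMKV API. `BlockBackedKV` does what software has to do today to offer key/value access on top of a block device, while `NativeKV` stands in for a device that accepts keys and values itself.

```python
BLOCK = 512


class BlockBackedKV:
    """Key/value access built by software on top of a simulated block device."""

    def __init__(self, blocks):
        self.dev = bytearray(blocks * BLOCK)
        self.index = {}          # key -> (first block, length in bytes)
        self.next_free = 0       # naive allocator: never reuses freed space

    def put(self, key, value):
        needed = max(1, (len(value) + BLOCK - 1) // BLOCK)
        if self.next_free + needed > len(self.dev) // BLOCK:
            raise OSError("device full")
        start = self.next_free
        self.next_free += needed
        self.dev[start * BLOCK:start * BLOCK + len(value)] = value
        self.index[key] = (start, len(value))

    def get(self, key):
        start, length = self.index[key]
        return bytes(self.dev[start * BLOCK:start * BLOCK + length])


class NativeKV:
    """Stand-in for a device that stores key/value pairs directly."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = bytes(value)

    def get(self, key):
        return self._store[key]


results = []
for store in (BlockBackedKV(16), NativeKV()):
    store.put(b"user:42", b'{"name": "Stephen"}')
    results.append(store.get(b"user:42"))
```

Both stores answer the same question, but the block-backed one has to own an index, an allocator, and padding to block boundaries, and this toy version does not even reclaim space or persist its index. With a key/value interface on the device, the flash translation layer that already maps logical to physical locations takes over that role.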

Stephen’s Stance

It’s not that we need a new storage array or company. Rather, we need a whole new way of “doing storage” that reflects the changing reality of both clients (mobile devices, web applications, etc.) and storage targets (flash vs. disk). We already live in this new world of key/value datastores, but storage technology has been slow to adapt. That’s why I’m excited about Dropbox, Fusion-io, and so many other “new ideas” companies!

It’s important to recognize that the “old school” companies aren’t completely unaware of this shift. The SNIA NVM Programming working group includes lots of familiar names in storage (Dell, EMC, HP, IBM, Intel, NetApp, Oracle, etc.), all looking beyond today’s spinning disks. Just like me, they recognize that non-volatile memory is the future.

