Intelligently placing data into a variety of formats and across geographic locations is non-trivial. With data protection, however, this isn’t just a nice-to-have; it’s often a functional design requirement. Doing so provides layers of safeguards against data-specific failures as well as local or regional catastrophes. In this fifth and final deep-dive series post, I’m going to pick apart how Converged Data Management offers a truly Cloud Native experience for data protection workflows, and how that differs from traditional approaches.
There’s a fundamental difference between adapting a platform to take advantage of cloud data services, such as public object storage with Amazon S3, and making those services a native part of the platform. The difference boils down to one metric: simplicity. The complexity that surrounds a platform drives inefficiency and increases the chance of error. The foundation of a platform also dictates which features and properties are available for a long-term strategy. Without re-writing the platform from scratch – something largely avoided in the enterprise market – the choices are limited. After all, the spin-out and spin-in model used by large corporations to innovate doesn’t exist without reason.
In the case of Converged Data Management, being Cloud Native means that ingested data is designed, at the lowest levels, to be placed into some sort of cloud-like data service. I’ll just call it object storage, since “cloud” is a nebulous term that often distills down to “someone else’s data center.” Object storage can also be provisioned and consumed in both on- and off-premises flavors, or even a hybrid mixture of the two. There are very few caveats when solving design challenges (requirements, risks, constraints, and assumptions) with a scalable storage back-end, so it’s a direction worth exploring.
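To make that hybrid point concrete, here’s a minimal sketch – my own illustration, not anything from Rubrik’s codebase – of how the same S3-compatible API can be pointed at either a public cloud bucket or an on-premises object store simply by swapping the endpoint. The endpoint, bucket name, and credentials are all hypothetical.

```python
# Sketch: the same S3-compatible calls work against public cloud or an
# on-premises object store; only the endpoint and credentials change.
import boto3

# Public cloud flavor (Amazon S3)
public_s3 = boto3.client("s3", region_name="us-east-1")

# On-premises flavor (any S3-compatible object store inside the data center;
# the endpoint below is a hypothetical internal address)
private_s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.corp.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Identical operations against either flavor, which is what makes a hybrid
# mixture of the two practical.
public_s3.put_object(Bucket="archive-bucket", Key="backups/chunk-0001", Body=b"...")
private_s3.put_object(Bucket="archive-bucket", Key="backups/chunk-0001", Body=b"...")
```

Because the consumption model is identical on both sides, the choice of where an archive lives becomes a policy decision rather than an architectural one.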
It’s important, I believe, to contrast this against systems that were designed with magnetic tape, and perhaps spinning disk, as the primary archive target for the solution’s fabric. For these legacy architectures, adding a cloud target is often just another repository in which to store a flat file or virtual tape file. Cloud storage becomes a simple bucket of space, and much of the intelligence that object storage offers is ignored. This is a shame, since using objects efficiently is really the only way to squeeze every drop of value from such a system. It also limits the ability to restore data from a “cloudy” object store in a native fashion – often, the files must be pulled down into the data center, cracked open, and individual files or folders restored to the intended location, as sketched below.
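For illustration, this is roughly what that legacy pattern looks like in practice. It’s a sketch under my own assumptions, with hypothetical bucket and file names, and isn’t pulled from any particular product.

```python
# Sketch of the legacy pattern: the cloud target is treated as a dumb bucket
# holding a large flat archive, so recovering one file means downloading and
# unpacking the whole thing first.
import tarfile
import boto3

s3 = boto3.client("s3")

# Pull the entire backup image back into the data center (full egress cost) ...
s3.download_file("archive-bucket", "backups/vm-web01-full.tar", "/tmp/vm-web01-full.tar")

# ... crack it open locally ...
with tarfile.open("/tmp/vm-web01-full.tar") as archive:
    # ... and restore just the single file that was actually requested.
    archive.extract("var/www/index.html", path="/restore")
```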
Pulling data from an external object store has cost ramifications in terms of bandwidth consumed. Most public providers make a decent bit of coin by allowing unlimited data to enter their data centers for free while charging a fee for every gigabyte that leaves. The amount of data that traverses the wire correlates directly with the cost of restoring it. When you only need a single file, having to pull down a much larger set of data – such as a VMDK or an entire VM – is expensive, not to mention hard on the available bandwidth between the external object store and the data center requesting the restore.
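To put rough numbers on that, here’s a back-of-the-napkin calculation. The $0.09/GB egress rate and the object sizes are illustrative assumptions, not quoted pricing from any provider.

```python
# Rough illustration of why restore granularity matters for egress cost.
EGRESS_RATE_PER_GB = 0.09  # assumed public-cloud egress price, illustrative only

def egress_cost(gb_transferred: float) -> float:
    return gb_transferred * EGRESS_RATE_PER_GB

# Pulling back a whole 60 GB VMDK just to recover one file inside it:
print(f"Full VMDK restore:   ${egress_cost(60):.2f}")    # $5.40 per restore

# Pulling back only the 0.01 GB (10 MB) file actually needed:
print(f"Single-file restore: ${egress_cost(0.01):.4f}")  # $0.0009 per restore
```

Multiply that gap by the number of restore requests a team fields in a year, and the difference between a coarse and a granular restore path stops being academic.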
Getting data into the cloud is for amateurs. Getting data back out is for experts. With Converged Data Management, only the unique, deduplicated, and compressed data needed to fulfill a restore request is transmitted over the WAN. In the case of a single-file restore, this translates into barely sipping the cloud for data. Assuming most of the file already exists on-premises, which is almost always the case for data that lives in a globally deduplicated, scalable storage system, only a very small piece of the file – the missing deduplicated and compressed differences – needs to come back from the cloud. The result is a very peppy restore that is easy on the wallet and on the available WAN bandwidth.
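Conceptually, the restore path looks something like the sketch below. This is my own simplified illustration of a deduplication-aware restore, not Rubrik’s actual code, and every name in it is hypothetical.

```python
# Sketch: reassemble a file from its chunk fingerprints, fetching from the
# cloud only the chunks that are missing from the local cluster.
from typing import Callable, Dict, List

def restore_file(manifest: List[str],
                 local_chunks: Dict[str, bytes],
                 fetch_from_cloud: Callable[[str], bytes]) -> bytes:
    """Rebuild a file from its ordered list of chunk fingerprints.

    manifest         -- ordered chunk fingerprints that make up the file
    local_chunks     -- fingerprint -> chunk data already held on-premises
    fetch_from_cloud -- callable that downloads a chunk by fingerprint
    """
    data = bytearray()
    pulled = 0
    for fingerprint in manifest:
        chunk = local_chunks.get(fingerprint)
        if chunk is None:
            # Only this path consumes WAN bandwidth and incurs egress charges.
            chunk = fetch_from_cloud(fingerprint)
            local_chunks[fingerprint] = chunk  # cache for future restores
            pulled += 1
        data.extend(chunk)
    print(f"Reassembled {len(manifest)} chunks; fetched only {pulled} from the cloud")
    return bytes(data)
```

The design choice worth noticing is that the local deduplication index does double duty: it shrinks the archive footprint and it tells the restore path exactly which bytes are worth paying egress for.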
Now that I’ve unwrapped the five major features of Converged Data Management, I invite you to take a deeper look at the solution that Rubrik has to offer. If you’re looking to dramatically simplify your data protection strategy, ditch the various silos needed to perform backups and restores, and empower your team with slick RESTful APIs for lifecycle management, it’s worth taking a look or requesting a demo. I’ll also be releasing a number of short-form webinars that dig deeper into our r300 Appliance and help paint a picture of what life is like when you use a next-generation platform to protect your data.