How We Built More Efficient Data Archival with Cloud
The move to cloud is no longer a question of if but rather when. However, enterprises are still unsure how to adopt a cloud strategy within their own environments. As our CEO Bipul Sinha stated at the Looking AHEAD Tech Summit, in order to increase cloud adoption, “companies need to create killer applications to leverage the cloud.” At Rubrik, we create applications that help enterprises transition to cloud seamlessly. The first step on that path is archiving backups.
The challenge of archiving to public cloud is ensuring that data can be pulled down into an on-premises location without breaking the bank or your recovery time objectives. This is where Rubrik works its magic. When Rubrik manages your data, it keeps a record of the metadata that is quickly accessible without data rehydration. You can locate VMs and files instantly with Google-like search. Just type a few letters into Rubrik’s predictive search engine, and you’ll get served results instantly.
In this post, I will describe how Rubrik archives data and makes data rehydration fast and efficient.
We run a job per VM that archives snapshots according to the SLA policy configured for that VM. When an upload job starts, it determines whether it needs to upload a full snapshot or an incremental snapshot. If it needs to upload a full, it constructs a full snapshot from the locally stored incremental files. Otherwise, it uploads the diff between the snapshot being archived and the last snapshot that was archived.
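The decision an upload job makes can be sketched in a few lines of Python. This is only an illustration, not Rubrik's implementation: the function name `plan_upload`, the `UploadPlan` type, and the `max_chain_length` threshold are all hypothetical, and the real heuristics for choosing a full upload are not described in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UploadPlan:
    kind: str             # "full" or "incremental"
    base: Optional[str]   # archived snapshot the diff is taken against, if any


def plan_upload(last_archived: Optional[str],
                chain_length: int,
                max_chain_length: int = 6) -> UploadPlan:
    """Decide what an upload job should send for the next snapshot.

    Assumption: a full is uploaded when nothing has been archived yet, or
    when the current chain has reached a hypothetical length limit.
    """
    if last_archived is None or chain_length >= max_chain_length:
        # A full is built from the locally stored incremental files.
        return UploadPlan(kind="full", base=None)
    # Otherwise upload only the diff against the last archived snapshot.
    return UploadPlan(kind="incremental", base=last_archived)
```

For example, `plan_upload(None, 0)` asks for a full, while `plan_upload("snap-1", 1)` asks for an incremental based on `snap-1`.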
Occasionally, we upload fulls to avoid long chains of deltas in the archive. Keeping disjoint chains in the archival store helps us delete old snapshots and save space.
Let’s look at an example. In this scenario, the first full snapshot through incremental I5 form one chain. Following certain heuristics, we then decide to upload another full snapshot, so the second full through I10 form a second snapshot chain.
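To make the chain idea concrete, here is a small Python sketch of the scenario above. It assumes, for illustration only, that a chain can be removed from the archive once every snapshot in it has expired; the names `split_into_chains`, `deletable_chains`, `F1`, and `F2` are hypothetical.

```python
from typing import List, Set, Tuple


def split_into_chains(uploads: List[Tuple[str, bool]]) -> List[List[str]]:
    """Group uploads (name, is_full), given in upload order, into chains.

    Every full snapshot starts a new chain; incrementals join the
    chain that is currently open.
    """
    chains: List[List[str]] = []
    for name, is_full in uploads:
        if is_full or not chains:
            chains.append([])
        chains[-1].append(name)
    return chains


def deletable_chains(chains: List[List[str]], expired: Set[str]) -> List[List[str]]:
    """Return the chains in which every snapshot has expired.

    Assumption: because chains are disjoint, such a chain can be dropped
    without touching the data of any other chain.
    """
    return [chain for chain in chains if all(s in expired for s in chain)]


# Two fulls, each followed by five incrementals, as in the scenario above.
uploads = ([("F1", True)] + [(f"I{i}", False) for i in range(1, 6)]
           + [("F2", True)] + [(f"I{i}", False) for i in range(6, 11)])
chains = split_into_chains(uploads)
```

Here `chains` holds two lists: `F1` through `I5`, and `F2` through `I10`. Once `F1` through `I5` have all expired, the first chain can be deleted as a unit.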
To understand how snapshot download works, let’s first understand how these snapshots are stored. Each snapshot disk is maintained as a sparse file that holds just the sections of the disk that changed in that particular snapshot. Reconstructing a snapshot involves applying deltas in chronological order. For VMs with a very high change rate, applying deltas can be a very expensive operation. Rubrik solves this problem by independently maintaining changed block metadata for each snapshot. To read a specific block of data, Rubrik consults the changed block metadata for each snapshot in the chain and reads the most recent version of that block.
In this example, we have 4 snapshots. The first is a full snapshot containing all 8 data blocks, and the remaining three are incrementals that contain only the changed data blocks. A total of 18 data blocks completely describe all the snapshots of the virtual disk. To download any specific point-in-time snapshot, we need to download only 8 data blocks: one copy of each block, taken from the most recent snapshot that contains it. This saving in downloaded data can be significant when downloading a huge snapshot, especially one with a high change rate.
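The block lookup can be sketched as follows. This is a minimal illustration rather than Rubrik's code: the function `plan_download` is hypothetical, and the specific blocks changed by each incremental are invented so that the totals match the example (8 blocks in the full plus 10 changed blocks, 18 in all).

```python
from typing import Dict, List, Set, Tuple

# Changed block metadata per snapshot, oldest first. The chain starts with
# a full snapshot; the block layout of the incrementals is made up.
chain: List[Tuple[str, Set[int]]] = [
    ("full", set(range(8))),
    ("inc1", {1, 4, 6}),
    ("inc2", {0, 1, 2, 7}),
    ("inc3", {4, 5, 6}),
]


def plan_download(chain: List[Tuple[str, Set[int]]], target: str) -> Dict[int, str]:
    """Map each block index to the snapshot it should be read from.

    Walks the chain from the target snapshot back to the full and keeps
    the first (most recent) snapshot that contains each block.
    """
    names = [name for name, _ in chain]
    upto = names.index(target)
    plan: Dict[int, str] = {}
    for name, blocks in reversed(chain[: upto + 1]):
        for block in blocks:
            plan.setdefault(block, name)
    return plan


plan = plan_download(chain, "inc3")
stored_blocks = sum(len(blocks) for _, blocks in chain)  # 18 blocks in the archive
blocks_to_download = len(plan)                           # 8 blocks for the restore
```

Only the metadata is consulted to build the plan, so no deltas have to be downloaded and applied in order. With this layout, restoring `inc3` reads block 3 from the full, blocks 0, 1, 2, and 7 from `inc2`, and blocks 4, 5, and 6 from `inc3`.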
One of the key recovery functions that a backup solution must provide is single-file restore. For example, a 10 TB file server may be backed up to the cloud, and the customer would like to restore a single 100 KB file from the cloud backup of that file server. The Rubrik cloud solution handles this common recovery scenario efficiently and economically. Each piece of data in the cloud is cataloged and indexed, and the index is stored on the Rubrik appliance. This index allows Rubrik to address the individual bytes that constitute any point-in-time copy of each file that it backs up. In the example above, the file in question can be restored with a net byte transfer roughly equivalent to the file size. This saves substantial time and money compared to naive cloud storage solutions, which require downloading the entire VM's data to restore a single file. Rubrik can index and restore files across all Windows and Unix/Linux based operating systems, and across most commonly used filesystems.
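A rough Python sketch shows why an index of byte locations keeps the transfer close to the file size. Everything here is hypothetical: the index layout, the `FakeObjectStore` class (an in-memory stand-in for a cloud object store that supports ranged reads), and the object and snapshot names.

```python
from typing import Dict, List, Tuple

Extent = Tuple[str, int, int]  # (object key, byte offset, length)


class FakeObjectStore:
    """In-memory stand-in for an object store with ranged reads."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self.objects = objects
        self.bytes_transferred = 0

    def fetch_range(self, key: str, offset: int, length: int) -> bytes:
        data = self.objects[key][offset:offset + length]
        self.bytes_transferred += len(data)
        return data


def restore_file(index: Dict[Tuple[str, str], List[Extent]],
                 store: FakeObjectStore, snapshot: str, path: str) -> bytes:
    """Rebuild one file by fetching only the extents the index lists for it."""
    extents = index[(snapshot, path)]
    return b"".join(store.fetch_range(key, off, size) for key, off, size in extents)


# Two large archived objects; the file occupies 11 bytes spread across both.
store = FakeObjectStore({
    "disk-0001": b"x" * 4096 + b"hello " + b"y" * 4096,
    "disk-0002": b"z" * 100 + b"world" + b"w" * 100,
})
# The index lives locally, so looking up a file costs no cloud transfer.
index = {("snap-42", "/etc/motd"): [("disk-0001", 4096, 6), ("disk-0002", 100, 5)]}

content = restore_file(index, store, "snap-42", "/etc/motd")
```

After the call, `content` is the 11-byte file and `store.bytes_transferred` is also 11, even though the archived objects are several kilobytes in total.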
For more information regarding Rubrik’s cloud archival feature, check out our data sheet on object store archival.