The advantage of automation for archiving your protected data in storage tiers is really about one thing – optimizing storage consumption costs. Storing data in Azure Storage Archive or Cool tiers saves costs for enterprises but requires time and effort to retrieve or move that data.
Many Rubrik customers store snapshots, or point-in-time images of their workloads, in the public cloud using blob storage. Storage tiering in Azure requires the use of General Purpose v2 (GPv2) storage accounts.
Here is a table of the General Purpose v2 storage tiers:
* Data housed in the Premium tier cannot be migrated to any other tiers currently
Let’s dive a bit deeper into the use cases for this tiering. For example, suppose you have an audit requirement to store certain year-end financial data for up to 24 months and keep it readily accessible for the first 6 months. That data will be stored in the Hot tier for the first 6 months, then moved to the Archive tier, where it will stay for the remaining 18 months until it expires and is retired.
Example use case for using Archive storage tiering
In this scenario, tiering is optimal for satisfying retrieval and archive requirements, and is also very cost effective since the data only lives in the most expensive Hot tier for 6 of the total 24 months. By moving it to the Archive tier for longer retention, it’s consuming offline storage at a lower cost. Microsoft estimates that by moving infrequently accessed data to the Archive tier, monthly per GB storage costs could be reduced by up to 95%.[1]
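To make the arithmetic concrete, here is a minimal sketch that compares keeping the audit data in the Hot tier for all 24 months against 6 months in Hot followed by the remaining 18 months in Archive. The per-GB prices are placeholder assumptions chosen for illustration, not Azure's published rates, and the function name is hypothetical. Transaction, retrieval, and early-deletion charges are ignored.

```python
def storage_cost(gb, months_by_tier, price_per_gb_month):
    """Sum the monthly storage charge across every tier the data lives in."""
    return sum(gb * months * price_per_gb_month[tier]
               for tier, months in months_by_tier.items())

# Placeholder prices in USD per GB per month (assumptions, not real Azure rates).
PRICES = {"hot": 0.020, "archive": 0.001}

# 1,000 GB of year-end financial data retained for 24 months.
hot_only = storage_cost(1000, {"hot": 24}, PRICES)
tiered = storage_cost(1000, {"hot": 6, "archive": 18}, PRICES)
savings_pct = 100 * (hot_only - tiered) / hot_only

print(f"Hot only: ${hot_only:.2f}, tiered: ${tiered:.2f}, saved: {savings_pct:.1f}%")
```

The blended saving over the full retention period is smaller than the per-GB difference between the two tiers, because the first 6 months are still billed at the Hot rate.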
An intelligent lifecycle management solution that tiers data in the public cloud without manual API requests is important for saving costs. Fortunately, this is a native feature of Rubrik. Let’s walk through two automated storage tiering options available with Azure that include intelligent data lifecycle management and assist in optimizing your storage consumption: Instant Tiering and Smart Tiering.
Instant Tiering
Instant Tiering occurs when a snapshot is created for a protected object. The data is sent to the default account tier, which Rubrik recommends be the Azure Hot tier, and is then automatically moved to the Archive tier. The Archive tier copy of the snapshot has no correlation to the local cluster snapshot and is independently managed, so the local snapshot is left unchanged. Once Microsoft exposes an API to place data directly into the Archive tier, the data will be sent there directly.
A great use case for this would be if you have a large amount of employee data that must remain readily accessible for immediate retrieval, but you are also required to store an additional copy in a long-term retention archive for legal purposes. The initial snapshot would be moved into the Hot tier temporarily and then moved to the Archive tier.
Instant Tiering example with the Hot tier set as the default account tier
This feature is enabled by checking the “Archive Access Tier Only” checkbox in the Rubrik SLA Domain policy.
The Instant Tiering and Smart Tiering options in Rubrik
Smart Tiering
Rubrik takes intelligent data management with Azure archiving even further with Smart Tiering. This technology takes advantage of a sophisticated data lifecycle management task that runs to ensure the snapshots are being moved to the optimal storage tier. The audit example illustrated earlier is a good use case for Smart Tiering.
When you enable Smart Tiering on an SLA Domain, you must also set a minimum accessible duration for data sent to the archive. This value, measured in days, is how long data must remain in the default tier to support Instant Recovery activities. The advantage is that you don’t have to worry about a penalty on your Recovery Time Objective (RTO), because the minimum accessible duration guards against any tiering request made before it has elapsed.
Compared to Instant Tiering, Smart Tiering gives you the flexibility to create a complete lifecycle for data by specifying the total time duration before the snapshots are eligible to be automatically moved into the Azure Archive tier. To do this, we validate that the minimum accessible duration has been reached and that the snapshot is going to exist long enough to be worth tiering. We don’t want to tier the data and violate your needs for an Instant Recovery and a lower RTO. Additionally, if the SLA Domain is configured to retire the snapshot before it would reach the age required for the Azure storage access tier, it’s typically better to leave it alone.
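The two validations described above can be sketched as a simple eligibility check. This is an illustration only: the `Snapshot` class, the function name, and the exact comparison are assumptions, not Rubrik's implementation. The 180-day default reflects the Archive tier minimum discussed later in this article.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    age_days: int        # days since the snapshot was written to the archive
    retention_days: int  # total archival retention set by the SLA Domain

def eligible_for_archive(snap, min_accessible_days, archive_min_days=180):
    """Return True when moving the snapshot to Archive looks worthwhile."""
    # Check 1: the snapshot has been readily accessible for long enough.
    if snap.age_days < min_accessible_days:
        return False
    # Check 2: the snapshot will live in Archive long enough to clear the
    # tier's minimum storage period before the SLA Domain retires it.
    days_left = snap.retention_days - snap.age_days
    return days_left >= archive_min_days

# Audit example: 24-month retention, 6 months of ready access.
print(eligible_for_archive(Snapshot(180, 730), min_accessible_days=180))  # True
print(eligible_for_archive(Snapshot(90, 730), min_accessible_days=180))   # False
print(eligible_for_archive(Snapshot(200, 300), min_accessible_days=180))  # False
```

The third call shows the "leave it alone" case: the snapshot has passed its minimum accessible duration, but it would be retired only 100 days after tiering.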
This is a fully automated and intelligent process that can be broken down into these steps:
- Validates eligibility with algorithms that compute time restrictions and snapshot aging, and marks the snapshot with a Cool or Archive tier flag.
- Ensures that archive and retention durations specified in the SLA Domain policy are consistently followed no matter where the data is stored.
- Prevents tiering if it’s determined that the costs to store in the default tier are lower than the combined costs of storage in the Archive tier, and compute costs of constructing the reverse chains, as described in the next section.
- Analyzes snapshot chains and, if required, takes action on them by eliminating any dependencies.
- Factors in and adjusts for the minimum accessible duration times per tier so that the impact of any data retrieval penalties is minimal. You can read more about those penalties in this Microsoft documentation[2].
Blob Chains
Within Rubrik, snapshots are archived as forward chains of incremental data blobs called blob chains. These blob chains may incur dependencies between the blobs depending on how the SLA Domain is configured for Archiving. When tiering to the Azure cloud, it would not be responsible to arbitrarily move the blobs of any older snapshots that have dependencies to the Archive tier, as this would block the ability to retrieve recent snapshots and certainly would not be cost effective.
To resolve this, Rubrik first checks to see if any of these chain dependencies exist, and if found, a transient compute instance called Bolt is spun up and eliminates them. Once the chain is successfully reversed, the snapshots are again eligible for tiering.
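To see why chain direction matters, consider a simplified model in which each snapshot owns one blob. In a forward chain the oldest blob is the full copy and every later snapshot builds on it; in a reversed chain the newest blob is the full copy. The sketch below is an assumption-laden illustration with hypothetical function names. It does not describe the blob layout Rubrik or Bolt actually uses.

```python
def blobs_needed(snapshot, chain_length, direction):
    """Indexes of the blobs that must be readable to restore one snapshot.

    Index 0 is the oldest snapshot. In a forward chain the oldest blob is
    the full copy; in a reverse chain the newest blob is.
    """
    if direction == "forward":
        return set(range(0, snapshot + 1))
    return set(range(snapshot, chain_length))

def blocked_by_archiving(archived, chain_length, direction):
    """Snapshots that cannot be restored without rehydrating an archived blob."""
    return [s for s in range(chain_length)
            if blobs_needed(s, chain_length, direction) & archived]

# Archive only the oldest blob of a five-snapshot chain.
print(blocked_by_archiving({0}, 5, "forward"))  # [0, 1, 2, 3, 4]
print(blocked_by_archiving({0}, 5, "reverse"))  # [0]
```

In this model, archiving the oldest blob of a forward chain blocks every snapshot, while after reversal only the oldest snapshot is affected and the recent ones stay immediately restorable.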
Chain reversal process to remove dependencies in the Azure Archive tier
As achieving optimal storage costs of archiving data in Azure is the primary goal of using Smart Tiering, Rubrik has built-in intelligence that compares the cost of the default tier (Hot or Cool) with the cost of both the Archive tier and the compute cost to run the chain dependency reversal process in Bolt. If Rubrik determines that the cost would be lower to have the snapshot remain in the default tier versus the storage and compute costs to move it to the Archive tier, it will not tier the snapshot.
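The comparison described above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name, parameters, and all prices are hypothetical, and the real cost model likely considers more inputs than the three shown here.

```python
def should_tier(gb, months_remaining, default_price, archive_price,
                reversal_compute_cost):
    """Tier only if Archive storage plus chain reversal beats staying put.

    Prices are in USD per GB per month; reversal_compute_cost is the
    one-time compute charge for reversing the chain.
    """
    cost_to_stay = gb * months_remaining * default_price
    cost_to_move = gb * months_remaining * archive_price + reversal_compute_cost
    return cost_to_move < cost_to_stay

# A large, long-lived snapshot: the compute cost is easily recovered.
print(should_tier(500, 18, 0.020, 0.001, reversal_compute_cost=25.0))  # True
# A small, short-lived snapshot: the compute cost outweighs the savings.
print(should_tier(10, 2, 0.020, 0.001, reversal_compute_cost=25.0))    # False
```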
Rehydration
Rubrik advises that you configure Smart Tiering for workloads that have long-term retention or a mandated requirement to be pushed to the Archive tier in the cloud. Rehydration of data from the Archive tier can take up to 15 hours per Microsoft documentation[2], and any data retrieved before the 180-day minimum duration has expired will incur a prorated cost penalty. For the same reason, it is prudent to keep infrequently accessed data in the Cool tier for at least its 30-day minimum period.
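As a rough illustration of the prorated penalty, the sketch below charges for the unused portion of the tier's minimum period. The price is a placeholder assumption, the function name is hypothetical, and the daily rate is approximated as the monthly rate divided by 30. Consult the Microsoft documentation[2] for the actual billing rules.

```python
def early_deletion_fee(gb, days_in_tier, price_per_gb_month, min_days=180):
    """Approximate the prorated charge for leaving a tier before min_days."""
    # Days of the minimum period that were never used; zero once it is met.
    remaining_days = max(0, min_days - days_in_tier)
    return gb * price_per_gb_month * remaining_days / 30

# 1,000 GB pulled out of Archive after 45 days: 135 days are still owed.
fee_archive = early_deletion_fee(1000, 45, 0.001)
# The same data left in Archive for the full 180 days owes nothing.
fee_none = early_deletion_fee(1000, 180, 0.001)
# Cool tier uses a 30-day minimum instead.
fee_cool = early_deletion_fee(1000, 10, 0.010, min_days=30)

print(f"{fee_archive:.2f} {fee_none:.2f} {fee_cool:.2f}")
```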
To enable Smart Tiering in a Rubrik SLA Domain, select this option and enter the desired duration period. Once committed to the SLA Domain policy, all protected objects assigned to that policy are automatically enabled for Smart Tiering. Rubrik evaluates them every 6 hours to determine the age of their snapshots and, for those eligible for tiering, executes lifecycle management. When it’s time to rehydrate the data by restoring or downloading snapshots, Rubrik initiates a task that traverses the blob chain and rehydrates all of its blobs in parallel. A further advantage is that only the snapshot patch files are moved to Archive, whereas the metadata is kept in the Hot tier, allowing ready access for indexing and fingerprinting.
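The parallel rehydration of a blob chain can be sketched with a thread pool. Everything here is hypothetical: `rehydrate_blob` is a stand-in that performs no storage call, and the blob names and tier map are invented for illustration. The sketch only shows the pattern of skipping blobs that are already in Hot (such as metadata) and issuing the remaining requests concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

def rehydrate_blob(blob_name):
    """Stand-in for a per-blob rehydrate request; returns the blob name."""
    # A real implementation would ask the storage service to change the
    # blob's tier and then wait for rehydration to finish.
    return blob_name

def rehydrate_chain(chain, tiers, max_workers=8):
    """Rehydrate every archived blob in the chain concurrently."""
    # Metadata and any blob already in Hot need no work.
    archived = [b for b in chain if tiers.get(b) == "archive"]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        done = list(pool.map(rehydrate_blob, archived))
    for blob in done:
        tiers[blob] = "hot"
    return done

chain = ["metadata", "patch-0", "patch-1", "patch-2"]
tiers = {"metadata": "hot", "patch-0": "archive",
         "patch-1": "archive", "patch-2": "hot"}
rehydrated = rehydrate_chain(chain, tiers)
print(rehydrated)  # ['patch-0', 'patch-1']
```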
Conclusion
The intelligence and logic built into Rubrik’s Instant Tiering and Smart Tiering technologies help solve common problems and provide cost savings in the public cloud. Smart Tiering is the best option for a complete automated data lifecycle solution, and Instant Tiering gets a snapshot to the Archive tier right away for long-term retention. Rubrik makes moving and tiering snapshots to Azure simple and efficient while also giving you more control over your cloud storage consumption and costs. That is smart data management.
[1] Microsoft Azure Storage blog 8-5-2019
[2] Archive Blob Storage: Hot, Cool, and Archive Access Tiers