In the cloud computing era, organizations face new challenges in data protection and cost management. The way that traditional backup solutions archive cloud data often leads to escalating storage costs and inefficiencies. These solutions can require multiple full backups to maintain acceptable restore performance, resulting in significant storage and operational overhead. Additionally, the linear nature of traditional snapshot chains creates problematic dependencies between backups, often preventing the deletion of expired snapshots and leading to unnecessary data retention and inflated storage costs.

Rubrik Optimized Snapshot Retention addresses these challenges head-on, redefining cloud data archive and retention. This innovative feature introduces a more efficient, hierarchical approach tailored for the cloud age. Rubrik customers have used the solution to achieve lower total cost of ownership (TCO), improve storage efficiency, and optimize data protection performance in cloud environments. Here’s how.

What’s wrong with the status quo?

Why should you even care about how backup vendors are writing data to disk? It boils down to two main problems: storing multiple full backups and linear backup chain dependencies. 

The burden of multiple full backups

While most backup solutions do provide an incremental forever approach, they often break this promise in order to maintain restore performance and chain reliability. After all, a key component of any data protection strategy is the recovery time objective (RTO), and restore performance sits at the heart of that. For instance, let’s take a look at the following diagram, which depicts a common incremental forever, linear snapshot retention strategy employed by many backup vendors.

 

[Figure: Long Chain]

 


As we can see, we have one full backup (D1) and multiple incremental backups (D2 to D90). While this incremental forever approach is optimal for keeping an organization's storage costs low, performance problems can emerge when the time comes to restore, due to the sheer amount of consolidation required: restoring the latest point in time can mean reading the full backup and then applying every incremental that follows it.
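To make the restore cost concrete, here is a minimal Python sketch of replaying a linear chain. It assumes a deliberately simplified model in which each backup is a mapping from block number to block contents; the `restore` function and this block-map representation are illustrative assumptions, not any vendor's actual on-disk format.

```python
def restore(full, incrementals):
    """Rebuild a point-in-time image by replaying incrementals over a full.

    `full` and each incremental map block number -> block contents
    (a hypothetical, simplified model of a backup).
    """
    image = dict(full)          # start from a copy of the full backup (D1)
    for delta in incrementals:  # replay D2, D3, ... in order
        image.update(delta)     # newer blocks overwrite older ones
    return image


# D1 is the full; D2 and D3 each change or add one block.
d1 = {0: b"boot", 1: b"data-v1"}
d2 = {1: b"data-v2"}
d3 = {2: b"logs"}

image_at_d3 = restore(d1, [d2, d3])
print(image_at_d3)
```

In this model, restoring D90 means replaying 89 deltas on top of D1, which is where the restore-time cost of a long chain comes from.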

To help with this, traditional vendors will sometimes consolidate ahead of time, merging various point-in-time backups together. This helps with restore performance and hints at a way to actually purge expired backups from disk. Inside a data center this process makes sense, as we often have a surplus of compute power. But consolidating or merging incremental and full backups is an expensive, lengthy, and resource-intensive process. Simply lifting a traditional process like this into the cloud is a big mistake: in the cloud, compute costs money, so this would only increase overall TCO.

To combat this, many backup vendors take periodic full backups to keep snapshot chains at a manageable length and avoid the need to consolidate.
 

 

[Figure: Periodic]


We can see from the diagram that we still have a full backup (D1) to begin with, followed by incrementals (D2-D60). However, we break the chain by creating new full backups (D61 and D121), each with its own incremental backups. This approach does improve restore performance and eliminates the need to consolidate. But now we are storing multiple full backups in cloud storage, which drives costs up, and we are spending additional time and resources to take those full backups. The result: higher TCO.
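The trade-off can be put into rough numbers. The Python sketch below compares a single-full chain with a periodic-full chain; the 180-day window and the sizes (1000 GB per full, 10 GB per incremental) are made-up values chosen only for illustration, and the helper names are hypothetical.

```python
def backup_plan(total_days, full_every):
    """Return 'full' or 'incr' for each day, with a new full every `full_every` days."""
    return ["full" if (day - 1) % full_every == 0 else "incr"
            for day in range(1, total_days + 1)]


def storage_gb(plan, full_gb, incr_gb):
    """Total storage consumed if every backup in the plan is retained."""
    return sum(full_gb if kind == "full" else incr_gb for kind in plan)


def longest_chain(plan):
    """Largest number of backups that a single restore may have to read."""
    longest = current = 0
    for kind in plan:
        current = 1 if kind == "full" else current + 1
        longest = max(longest, current)
    return longest


# Assumed sizes: 1000 GB per full, 10 GB per incremental, 180 days retained.
single_full = backup_plan(180, full_every=180)  # full on D1 only
periodic = backup_plan(180, full_every=60)      # fulls on D1, D61, D121

print(storage_gb(single_full, 1000, 10), longest_chain(single_full))  # 2790 180
print(storage_gb(periodic, 1000, 10), longest_chain(periodic))        # 4770 60
```

Under these assumed numbers, periodic fulls cut the worst-case chain from 180 backups to 60 but raise stored data from 2790 GB to 4770 GB. Real ratios depend entirely on the change rate of the workload.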

Inability to expire backups

Another significant challenge in traditional backup systems is the rigid dependency chain created by linear snapshot sequences. In these systems, each incremental backup relies on the backups before it in the chain, creating a complex web of dependencies. As shown below, this interdependency becomes problematic when you want to expire and delete older backups. Even when specific point-in-time snapshots (D2-D7) have exceeded their intended retention period of 7 days, they often cannot be safely removed from the system. The reason? Deleting these snapshots breaks the chain that later backups depend on, potentially complicating their restores.
 

[Figure: Pinned Snapshots]


This limitation forces organizations to pay for storage of outdated, unnecessary data. Storage costs continually escalate as data management becomes increasingly complex. It also complicates compliance with data retention policies and forces systems to process larger volumes of redundant data. 
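The pinning effect can be shown with a small dependency walk. The following Python sketch assumes each snapshot records the single snapshot it is based on; the `deletable` helper and the 14-day chain length are hypothetical and only serve the illustration.

```python
def deletable(parents, expired):
    """Expired snapshots that no retained snapshot needs for a restore.

    `parents` maps each snapshot to the snapshot it is based on
    (None for a full backup).
    """
    needed = set()
    for snap in parents:
        if snap in expired:
            continue
        node = snap
        # Walk up the chain; stop early once the rest is already marked.
        while node is not None and node not in needed:
            needed.add(node)
            node = parents[node]
    return expired - needed


# A 14-day linear chain: D1 <- D2 <- ... <- D14 (length is an assumption).
linear = {"D1": None}
for day in range(2, 15):
    linear[f"D{day}"] = f"D{day - 1}"

expired = {f"D{day}" for day in range(2, 8)}  # D2-D7 are past retention
print(deletable(linear, expired))             # set(): every one is pinned
```

Because D8 is based on D7, which is based on D6, and so on, every expired snapshot is still on the restore path of a retained one, so nothing can be removed.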

Optimized Snapshot Retention: Balancing Cost and Performance

Rubrik Optimized Snapshot Retention reimagines the incremental forever approach, allowing for more flexible and efficient data retention and deletion practices. Let’s take a look using the example below, which adheres to an SLA domain with the following constructs:

  • Back up once per day, retain for 7 days

  • Back up once per week, retain for 4 weeks

  • Back up once per month, retain for 1 year
     

[Figure: Way1]


As shown above, Rubrik’s Optimized Snapshot Retention begins with a full backup (D1), like other approaches. Subsequent backups are always incremental, using the previous backup as the base for incremental changes (D2-D7). When the time comes to initiate a new retention frequency (in this case, weekly), a new incremental chain (W1) is built off of the base full backup (D1). Also, since D8 and W1 both fall on the same date, Rubrik adheres to the highest frequency and only requires a single backup to satisfy both frequencies. As daily incrementals continue (D9-D14), they are based on that new weekly chain (W1). This keeps the chain length down to a very manageable level and allows for the expiry and deletion of the older snapshots (D2-D7), as the weekly chain has no dependency on them.
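The dependencies described above can be written down as parent links and checked directly. This Python sketch is only a model of the description in this post, not Rubrik's implementation; the `restore_chain` helper and the dictionary layout are assumptions.

```python
def restore_chain(parents, snap):
    """Snapshots read to restore `snap`, from `snap` back to the full."""
    chain = []
    while snap is not None:
        chain.append(snap)
        snap = parents[snap]
    return chain


# Parent links as described: D2-D7 chain off D1, W1 (taken on day 8)
# is an incremental off D1, and D9-D14 chain off W1.
parents = {"D1": None, "D2": "D1", "W1": "D1", "D9": "W1"}
for day in range(3, 8):
    parents[f"D{day}"] = f"D{day - 1}"
for day in range(10, 15):
    parents[f"D{day}"] = f"D{day - 1}"

print(restore_chain(parents, "D14"))
# ['D14', 'D13', 'D12', 'D11', 'D10', 'D9', 'W1', 'D1']

retained = ["D1", "W1"] + [f"D{day}" for day in range(9, 15)]
needed = {s for snap in retained for s in restore_chain(parents, snap)}
expired = {f"D{day}" for day in range(2, 8)}
print(expired & needed)  # set(): nothing retained depends on D2-D7
```

Restoring D14 touches eight snapshots rather than fourteen, and none of them is in the expired D2-D7 range, which is why those snapshots can be deleted.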
 

[Figure: Way2]


Now, new chains are constructed using incremental backups based on D1 as new frequencies are encountered. As shown, we can see the W2, W3, and W4 chains, along with their subsequent daily backups being created and expired. Once we hit our monthly frequency, again, a new chain is created (M1), which is now the base for future weekly chains. 
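A rough way to see the effect over several weeks is to compare worst-case restore chains. The Python sketch below covers only the daily and weekly tiers; it assumes weekly snapshots land on days 8, 15, and 22 and each branch off D1, labels snapshots by day number for simplicity, and leaves the monthly tier out. All function names are hypothetical.

```python
def hierarchical_parents(total_days, weekly_every=7):
    """Parent links where each weekly snapshot branches off the day-1 full.

    Snapshots are labeled by day number; day 8 corresponds to W1 in the
    text. Weekly placement on days 8, 15, 22, ... is an assumption.
    """
    parents = {1: None}
    for day in range(2, total_days + 1):
        is_weekly = (day - 1) % weekly_every == 0
        parents[day] = 1 if is_weekly else day - 1
    return parents


def chain_length(parents, day):
    """Number of snapshots read to restore `day`."""
    length = 0
    while day is not None:
        length += 1
        day = parents[day]
    return length


linear = {day: (day - 1 if day > 1 else None) for day in range(1, 29)}
tree = hierarchical_parents(28)

print(max(chain_length(linear, d) for d in linear))  # 28
print(max(chain_length(tree, d) for d in tree))      # 8
```

In this simplified model, the linear chain grows by one snapshot per day, while the hierarchical layout never needs more than eight snapshots for a restore within the first four weeks.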

Rubrik continuously monitors chain length and change rate and may periodically branch new hierarchies off of the initial full in order to maintain both performance and retention. This hierarchical, tree-based retention strategy is continued throughout the entire lifecycle of the data, providing both fast restore performance (due to manageable chain lengths) and the ability to purge expired data (due to the dependency-break of the frequency chains). The result: less need for cloud storage and lower cloud storage bills.

Optimized Snapshot Retention: A Lower Cost, Hierarchical Approach to Cloud Backup Storage

Rubrik's Optimized Snapshot Retention represents a significant leap forward in data protection technology. By reimagining the traditional snapshot chain into a more efficient hierarchical structure, Rubrik has addressed key pain points in cloud data backup and retention strategies.

The benefits of this innovative approach are clear: substantial reductions in storage costs, improved storage efficiency, and enhanced data restoration performance. With the potential to save 20-30% on storage costs and significantly accelerate backup and recovery operations, Rubrik's solution offers a compelling value proposition for businesses of all sizes.

The journey towards more efficient, cost-effective data protection is ongoing, and Rubrik Optimized Snapshot Retention is undoubtedly a significant milestone on this path. As we look to the future, it's clear that innovations like these will continue to drive the evolution of data protection strategies, helping businesses safeguard their most valuable asset—their data—more effectively than ever before.