Rubrik allows customers to protect their workloads like VMs, Disks, and SQL instances running on Azure. We have customers who protect a large number of Azure subscriptions through our SaaS product. We offer features like File-Level Recovery to allow customers to make faster recoveries and Storage Tiering to save on storage costs associated with the backups. To support these features, we run compute in the customer’s environment to read the data from Azure Disk snapshots. In our first version, we required customers to set up networking configurations for running Rubrik compute in each protected subscription and region.

Rubrik now lets customers protect hundreds of Azure subscriptions with minimal setup. We built a FUSE filesystem that uses Azure REST APIs to read the snapshot data. This allowed us to use Rubrik’s compute instances running in one subscription for workloads across multiple subscriptions. With this simplified flow, customers can complete setup quickly.

Problem & Ideation

For reading data from a snapshot, a typical mechanism is to launch a disk from the snapshot. The launched disk can be attached to a virtual machine which can be used to read the data. One of the limitations of this approach is that a disk can only be attached to a virtual machine running in the same subscription and region. This led to the following problems:

  • Customers need to make the required network configuration changes to run Rubrik Exocompute in each subscription and region. Some customers prefer not to make these changes in subscriptions running their critical workloads.

  • It does not scale well when a customer has over 100 subscriptions. Making the networking changes in each of these subscriptions and regions is tedious and time-consuming.

We brainstormed multiple ideas to simplify this. One of the promising ideas was to deploy the compute resources into one subscription and use them for workloads across all subscriptions protected by Rubrik. This would simplify the setup for the customers. They would also be able to set up a separate subscription dedicated to running Rubrik Exocompute without making any changes to the subscriptions running their critical workloads. To achieve this, we needed a way to read the data of a snapshot in one subscription from Exocompute running in a different subscription.

Reading the snapshot data through APIs

Azure exposes REST APIs to read a snapshot via a Shared Access Signature (SAS) URL. We could use this API to read the snapshot data from a compute instance running in a different subscription. This eliminates the need to launch a disk and attach it to the compute instance, which was the fundamental reason for running compute in the same subscription.
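
To make the idea concrete, here is a minimal sketch of reading a byte range from a snapshot over HTTP. It assumes the SAS URL has already been obtained (that step is not shown) and that it accepts standard HTTP byte-range GET requests, as Azure Blob Storage does. The function names are ours for illustration and are not part of any Azure SDK.

```python
import urllib.request


def range_header(offset: int, length: int) -> str:
    """Return an HTTP Range value covering bytes [offset, offset + length)."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return f"bytes={offset}-{offset + length - 1}"


def build_range_request(sas_url: str, offset: int, length: int) -> urllib.request.Request:
    """Build (but do not send) a GET request for one byte range of the snapshot."""
    request = urllib.request.Request(sas_url, method="GET")
    request.add_header("Range", range_header(offset, length))
    return request


def read_snapshot_range(sas_url: str, offset: int, length: int, timeout: float = 30.0) -> bytes:
    """Send the range request and return the bytes. Needs network access and a valid SAS URL."""
    request = build_range_request(sas_url, offset, length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Because the SAS URL carries its own authorization, the caller does not need to live in the subscription that owns the snapshot. A production reader would also need retries and handling for SAS expiry, which this sketch leaves out.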

To understand how we can use these APIs to read a file from the snapshot, let's first look at how we read a file from a disk attached to the compute instance. We first need to detect and mount the filesystem(s) present in the snapshot. Subsequently, we can make system calls to open and read a file. In Unix-like systems, the filesystem first fetches the file inode from the inode table. The inode contains the metadata about the file and the data blocks where file contents are stored. The filesystem will then read the appropriate data block(s) to serve the read request. 
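
As a small illustration of the system calls involved, the sketch below opens a file on an already mounted filesystem, reads a slice of it at a given offset, and looks up the metadata that the filesystem keeps in the inode. It assumes a Unix-like system; the helper names are ours for illustration.

```python
import os


def read_at(path: str, offset: int, size: int) -> bytes:
    """Open a file and read up to `size` bytes starting at `offset`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # pread reads at an explicit offset without moving the file position.
        return os.pread(fd, size, offset)
    finally:
        os.close(fd)


def inode_summary(path: str) -> dict:
    """Return a few fields of the metadata the filesystem stores for the file."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,          # inode number of the file
        "size": st.st_size,          # logical size in bytes
        "blocks_512": st.st_blocks,  # allocated space in 512-byte units
    }
```

The application only names a path and an offset within the file. Which device offsets are read to find the inode and the data blocks is decided by the filesystem driver.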

The Azure REST APIs allow reading the snapshot data at any offset. However, as we have seen above, reading a file translates to reads at multiple offsets: some to fetch the inode and others to fetch the actual file content. Rubrik supports multiple filesystems, and we didn’t want to reverse-engineer each filesystem to determine which offsets to read. That’s where we leveraged a FUSE filesystem to create virtual devices.

Using FUSE to create virtual devices

Without an actual disk launched from the snapshot, we use the following mechanism to mount and read from the filesystem:

  • Create a FUSE filesystem that exposes the snapshot data as a VHD file. Any read requests coming to this file are served by making REST API calls to read the data from the snapshot at an appropriate offset.

  • Create a loop device to expose a block device backed by this VHD file. A read call to read data from the loop device translates to a read call from the VHD file.

  • Use the loop device to detect and mount the filesystem. The complexity of making the API calls to read the data is abstracted behind the loop device. To an application that mounts the filesystem and reads a file from it, the loop device looks like any other device attached to the instance.
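
The first step above can be sketched as follows. This is not our production code: it only shows how a read at an arbitrary offset and size, as a FUSE read handler would receive it, can be served from fixed-size ranges fetched over the API. The class name, the chunk size, and the unbounded in-memory cache are assumptions made for illustration. The `fetch` callable stands in for the REST API call.

```python
from typing import Callable, Dict


class ChunkedReader:
    """Serve arbitrary (offset, size) reads from aligned chunks fetched on demand."""

    def __init__(self, fetch: Callable[[int, int], bytes], size: int,
                 chunk_size: int = 4 * 1024 * 1024) -> None:
        self._fetch = fetch            # fetch(offset, length) -> bytes, e.g. a REST range read
        self._size = size              # total size of the exposed file in bytes
        self._chunk_size = chunk_size  # assumed value, not taken from the article
        self._cache: Dict[int, bytes] = {}  # unbounded here; a real cache would evict

    def _chunk(self, index: int) -> bytes:
        """Return chunk `index`, fetching it on first use."""
        if index not in self._cache:
            start = index * self._chunk_size
            length = min(self._chunk_size, self._size - start)
            self._cache[index] = self._fetch(start, length)
        return self._cache[index]

    def read(self, offset: int, size: int) -> bytes:
        """Return up to `size` bytes at `offset`, truncated at end of file."""
        if offset < 0 or offset >= self._size or size <= 0:
            return b""
        end = min(offset + size, self._size)
        out = bytearray()
        pos = offset
        while pos < end:
            index, within = divmod(pos, self._chunk_size)
            take = min(self._chunk_size - within, end - pos)
            out += self._chunk(index)[within:within + take]
            pos += take
        return bytes(out)
```

In a real FUSE filesystem, the read handler for the VHD file would delegate to something like `ChunkedReader.read`. Fetching aligned chunks keeps the number of API calls low when the kernel issues many small reads against nearby offsets.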

Using FUSE ensured that we do not need to change our application logic for reading from these snapshots and we could handle all the different filesystems. The same setup can be extended for a VM backup consisting of snapshots of multiple disks.
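
The loop device and mount steps, including the multi-disk case, could look roughly like the sketch below. It only builds the `losetup` and `mount` command lines rather than running them, since they require root privileges. The VHD paths and mount points in the usage note are hypothetical.

```python
import shlex
from typing import List


def attach_commands(vhd_paths: List[str]) -> List[List[str]]:
    """One read-only loop device per VHD file exposed by the FUSE filesystem."""
    return [
        # --find picks a free loop device, --show prints its name,
        # --partscan asks the kernel to scan the partition table.
        ["losetup", "--find", "--show", "--read-only", "--partscan", path]
        for path in vhd_paths
    ]


def mount_command(device: str, mount_point: str) -> List[str]:
    """Mount a detected filesystem read-only."""
    return ["mount", "-o", "ro", device, mount_point]


def render(command: List[str]) -> str:
    """Render a command list as a shell-safe string for logging."""
    return shlex.join(command)
```

For a VM backup with two disks, `attach_commands(["/mnt/snapfs/disk0.vhd", "/mnt/snapfs/disk1.vhd"])` yields two `losetup` invocations. Each device name printed by `losetup` can then be passed to `mount_command` once the filesystem on it has been detected.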
 


Scalable, Faster, and Cheaper

We get the following benefits from using the APIs to read the snapshot data and sharing the compute for workloads across multiple subscriptions:

  • Simplifies the setup for our customers, so protection scales to a large number of subscriptions.

  • Avoids the need to launch a disk from each snapshot, which saves the time it takes to launch the disk as well as the costs associated with it.

  • Shared compute instances allow better utilization of resources. We use an auto-scaling Kubernetes cluster as our compute instance, so sending more work to a single cluster is not a problem.

Conclusion

Simplicity is at the core of Rubrik Security Cloud. This solution shares compute across subscriptions, allowing our customers to protect hundreds of their Azure subscriptions in one go. 

At Rubrik, engineers are always encouraged to come up with their own ideas for further enhancing our industry-leading Rubrik Security Cloud platform. This was one such example where our engineering team created a tangible solution to further simplify and solve critical pain points for our customers.

Learn more about how you can make a real impact on our customers by visiting our career site and learning more about Rubrik’s culture, values, and opportunities to develop your career.