Tagged in

app consistency

Product

Have Your Cake and Eat It Too – Object-Level Recovery

In the last few weeks, Kenny covered how we’ve designed Rubrik to eliminate the effects of VMware application stun, and JB covered why we built our own VSS Provider to take application-consistent snapshots. In this post, I’ll wrap up the app consistency series by discussing our object-level recovery and search capabilities. Maintaining application and data consistency in data protection requires a “three-layer cake” design. The base layer is our Converged Data Management platform, the next layer is the application tier, and the frosting and sprinkles are the objects/messages/files that are embedded inside the application and database. Our platform features a Distributed Task Framework, an engine that assigns and executes tasks across the cluster in a fault-tolerant and efficient manner. It ensures all tasks are load balanced across the cluster and distributed to the nodes that contain the relevant data. This allows us to handle the ingestion of data across multiple nodes in parallel. Adding more…
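
To make the idea of load-balanced, data-local task placement concrete, here is a minimal sketch in Python. It is not Rubrik's Distributed Task Framework; the function name `assign_tasks`, the node names, and the "least-loaded node that holds the data" rule are all assumptions chosen for illustration.

```python
def assign_tasks(tasks, nodes):
    """Assign each task to a node, preferring nodes that hold its data.

    tasks: dict mapping a task id to the set of nodes holding the relevant data
           (hypothetical input shape, assumed for this sketch).
    nodes: list of node names in the cluster.
    Returns a dict mapping task id to the chosen node.
    """
    load = {node: 0 for node in nodes}  # number of tasks placed on each node
    assignment = {}
    for task_id, holders in tasks.items():
        # Prefer nodes that already contain the data; fall back to any node.
        candidates = [n for n in nodes if n in holders] or list(nodes)
        # Among the candidates, pick the one with the fewest tasks so far.
        chosen = min(candidates, key=lambda n: load[n])
        load[chosen] += 1
        assignment[task_id] = chosen
    return assignment


if __name__ == "__main__":
    cluster = ["node-1", "node-2", "node-3"]
    work = {
        "ingest-a": {"node-1", "node-2"},
        "ingest-b": {"node-1", "node-2"},
        "ingest-c": {"node-3"},
    }
    print(assign_tasks(work, cluster))
```

In this toy version, two tasks whose data lives on the same pair of nodes end up on different nodes, which is the parallel-ingest effect the excerpt describes. A real scheduler would also have to handle node failures and retries, which this sketch leaves out.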

Architecture

Why We Built Our Own VSS Provider

In last week’s post, Kenny explained how we designed Rubrik to eliminate the effects of VMware application stun. We couple flash with a distributed architecture to deliver faster ingest that linearly scales with cluster growth. We reduce the number of data hops by collapsing discrete backup hardware/software into a single software fabric. We tightly manage the number of operations hitting the ESXi hosts to speed up consolidation. Our own VSS Provider also contributes to this effort. In this week’s post, Part 2 of our App Consistency series, I’ll explain why we built our own VSS agent and how we take app-consistent snapshots. Maintaining application and data consistency is industry standard practice for any backup solution worth its salt. To back up transactional applications installed on a Windows server (SQL, Exchange, Oracle), we utilize Microsoft’s native Volume Shadow Copy Service (VSS). Taking an application-consistent snapshot not only captures all of the VM’s data at the same time, but also waits for the VM to flush in-flight I/O operations and transactions.

We Hate Bad Days and Sleepless Nights

Failed backup jobs are a leading cause of bad days and sleepless nights, which is why we took extra care to mitigate risk factors when protecting…

Architecture

Reducing the Impact of Application Stun

Application stunning during the snapshot process is a topic that often bubbles up in customer conversations on data protection for VMware environments. To level set, application stun goes hand-in-hand with any snapshot operation. VMware stuns (quiesces) the virtual machine (VM) when the snapshot is created and deleted. Cormac Hogan has a great post on this here. Producing a snapshot of a VM disk file requires the VM to be stunned, a snapshot of the VM disk file to be ingested, and deltas to be consolidated into the base disk. If you’re snapping a highly transactional application, like a database, nasty side effects appear in the form of lengthy backup windows and application time-outs when the “stun-ingest-consolidate” workflow is not efficiently managed. When a snapshot of the base VMDK is created, VMware will create a delta VMDK. Write operations are redirected to the delta VMDK, which expands over time for an active VM. Once the backup completes, the delta VMDK needs to be consolidated with the base VMDK. Longer backup windows lead to bigger delta files, resulting in a longer consolidation process. If the rate of I/O operations exceeds the rate of consolidation, you’ll end up with application time-outs. Rubrik was designed…
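
The relationship between backup window, delta size, and consolidation time can be illustrated with a back-of-the-envelope model. The sketch below is a deliberate simplification, not VMware's actual consolidation algorithm: it assumes constant write and consolidation rates, and the function name and parameters are hypothetical.

```python
import math


def consolidation_seconds(write_mb_s, backup_window_s, consolidate_mb_s):
    """Estimate how long consolidating a delta VMDK takes.

    Assumptions (for illustration only): the VM writes at a constant
    write_mb_s, the delta grows for the whole backup window, and the
    consolidation proceeds at a constant consolidate_mb_s while new
    writes keep arriving at the same rate.
    """
    # Delta accumulated while the backup was running.
    delta_mb = write_mb_s * backup_window_s
    # If writes arrive at least as fast as they are consolidated,
    # the delta never drains in this model.
    if consolidate_mb_s <= write_mb_s:
        return math.inf
    # The delta shrinks at the net rate (consolidation minus new writes).
    return delta_mb / (consolidate_mb_s - write_mb_s)


if __name__ == "__main__":
    for window in (600, 1200):
        secs = consolidation_seconds(10, window, 50)
        print(f"backup window {window}s -> consolidation {secs:.0f}s")
```

Under these assumptions, doubling the backup window doubles the consolidation time, and a write rate that matches or exceeds the consolidation rate never converges, which corresponds to the time-out scenario described above.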