In last week’s post, Kenny explained how we designed Rubrik to eliminate the effects of VMware application stun. We couple flash with a distributed architecture to deliver faster ingest that linearly scales with cluster growth. We reduce the number of data hops by collapsing discrete backup hardware/software into a single software fabric. We tightly manage the number of operations hitting the ESXi hosts to speed up consolidation. Our own VSS Provider also contributes to this effort.

In this week’s post, Part 2 of our App Consistency series, I’ll explain why we built our own VSS agent and how we take app-consistent snapshots. Maintaining application and data consistency is industry-standard practice for any backup solution worth its salt. To back up transactional applications installed on a Windows server (SQL Server, Exchange, Oracle), we use Microsoft’s native Volume Shadow Copy Service (VSS). An application-consistent snapshot not only captures all of the VM’s data at the same point in time, but is also taken only after the VM has flushed its in-flight I/O operations and transactions.

We Hate Bad Days and Sleepless Nights

Failed backup jobs are a leading cause of bad days and sleepless nights, which is why we took extra care to mitigate risk factors when protecting mission-critical apps. To minimize errors like guest crashes or orphaned snapshots, we opted to build our own VSS agent that acts as both requester and provider during the VSS coordination process. Our VSS agent provides:

  • Hands-free management – Once we detect a Windows VM, we auto-deploy our VSS agent via VMware Tools. There’s no manual install or uninstall. Think of the cleaners you trust with the keys to your place: they let themselves in, clean, and then let themselves out.
  • Well-timed snapshots – Since our VSS agent acts as a VSS provider, we know the right moment to trigger a consistent snapshot. Our VSS provider tracks when applications are quiesced and triggers the consistent snapshot within Microsoft’s hardcoded 10-second timing window.
  • Copy-only backups for SQL – We automatically take copy-only backups of SQL Server. A copy-only backup is a snapshot of SQL Server at that point in time, independent of any backup sequence you may already have, so it does not break the existing SQL Server backup chain.
  • App-consistent log truncation for Exchange – We automatically check whether the logs have been backed up and, if so, truncate them up to the point at which the backup began. The user doesn’t need to turn on any specific settings to get app-consistent log truncation. Log truncation keeps the VM from running out of disk space and eventually crashing.
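
To make the copy-only point concrete, here is a minimal Python sketch of why a copy-only backup leaves an existing backup chain alone. It is a toy model, not SQL Server's API and not our agent's code: `ToyDatabase`, `full_backup`, `differential_backup`, and the backup labels are all hypothetical names invented for illustration. The one fact it leans on is that a regular full backup becomes the base for later differential backups, while a copy-only backup does not.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyDatabase:
    """Toy model of a database's backup history (hypothetical, not a real SQL Server API)."""

    differential_base: Optional[str] = None
    history: List[str] = field(default_factory=list)

    def full_backup(self, label: str, copy_only: bool = False) -> None:
        self.history.append(label)
        if not copy_only:
            # A regular full backup becomes the new base for later differentials.
            self.differential_base = label
        # A copy-only backup is recorded but leaves the base untouched.

    def differential_backup(self, label: str) -> str:
        """Record a differential and return the full backup it depends on."""
        if self.differential_base is None:
            raise RuntimeError("no full backup to base a differential on")
        self.history.append(label)
        return self.differential_base


db = ToyDatabase()
db.full_backup("dba-full-sunday")                      # the DBA's own schedule
db.full_backup("vm-snapshot-monday", copy_only=True)   # backup taken alongside it
base = db.differential_backup("dba-diff-monday")
print(base)  # dba-full-sunday
```

Had the Monday backup been a regular full backup, the DBA's differential would depend on it instead of on Sunday's full backup, and restoring from the DBA's own chain would no longer work as planned.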

How Our VSS Service Works

Our VSS requester and the applications’ VSS writers coordinate to provide a stable system image from which to back up data. Once the requester has queried the writers for information about the files to be backed up, the Windows VSS service quiesces all writers and freezes I/O operations. Our VSS provider then instructs the ESXi host to take a VMware snapshot. Following the backup, the Windows VSS service resumes normal operations and confirms with our VSS provider that the VMware snapshot was successfully taken.
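
The sequence above can be sketched in a few lines of Python. This is an illustrative simulation only, not the Windows VSS API or our agent's implementation: `ToyWriter`, `app_consistent_snapshot`, `take_snapshot`, and `FREEZE_WINDOW_SECONDS` are hypothetical names, and the 10-second window is modeled as a simple elapsed-time check.

```python
import time
from typing import Callable, Iterable, List

# Window cited in the post; treated here as a plain constant (assumption).
FREEZE_WINDOW_SECONDS = 10.0


class ToyWriter:
    """Stand-in for an application's VSS writer (hypothetical, not the Windows API)."""

    def __init__(self, name: str, files: List[str]) -> None:
        self.name = name
        self.files = files
        self.frozen = False

    def freeze(self) -> None:
        self.frozen = True   # application flushes and holds its writes

    def thaw(self) -> None:
        self.frozen = False  # application resumes normal I/O


def app_consistent_snapshot(
    writers: Iterable[ToyWriter],
    take_snapshot: Callable[[], str],
    clock: Callable[[], float] = time.monotonic,
) -> str:
    writers = list(writers)
    # 1. The requester asks each writer which files belong in the backup.
    files = [f for w in writers for f in w.files]
    if not files:
        raise ValueError("no writer reported any files to back up")
    # 2. Writers are quiesced and I/O is frozen.
    started = clock()
    for w in writers:
        w.freeze()
    try:
        # 3. The provider requests the hypervisor snapshot while I/O is frozen.
        snapshot_id = take_snapshot()
        elapsed = clock() - started
    finally:
        # 4. Writers always resume, even if the snapshot call failed.
        for w in writers:
            w.thaw()
    if elapsed > FREEZE_WINDOW_SECONDS:
        raise TimeoutError(f"snapshot took {elapsed:.1f}s, outside the freeze window")
    return snapshot_id
```

Thawing in a `finally` block is a deliberate choice in this sketch: whatever happens to the snapshot request, the applications should not be left frozen.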

To wrap up: we combine industry best practices with automation, hands-free management, and efficiency to deliver app-consistent snapshots for your mission-critical apps. For more information, check out our App-Consistent Snapshot brief here.