Intelligent Data Protection: Revisiting Cerebro

Making any sufficiently complex system look and feel simple is a tall order. But that’s precisely what Cerebro does for Rubrik Cloud Data Management! As the “brains” of the stack, Cerebro acts as the autonomous conductor standing on a podium before thousands of critical systems, all eager to be protected or restored as part of the data lifecycle management symphony. Founding Engineer Fabiano Botelho introduced Cerebro in a blog post over two years ago. The time is nigh to dig deeper.

Cerebro frees workload data from the storage tier by unlocking mobility beyond the data center: into the cloud and between different clouds. It accommodates many critical functions of the Rubrik CDM stack. Two of those functions are the Distributed Task Framework and the Blob Engine, which together ensure that Rubrik delivers data that is immediately accessible and recoverable.

Distributed Task Framework

The Distributed Task Framework is the engine responsible for globally assigning and executing tasks across a cluster in a fault-tolerant and efficient manner. It has the intelligence to manage resource utilization and load balancing across the cluster in a declarative manner. Rubrik’s Distributed Task…
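The excerpt stops short of showing how such a framework behaves. As a minimal sketch of the general idea, assuming nothing about Rubrik’s actual implementation (all names below are hypothetical), a scheduler might assign each task to the least-loaded node and reassign a failed node’s tasks to the survivors:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Node:
    load: int                        # tasks currently assigned (the sort key)
    name: str = field(compare=False)

class TaskScheduler:
    """Toy scheduler: assigns each task to the least-loaded node and
    reassigns a failed node's tasks elsewhere. Hypothetical sketch only."""

    def __init__(self, node_names):
        self.heap = [Node(0, n) for n in node_names]
        heapq.heapify(self.heap)
        self.assignments = {}        # task_id -> node name

    def assign(self, task_id):
        node = heapq.heappop(self.heap)    # least-loaded node wins
        node.load += 1
        heapq.heappush(self.heap, node)
        self.assignments[task_id] = node.name
        return node.name

    def fail_node(self, name):
        # Drop the failed node, then reassign its orphaned tasks.
        self.heap = [n for n in self.heap if n.name != name]
        heapq.heapify(self.heap)
        orphans = [t for t, n in self.assignments.items() if n == name]
        for task_id in orphans:
            self.assign(task_id)

sched = TaskScheduler(["node-a", "node-b", "node-c"])
for i in range(6):
    sched.assign(f"backup-{i}")
sched.fail_node("node-b")            # survivors pick up node-b's tasks
print(sched.assignments)
```

In this toy version, the declarative flavor is only that callers say which task to run and the scheduler alone decides where; a production framework would add persistence, retries, and awareness of CPU, disk, and network load.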

Meet Cerebro, the Brains Behind Rubrik’s Time Machine

Fabiano Botelho, father of two and star soccer player, explains how Cerebro was designed. Previously, Fabiano was the tech lead of Data Domain’s Garbage Collection team.

Rubrik is a scale-out data management platform that enables users to protect their primary infrastructure. Cerebro is the “brains” of the system, coordinating the movement of customer data from initial ingest through its propagation to other locations, such as cloud storage and remote clusters (for replication). It is also where the data compaction engine (deduplication, compression) sits. In this post, we’ll discuss how Cerebro efficiently stores data with global deduplication and compression while making Instant Recovery & Mount possible.

Cerebro ties our API integration layer, which has adapters to extract data from various data sources (e.g., VMware, Microsoft, Oracle), to our different storage layers (Atlas and cloud providers like Amazon and Google). It achieves this by leveraging a distributed task framework and a distributed metadata system. See AJ’s post on the key components of our system.

Cerebro solves many challenges while managing the data lifecycle, such as efficiently ingesting data at the cluster level, storing data compactly while keeping it readily accessible for instant recovery, and ensuring data integrity at all times. This is what…
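The excerpt describes the compaction engine only at a high level. As a simplified sketch of the general dedup-then-compress technique, assuming fixed-size chunking and hypothetical names rather than Rubrik’s actual Blob Engine:

```python
import hashlib
import zlib

class BlobStore:
    """Toy content-addressed store: fixed-size chunking, SHA-256
    fingerprints for dedup, zlib compression for stored chunks.
    Illustrative sketch only, not Rubrik's engine."""

    CHUNK_SIZE = 4096

    def __init__(self):
        self.chunks = {}             # fingerprint -> compressed bytes

    def ingest(self, data: bytes) -> list[str]:
        """Split data into chunks; store each unique chunk exactly once."""
        recipe = []
        for off in range(0, len(data), self.CHUNK_SIZE):
            chunk = data[off:off + self.CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            if fp not in self.chunks:            # dedup: skip known chunks
                self.chunks[fp] = zlib.compress(chunk)
            recipe.append(fp)
        return recipe                            # enough to rebuild the blob

    def restore(self, recipe: list[str]) -> bytes:
        return b"".join(zlib.decompress(self.chunks[fp]) for fp in recipe)

store = BlobStore()
snap1 = store.ingest(b"A" * 10000)   # first full ingest
snap2 = store.ingest(b"A" * 10000)   # identical data stores nothing new
assert store.restore(snap1) == b"A" * 10000
print(len(store.chunks), "unique chunks stored")
```

A real engine would layer on variable-size (content-defined) chunking, distributed metadata, and integrity checks; the sketch shows only the core idea that identical chunks are fingerprinted and stored once, with each stored chunk compressed.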