Tagged in

engineering

Architecture

Scala: Concise, Clean Code for Humans

Let me ask you a simple question: which do you think is a more natural way of thinking? (1) "I am going to go home and take a nap." (2) "My present location is 'office,' and my state of wakefulness is 'awake.' I am going to change my location to 'home' and then change my state of wakefulness to 'asleep.'" The answer is probably a unanimous and resounding "the first one!" But when we write code, it is almost always an example of the second one. Here at Rubrik, while expanding the frontiers of Cloud Data Management, we are also passionate about the psychology of programming and shortening the learning curve for new developers. So, we look for innovative methods to reduce the cognitive load that programmers deal with. That’s where Scala’s magic shines! In this post, I am going to walk you through how we leverage Scala’s expressiveness to write cleaner, leaner, and more meaningful code. For this example, we’ll write a simple simulation for modeling backup operations and how they consume space. For starters, let’s simulate, using a toy program, what happens to occupied storage space when we take a snapshot. No code is done without unit tests, right?…
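The post's own listing is not part of this excerpt, so what follows is only a minimal sketch of the kind of toy simulation the excerpt describes. Every name in it (`SnapshotSketch`, `Storage`, `takeSnapshot`, `takeAll`) is hypothetical and chosen for illustration. It models occupied space as an immutable value, so taking a snapshot reads as "storage after a snapshot" rather than as a series of state changes.

```scala
// Hypothetical toy model; these names are illustrative and not from the original post.
object SnapshotSketch {
  // Immutable description of a storage system: capacity and used space, in GB.
  final case class Storage(capacityGb: Long, usedGb: Long) {
    def freeGb: Long = capacityGb - usedGb

    // Taking a snapshot returns a new Storage value instead of mutating this one.
    // Returns None when the size is negative or the snapshot would not fit.
    def takeSnapshot(sizeGb: Long): Option[Storage] =
      if (sizeGb >= 0 && sizeGb <= freeGb) Some(copy(usedGb = usedGb + sizeGb))
      else None
  }

  // Apply a sequence of snapshots in order; the result is None as soon as one does not fit.
  def takeAll(start: Storage, sizesGb: Seq[Long]): Option[Storage] =
    sizesGb.foldLeft(Option(start)) { (acc, size) => acc.flatMap(_.takeSnapshot(size)) }
}
```

Because `Storage` is a case class, a unit test can compare whole values directly, for example checking that `Storage(100, 10).takeSnapshot(20)` equals `Some(Storage(100, 30))`.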

Architecture

Erasure Coding or: How Rubrik Doubled the Capacity of Your Cluster

At Rubrik, we’re big believers in data protection. But until we’re able to take consistent snapshots of our brain state and upload them to the promised hierarchical neural interconnect, we’re going to focus on backing up the more traditional machines, the ones whose smooth functioning will enable this cause. Any complete backup solution needs a distributed, scalable, fault-tolerant file system. Rubrik’s is Atlas, which made the switch from triple-mirrored encoding to a Reed-Solomon encoding scheme during our Firefly release. To help you understand the motivation behind this change, this post introduces erasure coding and compares the two methods.

What is Erasure Coding?

Suppose we want to store a piece of data on a fault-tolerant and distributed file system. In this case, the loss of any single drive should not result in data loss. The only way to achieve fault tolerance is through redundancy, which refers to storing extra information about the data across different drives to allow for its complete recovery in the event of a failure. The more redundancy we add, the greater the fault tolerance. However, the cost of redundancy is increased storage overhead. Every file system needs to make this tradeoff between availability and overhead. At Rubrik, the…
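To make the tradeoff between redundancy and overhead concrete, here is a minimal sketch. It does not implement Reed-Solomon; it swaps in single XOR parity, the simplest erasure code, which survives the loss of any one block. The (4, 2) parameters used in the usage note are an assumption chosen for illustration, since the excerpt does not state the parameters Atlas uses, and all names (`ErasureSketch`, `xorParity`, `recover`) are hypothetical.

```scala
// Illustrative sketch only: single XOR parity, not the Reed-Solomon scheme the post refers to.
object ErasureSketch {
  // Storage overhead = raw bytes stored per byte of user data.
  def mirroringOverhead(copies: Int): Double = copies.toDouble

  def erasureOverhead(dataBlocks: Int, parityBlocks: Int): Double =
    (dataBlocks + parityBlocks).toDouble / dataBlocks

  // Byte-wise XOR of a non-empty sequence of equally sized blocks.
  def xorParity(blocks: Seq[Array[Byte]]): Array[Byte] =
    blocks.reduce { (a, b) => a.zip(b).map { case (x, y) => (x ^ y).toByte } }

  // Rebuild the single lost data block from the surviving data blocks plus the parity block.
  def recover(survivors: Seq[Array[Byte]], parity: Array[Byte]): Array[Byte] =
    xorParity(survivors :+ parity)
}
```

With these definitions, `mirroringOverhead(3)` is 3.0, while `erasureOverhead(4, 2)` for a hypothetical code with four data blocks and two parity blocks is 1.5, so the same raw capacity would hold twice as much user data.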

Product

Introducing Crystal, Rubrik’s Intuitive User Interface

Rubrik reinvented data management with the user in mind. By approaching data management from a user’s perspective, we’ve distilled a complex process into a click or swipe. Set up a policy to back up data from multiple sources with a user experience similar to an iPhone. Create a replication and archival schedule to public or private cloud in 30 seconds. Recover entire virtual machines, databases, and files instantly. This is end-to-end data management. Simplified.

The Consumer Experience of Enterprise

Crystal is Rubrik’s intuitive data management platform that makes all of the above scenarios a reality. It transforms complicated enterprise backup, disaster recovery, data archival, and copy data management workflows from a burden to a joy. One of the reasons we built Rubrik was to bring a consumer-grade experience to enterprise software. To provide the simplicity, ease of use, and pain-free delight of popular consumer products, such as Facebook, Google, and Dropbox, we created a team composed of engineers from both consumer and enterprise, including the builders of Google Maps, Apple iOS, and Box.

What is Crystal?

Crystal is composed of two principal components: the Crystal UI and Crystal REST API. The Crystal UI focuses on building products with usability…

Culture

10 Reasons Why You Should Intern at a Startup

All of the software engineering students out there are crazy about internships and jobs at top tech giants such as Google, Facebook, and Microsoft. Have you ever considered interning at a startup? If not, then read on. This post might change your perspective about startups. I interned at Rubrik, a backup and storage startup based in Palo Alto, California, USA. They are a team of about 50 engineers who are top-notch in their respective fields. The engineers at Rubrik are super talented and have experience building real technology. I will list the things that I really liked about Rubrik: Freedom to choose project: I was initially given the option to choose one of two suggested projects. I didn’t quite like either of them, so the team members helped me to come up…

Architecture

Meet Cerebro, the Brains Behind Rubrik’s Time Machine

Fabiano Botelho, father of two and star soccer player, explains how Cerebro was designed. Previously, Fabiano was the tech lead of Data Domain’s Garbage Collection team. Rubrik is a scale-out data management platform that enables users to protect their primary infrastructure. Cerebro is the “brains” of the system, coordinating the movement of customer data from initial ingest and propagating that data to other data locations, such as cloud storage and remote clusters (for replication). It is also where the data compaction engine (deduplication, compression) sits. In this post, we’ll discuss how Cerebro efficiently stores data with global deduplication and compression while making Instant Recovery & Mount possible. Cerebro ties our API integration layer, which has adapters to extract data from various data sources (e.g., VMware, Microsoft, Oracle), to our different storage layers (Atlas and cloud providers like Amazon and Google). It achieves this by leveraging a distributed task framework and a distributed metadata system. See AJ’s post on the key components of our system. Cerebro solves many challenges while managing the data lifecycle, such as efficiently ingesting data at the cluster level, storing data compactly while making it readily accessible for instant recovery, and ensuring data integrity at all times. This is what…
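For readers new to deduplication, the general idea can be sketched with a content-addressed chunk store: each chunk is keyed by a hash of its bytes, so identical chunks are stored once. This is a generic illustration of the technique under stated assumptions, not Cerebro's design, which the excerpt does not describe. The names `DedupSketch`, `ChunkStore`, and `fingerprint` are hypothetical, and compression is left out.

```scala
import java.security.MessageDigest

// Generic content-addressed deduplication sketch; not Cerebro's actual implementation.
object DedupSketch {
  // Hex-encoded SHA-256 digest, used here as the chunk's content address.
  def fingerprint(chunk: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256").digest(chunk).map(b => f"${b & 0xff}%02x").mkString

  // Immutable chunk store mapping fingerprint -> chunk bytes.
  final case class ChunkStore(chunks: Map[String, Array[Byte]] = Map.empty) {
    // Store the chunk only if its fingerprint is new; return the resulting store and the key.
    def put(chunk: Array[Byte]): (ChunkStore, String) = {
      val key = fingerprint(chunk)
      val next = if (chunks.contains(key)) this else ChunkStore(chunks + (key -> chunk))
      (next, key)
    }

    // Total bytes physically held, counting each distinct chunk once.
    def storedBytes: Long = chunks.values.map(_.length.toLong).sum
  }
}
```

Writing the same chunk twice returns the same key and leaves `storedBytes` unchanged, which is the space saving deduplication is after.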