The ability to archive data to Amazon S3 has been part of the Rubrik platform since our very first product release. This feature, CloudOut, automates lifecycle management by archiving older, mostly unused data to public cloud storage tiers. Public cloud storage is often more cost-effective and reliable than tape for long-term data retention, and Rubrik customers have adopted it widely.
More details about designing a CloudOut to S3 solution can be found in this blog post.
Preparing your AWS environment for CloudOut is not necessarily difficult, but automation keeps the configuration consistent when there are several archive locations. Automating the process with the tools described below reduces the chance of misconfiguration, saves valuable time, and supports a more reliable environment.
For these reasons, many of our customers automate the configuration of their AWS environment for CloudOut. The Rubrik Cluster’s archival settings can also be configured using an automation tool.
This blog post will cover automating CloudOut to S3 resource configuration using the tools our customers most commonly use: AWS CloudFormation, Ansible, and Terraform.
Automation with AWS CloudFormation
AWS CloudFormation provides a common language to model and provision all resources across your entire AWS environment. A model is defined within a template and used to provision applications and resources in an automated, repeatable, and consistent manner. Customers input parameter values, and AWS CloudFormation determines what resources to create, modify, or delete based on template logic. The following image depicts the standard resource collection processed by the AWS CloudFormation Template for Rubrik CloudOut.
As an AWS-native service, CloudFormation provides configuration drift detection and automatic rollback if errors are detected. This makes AWS CloudFormation a natural choice for customers looking to manage CloudOut configuration as a collection of related resources within an AWS account. This includes the option of creating new or using existing resources, as shown in the following image.
Unlike the other tools described in this post, AWS CloudFormation works only within AWS and cannot automate the creation of the Archive Location on the Rubrik Cluster. That said, once the CloudFormation Stack completes, its Outputs tab displays all of the information needed to configure the Archive Location manually on the Rubrik Cluster, as shown below.
Another benefit is that resources created by the CloudFormation Template for Rubrik CloudOut will not be removed, even if the CloudFormation Stack is deleted. This provides a safety net that prevents users from inadvertently removing their archived backups.
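In CloudFormation templates generally, whether a resource outlives its stack is controlled by the resource's DeletionPolicy attribute. As a rough illustration, the Python sketch below checks a parsed template for resources that are not marked Retain. The template fragment and the logical IDs ArchiveBucket and ArchiveUser are hypothetical and are not taken from the Rubrik template.

```python
def resources_not_retained(template):
    """Return the logical IDs of resources whose DeletionPolicy is not 'Retain'."""
    resources = template.get("Resources", {})
    return sorted(
        logical_id
        for logical_id, body in resources.items()
        if body.get("DeletionPolicy") != "Retain"
    )


# Hypothetical template fragment, already parsed into a dict.
example_template = {
    "Resources": {
        "ArchiveBucket": {"Type": "AWS::S3::Bucket", "DeletionPolicy": "Retain"},
        "ArchiveUser": {"Type": "AWS::IAM::User"},
    }
}
```

A check like this could run before a stack is deleted, to confirm which resources would survive the deletion.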
As always, customers should carefully examine their security policies and include appropriate hardening configurations in the provided template. You can find a number of recommendations and hardening guidelines within the Security Hardening Rubrik CloudOut for AWS technical white paper. A more detailed walkthrough for using the CloudFormation Template for Rubrik CloudOut can be found here.
Automation with Red Hat Ansible
Unlike AWS CloudFormation, Ansible can provision and configure both on-premises resources and resources in AWS, making it a strong choice for customers looking to standardize on a single tool across their hybrid cloud environment. Ansible can perform all of the individual steps to configure your AWS environment, as documented in the Use Ansible to Configure CloudOut to AWS S3 use case.
CloudFormation is a capable tool, but it has some downsides. As previously mentioned, CloudFormation is limited to use within AWS, and it lacks some advanced functionality available in similar tools. Luckily, Ansible eases this pain by allowing users to programmatically run CloudFormation stacks as simply as they can configure their on-premises infrastructure. This also preserves the safeguards that are included in the CloudFormation Stack template, truly providing the best of both worlds. Using CloudFormation to prepare AWS for CloudOut, as well as configuring the CloudOut archive location on your local Rubrik cluster, can be performed in a single Ansible playbook. A sample playbook can be found here.
This playbook consists of a handful of variable definitions and two tasks: one that creates and runs the necessary CloudFormation stack, and another that configures a Rubrik CDM cluster archive location. The values returned from the completed CloudFormation configuration are passed along to the second task so Ansible has all of the necessary information to configure the cluster.
Let’s look at authenticating to AWS and Rubrik.
Credentials for both AWS and Rubrik could be stored directly in the playbook, but saving sensitive data in a plaintext file will make a hacker’s job very easy, so this method is discouraged. This example uses credentials stored in environment variables, which are ephemeral and much harder for nefarious entities to steal. The following outlines the environment variables you will need to set for authentication:
- AWS_ACCESS_KEY_ID – Access Key for the AWS IAM account with permissions to create, run, and update the AWS CloudFormation Stack. This is used for programmatic access to AWS services.
- AWS_SECRET_ACCESS_KEY – Corresponding Secret Access Key for the AWS IAM account
- rubrik_cdm_node_ip – IP address of a Rubrik node
- Rubrik credentials, either username (rubrik_cdm_username) and password (rubrik_cdm_password), or API token (rubrik_cdm_token)
NOTE: Use the AWS IAM credentials with permissions to create and execute the AWS CloudFormation stack. Running the stack creates a new IAM user, including new Access and Secret keys. Information on saving the newly created keys is detailed in the “Saving IAM Keys” section below.
Our Ansible Quick Start contains examples for setting environment variables in both Windows and Linux/macOS environments, should you need a refresher on how to do this.
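Before running any automation, it can help to confirm that the variables listed above are actually set. The following Python sketch is one possible pre-flight check, not part of the Rubrik tooling. The function name missing_auth_vars and its allow_token parameter are hypothetical; allow_token=False models a tool that accepts only a username and password.

```python
import os

# Variable names are taken from the list above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "rubrik_cdm_node_ip")


def missing_auth_vars(env=None, allow_token=True):
    """Return a list describing which authentication variables are unset."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]

    # Rubrik credentials: username and password together, or an API token.
    has_password = bool(env.get("rubrik_cdm_username")) and bool(
        env.get("rubrik_cdm_password")
    )
    has_token = allow_token and bool(env.get("rubrik_cdm_token"))
    if not (has_password or has_token):
        label = "rubrik_cdm_username/rubrik_cdm_password"
        if allow_token:
            label += " or rubrik_cdm_token"
        missing.append(label)
    return missing
```

Calling missing_auth_vars() with no arguments inspects the current process environment; an empty list means every required value is present.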
Resource details are provided by defining a few variables that will be used later in the Ansible playbook:
AWS CloudFormation Stack Creation
Once the environment variables are configured, you are ready to use AWS CloudFormation to create the necessary cloud resources. The required values are the same as those discussed in the “Automation with AWS CloudFormation” section. The tags listed in the playbook below are examples and should be customized to match the tagging strategy you are using in your environment.
Once the task completes, the output is stored in the cloudformation_results variable.
Archive Location Configuration
Once the AWS resources are created, the Rubrik cluster is configured to use the Amazon S3 bucket as a new Archive Location. After AWS CloudFormation creates the cloud resources, a number of values, such as the KMS ID, are returned. The returned values are referenced in the rubrik_aws_s3_cloudout module parameters to complete the cluster configuration.
Additional documentation for this module is available on GitHub.
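To illustrate how values returned by a stack can feed a later step, the Python sketch below converts stack outputs into a lookup table and fails early if an expected value is absent. It assumes the list-of-pairs shape (OutputKey and OutputValue) that the CloudFormation API uses when describing a stack. The output key names BucketName and KmsKeyId are hypothetical; the real names are defined by the template.

```python
def outputs_to_dict(outputs):
    """Convert a list of {'OutputKey': ..., 'OutputValue': ...} items to a dict."""
    return {item["OutputKey"]: item["OutputValue"] for item in outputs}


def require_outputs(outputs, required_keys):
    """Return only the required outputs, raising KeyError if any are missing."""
    values = outputs_to_dict(outputs)
    missing = [key for key in required_keys if key not in values]
    if missing:
        raise KeyError("stack outputs missing: " + ", ".join(missing))
    return {key: values[key] for key in required_keys}


# Hypothetical outputs; real key names come from the template.
example_outputs = [
    {"OutputKey": "BucketName", "OutputValue": "example-archive-bucket"},
    {"OutputKey": "KmsKeyId", "OutputValue": "example-key-id"},
]
```

Failing on a missing output before the cluster is touched keeps a partially created stack from producing a half-configured archive location.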
Saving IAM Keys
Optionally, a copy of the Access and Secret keys that AWS CloudFormation created for the IAM user can be saved in your working directory. This file, iam-keys.txt, contains sensitive information and should be closely guarded. Consider using a secrets management tool to store the information contained in the file and remove the file afterwards.
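If a key file does have to exist on disk for a short time, restricting it to the current user limits exposure. The Python sketch below shows one way to write such a file with owner-only permissions. It is an illustration rather than what the playbook does, and write_private_file is a hypothetical helper name. The permission bits apply on POSIX systems; Windows handles file permissions differently.

```python
import os


def write_private_file(path, text):
    """Write text to path, readable and writable by the owner only (POSIX)."""
    # Create the file with mode 0600 from the start, so it is never
    # briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    # Tighten permissions again in case the file already existed
    # with a looser mode.
    os.chmod(path, 0o600)
```

Once the keys are stored in a secrets manager, the file can be deleted with os.remove(path).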
Using Ansible to drive the configuration of CloudOut is a straightforward and efficient way to prepare your environment for archiving. All an administrator has to do is configure a few variables at the beginning of the playbook, run it, and monitor the output. After a moment, they’re free to configure an SLA Domain to use the newly created CloudOut archive location.
Automation with HashiCorp Terraform
Terraform is a popular Infrastructure as Code tool, known for its simple syntax and extensive cloud provisioning capabilities. Rubrik offers a Terraform Module, rubrik-s3-cloudout, to prepare an AWS environment for CloudOut, and configure a new archive location on Rubrik. This module simplifies configuring both AWS and Rubrik, and requires very little prior experience with Terraform.
Documentation for authentication to AWS with Terraform is here, and authentication to Rubrik is here. You will use the same environment variables for authentication with Terraform as you do with Ansible. The only difference between the Ansible and Terraform use cases is that Terraform requires a username/password combination to authenticate to Rubrik, as opposed to an API token.
Using the rubrik-s3-cloudout module is as simple as creating a Terraform HCL (.tf) file similar to the one below.
Once you’ve configured your environment variables and created a main.tf file similar to the example, run terraform init to initialize your working directory. Next, run terraform plan, and review the output to verify that it matches your expectations. If the plan output is acceptable, run terraform apply to configure AWS and Rubrik for CloudOut. All of the configuration built by Terraform can be removed with terraform destroy, but use caution when performing this operation. When running terraform destroy with bucket_force_destroy set to true, your S3 bucket will be permanently removed, along with any archived backups contained in it.
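The init, plan, and apply sequence described above can also be scripted. The Python sketch below is one possible wrapper, under a few assumptions: the function run_workflow, the approve callback, and the plan file name tfplan are all hypothetical, and saving the plan with -out so that apply runs exactly what was reviewed is a design choice rather than something the module requires.

```python
import subprocess

# Commands mirror the manual steps: init, plan, then apply.
TERRAFORM_STEPS = (
    ("terraform", "init", "-input=false"),
    ("terraform", "plan", "-input=false", "-out=tfplan"),
    ("terraform", "apply", "-input=false", "tfplan"),
)


def run_workflow(steps=TERRAFORM_STEPS, runner=subprocess.run, approve=lambda: False):
    """Run init and plan, then apply only if approve() returns True.

    Returns the names of the subcommands that were executed.
    """
    executed = []
    for command in steps:
        # Stop before apply unless the reviewed plan has been approved.
        if command[1] == "apply" and not approve():
            break
        runner(list(command), check=True)
        executed.append(command[1])
    return executed
```

By default the function stops after terraform plan, leaving time to review the output; passing approve=lambda: True lets it continue to apply. The runner parameter exists so the sequence can be exercised without invoking Terraform.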
Saving IAM Keys
Set save_keys to true to save a copy of the Access and Secret keys for the AWS IAM user that was created by Terraform. The keys are saved in a file named iam_keys.txt within your working directory. The same security principles apply to this file as mentioned previously. Use a secrets management tool to store the information contained in the file, and remove the file once the information has been safely stored.
Note that the IAM Access and Secret keys will be stored as plaintext in your terraform.tfstate file. Use security best practices to restrict access to this file. More information on securing your Terraform state file can be found here.
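To see whether a given state file holds such keys, it can be inspected directly. The Python sketch below assumes the JSON layout used by current Terraform state files (a top-level resources list whose entries carry instances with attributes) and assumes the keys come from the AWS provider's aws_iam_access_key resource, which records the secret in a secret attribute. Whether the rubrik-s3-cloudout module uses exactly this resource is an assumption, and the sample state is hypothetical.

```python
import json


def iam_keys_in_state(state_text):
    """Return names of aws_iam_access_key resources that store a secret."""
    state = json.loads(state_text)
    found = []
    for resource in state.get("resources", []):
        if resource.get("type") != "aws_iam_access_key":
            continue
        for instance in resource.get("instances", []):
            if instance.get("attributes", {}).get("secret"):
                found.append(resource.get("name", ""))
    return found


# Hypothetical, heavily trimmed state file contents.
example_state = json.dumps(
    {
        "version": 4,
        "resources": [
            {
                "type": "aws_iam_access_key",
                "name": "example",
                "instances": [{"attributes": {"id": "AKIAEXAMPLE", "secret": "x"}}],
            },
            {"type": "aws_s3_bucket", "name": "archive", "instances": []},
        ],
    }
)
```

A non-empty result is a reminder that the state file deserves the same protection as the keys themselves.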
The power of Rubrik and AWS APIs combined with popular automation tools ensures a reliable and consistent deployment. Infrastructure as Code methodology provides superior operational models when provisioning CloudOut resources for your environment. Stay tuned for future posts that provide examples for automating other CloudOut locations, like Azure, GCP or local NFS storage. Remember to visit Rubrik Build to explore additional automation tools and use cases!