Data Protection for DataStax Enterprise Search Indexes and Databases

DataStax offers a popular feature called DSE Search as part of DataStax Enterprise (DSE). DSE Search allows you to find data and create features like product catalogs, document repositories, and ad-hoc reports. It uses Apache Solr in the backend to enable search operations on any existing DSE table by creating a Solr index in DSE. In the event of data loss, administrators need to restore the DSE data and then rebuild the indexes from scratch, a process that can take days. Using Rubrik Mosaic™, you can now back up and restore DSE Search indexes and their data at wire speed, saving you days of application downtime. Let’s dive deeper into how it all works.

At its core, DSE Search comprises the DataStax Enterprise database, the Apache Solr search interface, and the Apache Lucene engine for indexing and search.

When enabled, DSE Search indexes the data distributed on each Cassandra node using the Solr and Lucene libraries. Each search node maintains a highly optimized search index of the data stored on that node. The search indexes are stored alongside the data in the Cassandra data directory, and they are built incrementally over time as new data is written to the Cassandra node.
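To make the idea of a per-node, incrementally built index concrete, here is a minimal sketch in Python. It is a toy inverted index, not how Solr or Lucene store data; the class name `NodeSearchIndex` and its methods are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class NodeSearchIndex:
    """Toy inverted index kept by a single node for the rows it stores."""

    def __init__(self) -> None:
        self.rows: dict[str, str] = {}
        self.postings: dict[str, set[str]] = defaultdict(set)

    def write(self, row_id: str, text: str) -> None:
        """Store a row and update the index as part of the same write."""
        old = self.rows.get(row_id)
        if old is not None:
            # The row is being overwritten: drop its old terms first.
            for term in old.lower().split():
                self.postings[term].discard(row_id)
        self.rows[row_id] = text
        for term in text.lower().split():
            self.postings[term].add(row_id)

    def search(self, term: str) -> list[str]:
        """Return the ids of local rows that contain the term."""
        return sorted(self.postings.get(term.lower(), set()))


node = NodeSearchIndex()
node.write("u1", "adam smith")
node.write("u2", "eve jones")
print(node.search("adam"))  # ['u1']
```

Each write touches only the terms of the row being written, which is why the index can grow incrementally alongside the data instead of being rebuilt on every change.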

To use this high-efficiency search, DSE Search customers need to change their queries to filter on a ‘magic column’ named solr_query. For example:

SELECT * FROM users WHERE name = 'adam';

changes to

SELECT * FROM users WHERE solr_query = 'name:adam';
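The rewrite above is mechanical, so it can be sketched as a small Python helper. The function name `to_solr_query_cql` is hypothetical, and the sketch assumes a single field:value term with no Solr special characters in the value.

```python
def to_solr_query_cql(table: str, field: str, value: str) -> str:
    """Build a CQL statement that filters through the solr_query column.

    Illustration only: handles one field:value term and does not escape
    Solr special characters such as ':' or '*' inside the value.
    """
    # Double any single quote so the term stays inside the CQL string literal.
    term = f"{field}:{value}".replace("'", "''")
    return f"SELECT * FROM {table} WHERE solr_query = '{term}';"


print(to_solr_query_cql("users", "name", "adam"))
# SELECT * FROM users WHERE solr_query = 'name:adam';
```

In application code, the search term would normally be passed as a bound parameter of a prepared statement rather than concatenated into the query text.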

In the unfortunate event of data loss, both the database and the corresponding index can be lost. Even if you have older snapshots from which to recover the databases, the index must be recreated from scratch after recovery. Apache Solr and Lucene are built for highly optimized read performance, not for high-speed writes, so recovering from an index data loss scenario can take multiple days. For business applications, this means extended periods of downtime and degraded performance.

With Rubrik Mosaic, DBAs now have a simple, one-click choice to back up and restore the associated DSE Search index along with the data at a granular table level. When creating a backup policy for DSE, you can optionally click “Enable backups of associated search indexes” in the Mosaic GUI or pass the –backup_index parameter in the CLI/API. On the first run, Mosaic backs up the full indexes along with the data. On all subsequent runs, Mosaic backs up only the incremental changes to the index and data. This reduces backup time, network bandwidth, and backup storage costs.
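The full-then-incremental pattern described above can be sketched in a few lines of Python. This is a generic illustration, not Mosaic's implementation: the `plan_backup` function, the file names, and the use of content hashes to detect changes are all assumptions made for the example.

```python
import hashlib


def plan_backup(
    current_files: dict[str, bytes], previous_manifest: dict[str, str]
) -> tuple[list[str], dict[str, str]]:
    """Return (files_to_copy, new_manifest) for one backup run.

    current_files maps a file name to its contents. previous_manifest maps a
    file name to the content hash recorded by the previous run, and is empty
    on the first run.
    """
    new_manifest = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in current_files.items()
    }
    # Copy only files that are new or whose content hash has changed.
    to_copy = sorted(
        name
        for name, digest in new_manifest.items()
        if previous_manifest.get(name) != digest
    )
    return to_copy, new_manifest


# First run: no manifest exists yet, so every file is copied (full backup).
files = {"segment_1": b"aaa", "segment_2": b"bbb"}
first_copy, manifest = plan_backup(files, {})

# Second run: one file was added, so only that file is copied.
files["segment_3"] = b"ccc"
second_copy, manifest = plan_backup(files, manifest)

print(first_copy)   # ['segment_1', 'segment_2']
print(second_copy)  # ['segment_3']
```

Because unchanged files are skipped on every run after the first, the amount of data moved tracks the rate of change rather than the total size of the index and data.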

At the time of recovery, you can simply toggle the “Restore associated search indexes” button in the GUI or pass the –restore_index parameter in the CLI/API. When enabled, Mosaic restores the table data along with the associated search index back to the original DSE ring. This process is completely repair-free: it does not require complex, time-consuming database repairs or index recreation. You also have the flexibility to restore only the table data, which comes in handy when you want to selectively load the data into a test/development Cassandra cluster.

With version 3.1, Mosaic now also supports DSE 6.0 and 6.7 transparently. You can learn more about the new features in Rubrik Mosaic 3.1, including its availability on the Microsoft Azure Marketplace, or join one of the upcoming webcasts.