DynamoDB: Scheduling On-Demand Backups

The recent AWS re:Invent 2017 event saw some major announcements, including a couple of exciting ones for DynamoDB users: global tables and on-demand backups. In this post, I'll take a quick look at on-demand backups, how they work, and how we can schedule them to take full backups regularly.

What are On-Demand Backups?

On-demand backups are a feature built into the DynamoDB service (accessible via the API, AWS Management Console, and CLI as usual) that lets you take a full backup of a table at a point in time.

Taking a backup has no impact on the performance or availability of your tables. All backups are automatically encrypted, cataloged, easily discoverable, and retained until you explicitly delete them.

Additionally, you can restore these backups to a new table at any point.

Along with the data, the following are included in the backups:

  • Global secondary indexes (GSIs)
  • Local secondary indexes (LSIs)
  • Streams
  • Provisioned read and write capacity

The following are NOT included in the backups and must be set up again manually on a restored table:

  • Auto scaling policies
  • AWS Identity and Access Management (IAM) policies
  • Amazon CloudWatch metrics and alarms
  • Tags
  • Stream settings
  • Time To Live (TTL) settings
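Because the settings listed above do not come back with a restored table, they have to be re-applied by hand or by script. Here is a minimal Python sketch for two of them (TTL and tags), assuming `client` behaves like `boto3.client("dynamodb")`. The function name, its parameters, and the example attribute and tag values are hypothetical, not something from the AWS docs.

```python
def reapply_table_settings(client, table_arn, table_name, ttl_attribute=None, tags=None):
    """Re-apply TTL and tags on a restored table.

    `client` is assumed to expose the boto3 DynamoDB client methods
    update_time_to_live and tag_resource. Call this only once the
    restored table has reached ACTIVE status.
    """
    applied = []
    if ttl_attribute:
        # Turn TTL back on, using the same attribute the original table used.
        client.update_time_to_live(
            TableName=table_name,
            TimeToLiveSpecification={"Enabled": True, "AttributeName": ttl_attribute},
        )
        applied.append("ttl")
    if tags:
        # Tags are attached by table ARN rather than by table name.
        client.tag_resource(
            ResourceArn=table_arn,
            Tags=[{"Key": key, "Value": value} for key, value in sorted(tags.items())],
        )
        applied.append("tags")
    return applied
```

Auto scaling policies, IAM policies, CloudWatch alarms, and stream settings would need their own equivalents; they are left out here to keep the sketch short.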

 

What are the Costs like?

Charges vary between regions, but you are charged only for storage of the backups, which depends on the size of the table you are backing up. The price is approximately 40% of standard DynamoDB storage costs (roughly $0.10-0.11 per GB-month).
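To get a feel for what that means once backups pile up, here is a rough back-of-the-envelope calculation in Python. It assumes every retained backup is a full copy of roughly the table's size and uses the $0.10 per GB-month figure quoted above as a default; the function name and the example numbers are illustrative only, so check the pricing for your region.

```python
def monthly_backup_cost(table_size_gb, backups_retained, price_per_gb_month=0.10):
    """Rough monthly storage cost (USD) for a set of retained full backups.

    Assumes each backup is about the size of the table, since on-demand
    backups are full snapshots rather than incrementals.
    """
    return round(table_size_gb * backups_retained * price_per_gb_month, 2)


# Example: a 50 GB table with 24 backups kept around
estimate = monthly_backup_cost(50, 24)
```

With those numbers the estimate comes out at about $120 per month, which shows why it pays to delete backups you no longer need.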

How does it work under the covers?

According to the AWS Docs:

When you create an on-demand backup, a time marker of the request is cataloged. The backup is created asynchronously by applying all changes until the time of the request to the last full table snapshot. Backup requests are processed instantaneously and become available for restore within minutes.

The above explains how the backups are taken without impacting performance.

How do you take backups?

  1. Navigate to the DynamoDB Console
  2. Click on the new Backups option under Tables
  3. Click on Create Backup
  4. Select the table you wish to back up, and type in a name for the backup.
  5. Click Create Backup
  6. Click okay on the success message
  7. You should now see your backup
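The same thing can be done through the API instead of the console. Below is a minimal Python sketch, assuming `client` behaves like `boto3.client("dynamodb")`, whose `create_backup` call takes a `TableName` and a `BackupName`. The wrapper function and the table and backup names in the usage note are hypothetical.

```python
def create_table_backup(client, table_name, backup_name):
    """Request an on-demand backup and return its ARN and initial status.

    `client` is assumed to expose the boto3 DynamoDB client method
    create_backup. The call returns as soon as the request is accepted;
    the backup itself is completed asynchronously.
    """
    response = client.create_backup(TableName=table_name, BackupName=backup_name)
    details = response["BackupDetails"]
    return details["BackupArn"], details["BackupStatus"]
```

With boto3 installed and credentials configured, this could be called as `create_table_backup(boto3.client("dynamodb"), "Orders", "Orders-manual-backup")`. Keep the returned ARN, since that is what a restore asks for.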

How do you restore backups?

  1. Go to the Backups section in the DynamoDB Console
  2. Select the backup you wish to restore, and click Restore Backup
  3. Enter the name of the table you wish to restore the backup to. This needs to be a new table.
  4. Click Restore
  5. You should see the new table being restored in DynamoDB
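For completeness, the restore can also be scripted. This is a small Python sketch, again assuming `client` behaves like `boto3.client("dynamodb")` and using its `restore_table_from_backup` call; the wrapper name and the target table name in the usage note are made up for illustration.

```python
def restore_to_new_table(client, backup_arn, target_table_name):
    """Start restoring a backup into a new table and return the table status.

    `client` is assumed to expose the boto3 DynamoDB client method
    restore_table_from_backup. The target table must not already exist.
    """
    response = client.restore_table_from_backup(
        TargetTableName=target_table_name,
        BackupArn=backup_arn,
    )
    # The restore continues in the background after this call returns.
    return response["TableDescription"]["TableStatus"]
```

A call such as `restore_to_new_table(client, backup_arn, "OrdersRestored")` only starts the restore, so the table will not be usable until its status becomes ACTIVE.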

How do you schedule backups to be taken regularly?

Unfortunately, there is no native functionality for scheduling these backups, but AWS recommends using CloudWatch Events and AWS Lambda together to trigger them. Let's see how we can set this up:

  1. Create a new Lambda function on Node.js 6.10, using this code on GitHub.
    1. (As of Dec 3rd 2017) Keep in mind that the AWS SDK bundled with Lambda has not yet been updated to include the backup and restore APIs, so include node_modules/aws-sdk as part of the Lambda function code.
  2. Use the following IAM policy for the Lambda function role
  3. Create a CloudWatch Event Rule
    1. Set a schedule for the backups to run (e.g. every 60 minutes)
    2. Set the target as the Lambda function
    3. Configure the input to be a constant (JSON text).
      1. This allows you to configure backups for tables at independent intervals, e.g. separate rules for each table / interval.
      2. Pass in constant JSON like this (separated by comma):
  4. Done! The Lambda function should run as often as configured and take backups of the tables given in the input. You should see backups using the name format: [TABLE_NAME][YYYYMMDDTHHMMSS]
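To make the moving parts concrete, here is a rough Python sketch of what such a scheduled function could look like. It is not the Node.js code linked above. The event shape `{"tables": "TableA,TableB"}` is a hypothetical stand-in for the rule's constant JSON input, the helper names are my own, and `client` is assumed to behave like `boto3.client("dynamodb")`.

```python
from datetime import datetime, timezone


def backup_name_for(table_name, now):
    """Build a backup name in the [TABLE_NAME][YYYYMMDDTHHMMSS] format."""
    return table_name + now.strftime("%Y%m%dT%H%M%S")


def parse_tables(event):
    """Read table names from the rule's constant JSON input.

    The {"tables": "TableA,TableB"} shape is a hypothetical choice
    made for this sketch.
    """
    return [name.strip() for name in event.get("tables", "").split(",") if name.strip()]


def backup_tables(client, table_names, now):
    """Request one on-demand backup per table; return the backup names used."""
    created = []
    for table_name in table_names:
        name = backup_name_for(table_name, now)
        client.create_backup(TableName=table_name, BackupName=name)
        created.append(name)
    return created


def handler(event, context):
    """Lambda entry point, invoked by the CloudWatch Events rule."""
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    client = boto3.client("dynamodb")
    return backup_tables(client, parse_tables(event), datetime.now(timezone.utc))
```

For a function like this, the role would need at least the `dynamodb:CreateBackup` action on the tables being backed up, plus the usual CloudWatch Logs permissions. Keeping the table list in the event rather than in the code is what lets separate rules back up different tables at different intervals.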

Final thoughts

Although this is a valid backup option for DynamoDB, it is worth checking whether it meets your needs.

Pros

  • No impact on performance
  • Backups taken in very little time
  • No limit on backups
  • Restores are very straightforward
  • Scheduling for backups is possible (including alerts when scheduling fails)

Cons

  • Storage of the backups can get expensive (compared to S3)
  • Backups are full snapshots only (no incremental options)
  • Not available in all regions (yet)
  • You can only restore to a new DynamoDB table
  • No ability to back up to a separate account (essential for disaster recovery in case of a security breach)

 

There is news that a new point-in-time backup / restore option will be available in DynamoDB in early 2018, and I'm looking forward to seeing that as a more complete solution. My big hope is that it will provide the ability to move the backups to an isolated account for DR scenarios. Fingers crossed!

HTH
