DynamoDb Incremental Backups – Part Four

Before we start: If you have missed the previous three posts, please check them out here:

Part One
Part Two
Part Three

At this stage, I’m going to assume you are comfortable with DynamoDb Incremental Backups, and the format they are stored in.

In this post, we will walk through the restore step, and I’ll be the first to admit this can be taken much further than I have. I haven’t had the time, or the need, to take it as far as I would have liked, but don’t let that stop you! I would love to hear from you if you have done something interesting with this, e.g. automating your DR / backup restore testing.

For our DynamoDb incremental backups solution, we have incremental backups stored in S3. The data is in the native DynamoDb format, which is very handy: it allows us to push it back out to DynamoDb with no transformation.

Each key (or file) stored in an S3 versioned bucket is a snapshot of a row at a point in time. This allows us to be selective in what we restore. It also provides a human-readable audit log!

S3 provides an API that allows us to scan the list of backups available:

Get Bucket Object Versions

Leveraging this, we can build a list of the data we would like to restore. This could range from a single row at a point in time to an entire table at a point in time.
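As a rough illustration, here is a minimal sketch of that idea in Python with boto3. It walks the version list, keeps the newest version of each key written before a restore point, and pushes those items back into DynamoDb. The bucket, prefix, table name and restore point are placeholders, and delete markers are ignored for brevity.

```python
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
dynamodb = boto3.client("dynamodb")

BUCKET = "my-dynamodb-backups"   # hypothetical versioned bucket
PREFIX = "MyTable/"              # hypothetical prefix (one folder per table)
TABLE = "MyTable"                # hypothetical table name
RESTORE_POINT = datetime(2016, 6, 1, 12, 0, tzinfo=timezone.utc)


def latest_versions_before(bucket, prefix, cutoff):
    """Walk the S3 version list and keep the newest version of each key
    that was written before the restore point (delete markers ignored)."""
    paginator = s3.get_paginator("list_object_versions")
    candidates = {}
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for version in page.get("Versions", []):
            if version["LastModified"] > cutoff:
                continue
            key = version["Key"]
            if key not in candidates or version["LastModified"] > candidates[key]["LastModified"]:
                candidates[key] = version
    return candidates


def restore(bucket, table, versions):
    """Download each selected version and write it back to DynamoDb.
    Assumes each object body is a JSON item in the native DynamoDb format."""
    for key, version in versions.items():
        body = s3.get_object(Bucket=bucket, Key=key, VersionId=version["VersionId"])["Body"].read()
        dynamodb.put_item(TableName=table, Item=json.loads(body))


if __name__ == "__main__":
    restore(BUCKET, TABLE, latest_versions_before(BUCKET, PREFIX, RESTORE_POINT))
```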

There is a range of tools which allow us to restore these backups directly from S3:

  1. Dynamo Incremental Restore
    The first option allows you to specify a point in time for a given prefix (folder location) in S3. The workflow is:

    1. Scan all the data available using the Version List in S3
    2. Build a list of the data that needs to be updated
    3. Download the file(s) identified in step 2 and push them to DynamoDb
  2. Dynamo Migrator
  3. DynamoDb Replicator
    A snapshot script that scans an S3 folder where incremental backups have been made, and writes the aggregate to a file on S3, providing a snapshot of the backup’s state (a rough sketch of this idea follows the list).
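The snapshot idea from the last tool above is simple enough to sketch. The following is a rough, hypothetical version in Python with boto3 (the bucket, prefix and output key are placeholders): it lists the current objects under the backup prefix and concatenates their bodies into a single newline-delimited snapshot file back on S3.

```python
import boto3

s3 = boto3.client("s3")

BUCKET = "my-dynamodb-backups"             # hypothetical bucket
PREFIX = "MyTable/"                        # hypothetical backup prefix
SNAPSHOT_KEY = "snapshots/MyTable.ndjson"  # hypothetical output key


def snapshot(bucket, prefix, snapshot_key):
    """Aggregate the current state of every incremental backup under a prefix
    into one newline-delimited file and write it back to S3.
    Assumes each backup object is a single-line JSON item."""
    paginator = s3.get_paginator("list_objects_v2")
    lines = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
            lines.append(body.decode("utf-8").strip())
    s3.put_object(Bucket=bucket, Key=snapshot_key, Body="\n".join(lines).encode("utf-8"))


if __name__ == "__main__":
    snapshot(BUCKET, PREFIX, SNAPSHOT_KEY)
```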

We haven’t had any issues with our incremental backups, but the next step would be to automate the DR restore at a regular interval to ensure it provides the protection you are looking for.

DynamoDb Incremental Backups – Part Two

In this post, the next in the series, we will delve into the details of our DynamoDb incremental backup solution.

If you missed the first post, check it out: Part One

I am not going to delve into DynamoDb itself too much. If you are reading this blog post, I will assume you know about DynamoDb, are looking to use it, or are already using it.

DynamoDb Streams

Let’s delve into DynamoDb Streams. DynamoDb Streams allow you to capture mutations on the data within a table; in other words, they capture item changes at the point in time when they occurred.

DynamoDB Streams – High Level

This feature enables a plethora of possibilities such as data analysis, replication, triggers, and backups. It is very simple to enable (as simple as flipping a switch), and it essentially provides an ordered list of table events for a 24-hour window.
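For illustration, turning a stream on for an existing table is a single API call. Below is a minimal sketch using boto3; the table name is a placeholder, and NEW_AND_OLD_IMAGES is just one of the available view types.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Enable a stream on an existing table (hypothetical table name).
dynamodb.update_table(
    TableName="MyTable",
    StreamSpecification={
        "StreamEnabled": True,
        # NEW_AND_OLD_IMAGES captures the item both before and after each change,
        # which is handy for backups and auditing.
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)
```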

Continue reading DynamoDb Incremental Backups – Part Two

DynamoDb Incremental Backups – Part One

DynamoDb is a fully managed AWS NoSQL service, which provides a fast and predictable data store. We’ve been using it for several microservices over the past 18 months, and one feature that is sorely missed is incremental backups.

AWS provides an option to take snapshots of your table using a service called DataPipeline. At a high level, what this does is:
1. Create an EMR (Elastic Map-Reduce) cluster
2. Perform a parallel full scan of the table in question (while consuming read units) into JSON data
3. This JSON data can be uploaded to S3 or similar

DynamoDb to S3 Template in Data Pipeline Architect

The issue I have with this is that the backup is not a “point in time” snapshot; it is essentially a scan of the table (which can take hours) while the table is still live.

Our requirement for DPO (Data Point Objective) is 30 minutes, which basically means that if “shit hits the fan”, we can have at most 30 minutes of data loss (in the worst case). This is our contractual agreement with our clients.

Given this, we have been investigating ways to solve this problem, which has led us to create incremental backups for DynamoDb, stored in an S3 versioned bucket.

DynamoDb Incremental Backups to S3


In the next post, I’ll walk through the details of our implementation, and provide the source code of the Lambda Function.
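In the meantime, here is a minimal sketch of the general idea (not our actual implementation): a Lambda handler that receives DynamoDb stream records and writes each item’s new image to a versioned S3 bucket, keyed by table name and primary key. The bucket name and key scheme are hypothetical.

```python
import json

import boto3

s3 = boto3.client("s3")
BUCKET = "my-dynamodb-backups"  # hypothetical versioned bucket


def handler(event, context):
    """Triggered by a DynamoDb stream. Each changed item is written to S3 under a
    key derived from its primary key, so the versioned bucket keeps every change
    as a separate object version."""
    for record in event["Records"]:
        table = record["eventSourceARN"].split("/")[1]
        keys = record["dynamodb"]["Keys"]
        # Build a deterministic key from the item's primary key values.
        key_part = "-".join(str(v) for attr in sorted(keys) for v in keys[attr].values())
        s3_key = "{}/{}".format(table, key_part)
        if record["eventName"] == "REMOVE":
            # A delete marker in the versioned bucket records the removal.
            s3.delete_object(Bucket=BUCKET, Key=s3_key)
        else:
            s3.put_object(
                Bucket=BUCKET,
                Key=s3_key,
                Body=json.dumps(record["dynamodb"]["NewImage"]).encode("utf-8"),
            )
```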

Pragmatism and Business Acumen

The other week, I was having lunch with our CTO at PageUp, Tal Rotbart, and we were discussing various issues in the industry, when he posed a question that got me thinking: “Isn’t pragmatism just business acumen?”

I’ve been pondering the question for some time now… Let me first start with defining the two.

Pragmatic: dealing with things sensibly and realistically in a way that is based on practical rather than theoretical considerations.

Business acumen: keenness and quickness in understanding and dealing with a business situation in a manner that is likely to lead to a good outcome.

Given those definitions, both seem to speak to similar traits in terms of software development, but the problem is that both can be very relative. Also, business acumen seems to be a higher-level concept, one which can encapsulate pragmatism.

Pragmatism without business acumen can be just as deadly to a company as not having pragmatic approaches to start with.

This drove me to start thinking about seniority levels within a development team.

Continue reading Pragmatism and Business Acumen

Microservice Scars

I have the pleasure of presenting at the next Alt.Net meetup (in Melbourne) with Joshua Toth. We will be discussing the lessons we have learnt from our first microservice at PageUp.

It has been in production for over 9 months (158 zero-downtime deployments), and it is worth sharing our experiences and thoughts.

If you have any areas you would like us to discuss, feel free to drop me a line, tweet or just leave a comment below.

Continuous Delivery

Recently, I’ve been hearing a lot about continuous delivery, and even continuous deployment. Both are fantastic, and I’ve seen many places reap the benefits of these models.

Continuous delivery is when the practices used by the team enable the software to be reliably released at any time: every change can go straight out to production with confidence.

Continuous deployment is the next step, where the software is not only ready to be released, but is actually released to users. Every commit goes out to production.

I see software that is released less often because releasing it is risky and, often, a scary proposition. The objective of both practices is to make deployment a “non-issue”.

If it hurts, do it more often

Left out of these conversations is the value they provide: not from a technical perspective, but from a business one. After all, that is the reason we do anything: to provide value and benefit to the business.

Our objective is to minimise our time to market, which is another topic in its own right due to the many factors that impact that metric.

We need to ensure our software development keeps its agility and enables us to respond quickly to feedback.

The main reason these models are so beneficial is that packaging, regression testing, deploying, waiting and so on are non-value-adding activities. These activities can be mundane and tedious, and should be automated where possible.

Business should not be waiting for Technology, Technology should be waiting for the Business.

As we all strive for that goal, these two models take great strides in getting closer to the mark.

If you find resistance to these two approaches, you are most likely dealing with software that has lost its agility or is struggling with quality, and we all know what happens in that case…

How important is UX?

User Experience has been gaining momentum and importance within companies, but have leaders really connected with what it means?

It is hard to argue against user experience being paramount in engaging and retaining users and customers. I’m sure most would agree that it is simply common sense.

But we are still in an age where good UX can be a market differentiator. How many companies are driving UX as a high priority?

Even more importantly, which companies have their leaders (C-level, i.e. the CEO) pushing for and driving UX?

Companies need to stop focusing on revenue and start thinking about UX. Revenue is a by-product of UX.

When we think about the metrics that top-level management focus on, I assure you revenue would be very high on the list, if not at the top. Rightly so, but perhaps a shift in thinking, towards building raving fans (aka UX), would deliver revenue growth simply as a by-product, along with improvements in a few other metrics such as staff satisfaction and engagement.

Food for thought…


Elastic Beanstalk is evil!

For all you AWS’ers out there, I’m hoping this will provide a reason to avoid the EB (Elastic Beanstalk) stack. A quick summary of “what is elastic beanstalk” from Stack Overflow:

Elastic Beanstalk is one layer of abstraction away from the EC2 layer. Elastic Beanstalk will setup an “environment” for you that can contain a number of EC2 instances, an optional database, as well as a few other AWS components such as a Elastic Load Balancer, Auto-Scaling Group, Security Group. Then Elastic Beanstalk will manage these items for you whenever you want to update your software running in AWS. Elastic Beanstalk doesn’t add any cost on top of these resources that it creates for you. If you have 10 hours of EC2 usage, then all you pay is 10 compute hours.

So I’ve recently finished delivering two projects using Elastic Beanstalk, and initially I was a fan. It brings everything into one dashboard / centralised control area: your load balancers (ELB), auto-scaling groups (ASG), EC2 instances, notifications, monitoring, metrics and general orchestration.

It provides a way to get zero downtime deployments, using two stacks and performing CNAME swaps (Blue / Green deployments).
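For illustration, the CNAME swap itself is a single API call. A rough sketch with boto3 (the environment names are placeholders):

```python
import boto3

eb = boto3.client("elasticbeanstalk")

# Once the "green" environment is deployed and healthy, swap the CNAMEs so
# traffic moves from the old ("blue") environment to the new one.
eb.swap_environment_cnames(
    SourceEnvironmentName="myapp-blue",        # hypothetical
    DestinationEnvironmentName="myapp-green",  # hypothetical
)
```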

Rather than creating two separate stacks for each environment, we decided to go with rolling updates as our “zero downtime deployment” (I am not going into the details of how we achieved this; email me if you are interested and I’ll be happy to post about it).

This worked really well for the first few months… until we used it in anger in production. Now, if Elastic Beanstalk fails to deploy, it retries twice (three attempts in total) and then starts to roll back.

It rolls back by deploying the original version onto new servers. If that fails, it tries twice more (three attempts in total). If that fails as well, the EB stack goes into a grey state; I call it the zombie state.

Let’s not go into why this could happen (there are many possible reasons, e.g. bad health checks).

Regardless of your underlying resources, your EB stack is deadlocked. You cannot recover it AT ALL.

Even if your underlying infrastructure is fine and the web server or API is working and serving users, EB will not accept any changes, new deployments, or configuration changes until you tear down the physical resources and rebuild.

Essentially the abstraction layer on top of these services has deadlocked and it cannot be fixed.

As soon as you tear down your EB stack, there goes your IP, and welcome DNS caching issues.

My recommendation for anyone building serious production services – avoid the EB stack. Even though it seems great at the start, it will cause you pain as time wears on.
