DynamoDB: Global Secondary Indexes
Global Secondary Indexes (GSIs) are a useful DynamoDB feature that lets us create indexes on existing tables.
What are GSIs?
In DynamoDB, we create tables with Partition Keys and Sort Keys. Partition Keys dictate which partition the data is stored in, and Sort Keys dictate how the data is sorted within that Partition.
Once a table is created, it isn’t possible to change its key structure, but you can create GSIs on existing tables.
GSIs let us define alternative Partition and Sort Keys for a DynamoDB table. As a side effect, the data is stored redundantly, so additional storage costs apply.
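As a rough sketch of what adding a GSI to an existing table can look like with boto3, the helper below builds the arguments for the `update_table` call. The table name `Orders`, the index name `customer-index`, and the attributes `customer_id` and `order_date` are hypothetical. The sketch also assumes string-typed keys and an on-demand table; a provisioned table would additionally need `ProvisionedThroughput` inside `Create`.

```python
def build_create_gsi_request(table_name, index_name, partition_key, sort_key=None):
    """Build keyword arguments for DynamoDB's update_table call that adds a GSI.

    Assumes string-typed key attributes and an on-demand (PAY_PER_REQUEST) table.
    """
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "S"})

    return {
        "TableName": table_name,
        "AttributeDefinitions": attribute_definitions,
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": key_schema,
                    # Copy every attribute into the index (more storage, simpler reads).
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }


def create_gsi(request):
    """Send the request to DynamoDB; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    return boto3.client("dynamodb").update_table(**request)


# Hypothetical names, for illustration only.
request = build_create_gsi_request("Orders", "customer-index", "customer_id", "order_date")
```

Calling `create_gsi(request)` would submit the change. Keeping the request builder separate from the network call makes it easy to inspect the payload before touching a real table.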
How does DynamoDB create GSIs?
GSIs are created asynchronously in the background. DynamoDB copies (backfills) the data from the base table into the GSI, which can take time depending on how much data there is to copy.
While the data is being copied over, it does not consume any capacity units (RCUs or WCUs) from the base table. The speed of the backfill does, however, depend on the write capacity available to the new index.
An interesting little feature (although inaccurate in my tests) is the approximate creation time shown in the dialog box when creating a GSI.
How can I monitor GSI creation?
DynamoDB publishes CloudWatch metrics that let you monitor the progress of GSI creation. The metric OnlineIndexPercentageProgress reports how far along the index build is, as a percentage.
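A minimal sketch of reading that metric with boto3 might look like the following. It assumes the metric is queried from the `AWS/DynamoDB` CloudWatch namespace with the `TableName` and `GlobalSecondaryIndexName` dimensions; the function names, the 15-minute window, and the table and index names are illustrative choices, not anything prescribed by DynamoDB.

```python
from datetime import datetime, timedelta, timezone


def build_progress_query(table_name, index_name, now=None, window_minutes=15):
    """Build arguments for CloudWatch get_metric_statistics for a GSI build."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "OnlineIndexPercentageProgress",
        "Dimensions": [
            {"Name": "TableName", "Value": table_name},
            {"Name": "GlobalSecondaryIndexName", "Value": index_name},
        ],
        "StartTime": now - timedelta(minutes=window_minutes),
        "EndTime": now,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Maximum"],
    }


def latest_progress(datapoints):
    """Return the most recent percentage from CloudWatch datapoints, or None."""
    if not datapoints:
        return None
    newest = max(datapoints, key=lambda point: point["Timestamp"])
    return newest["Maximum"]


def fetch_progress(table_name, index_name):
    """Query CloudWatch; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without boto3

    response = boto3.client("cloudwatch").get_metric_statistics(
        **build_progress_query(table_name, index_name)
    )
    return latest_progress(response["Datapoints"])
```

For example, `fetch_progress("Orders", "customer-index")` (hypothetical names) would return the latest reported percentage, or `None` if CloudWatch has no datapoints in the window yet.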
Key notes about GSIs
- By default, you can have a maximum of 20 GSIs per table (the limit used to be 5)
- GSIs are eventually consistent; strongly consistent reads on GSIs are not possible.
- GSIs have their own capacity units (RCUs and WCUs), which are not shared with the base table
- GSIs are sparse indexes, which means that if an item in the base table lacks the GSI's Partition Key or Sort Key attribute, the item will not be copied to the GSI.
- This can be used to your advantage, to locate uncommon items in a large dataset.
- You can use a GSI to isolate read activity. For example, creating a GSI with the same key schema as the base table enables you to use the GSI as a read replica with its own capacity units.
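To make the sparse-index and consistency notes above more concrete, here is a small sketch. The first function is a plain-Python simulation of which items would land in a GSI (it does not talk to DynamoDB, and it ignores attribute types), and the second builds arguments for a low-level `query` call against the index. The `Orders` table, the `flagged-index` index, and the `flagged_reason` attribute are all hypothetical.

```python
def project_to_sparse_index(items, partition_key, sort_key=None):
    """Simulate which base-table items would appear in a GSI.

    An item is included only if it has every index key attribute.
    """
    required = [partition_key] + ([sort_key] if sort_key else [])
    return [item for item in items if all(name in item for name in required)]


def build_index_query(table_name, index_name, partition_key, value):
    """Build low-level Query arguments that target a GSI through IndexName.

    ConsistentRead is deliberately omitted (it defaults to False), because
    strongly consistent reads are not available on a GSI.
    """
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "#pk = :value",
        "ExpressionAttributeNames": {"#pk": partition_key},
        "ExpressionAttributeValues": {":value": {"S": value}},
    }


# Hypothetical data: only one order carries the index key attribute.
orders = [
    {"order_id": "1", "status": "shipped"},
    {"order_id": "2", "status": "shipped", "flagged_reason": "fraud-check"},
    {"order_id": "3", "status": "open"},
]
flagged = project_to_sparse_index(orders, "flagged_reason")
query = build_index_query("Orders", "flagged-index", "flagged_reason", "fraud-check")
```

Here `flagged` contains only the order that has a `flagged_reason`, which is the property that makes sparse indexes handy for locating uncommon items. The `query` dictionary could be passed to `boto3.client("dynamodb").query(**query)`, assuming such a table and index exist.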