Choosing the Right DynamoDB Partition Key
Choosing the right partition key is essential in designing and building scalable and reliable applications on top of DynamoDB.
What is a partition key?
DynamoDB supports two types of primary keys:
- Partition key: A simple primary key composed of one attribute known as the partition key. Attributes in DynamoDB are similar in many ways to fields or columns in other database systems.
- Partition key and sort key: Referred to as a composite primary key, this type comprises two attributes. The first attribute is the partition key, and the second is the sort key. All items that share a partition key are sorted by the sort key value. The following is an example.
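As a minimal sketch of a composite primary key, the dictionary below follows the parameter shape of the DynamoDB CreateTable API, where `HASH` marks the partition key and `RANGE` marks the sort key. The table name `Orders`, the attribute names `CustomerId` and `OrderDate`, and the helper `key_roles` are hypothetical and chosen only for illustration.

```python
# Hypothetical table: orders partitioned by customer and sorted by order date.
# The parameter shape follows the CreateTable API; every name here is made up.
create_table_params = {
    "TableName": "Orders",
    "AttributeDefinitions": [
        {"AttributeName": "CustomerId", "AttributeType": "S"},
        {"AttributeName": "OrderDate", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "CustomerId", "KeyType": "HASH"},  # partition key
        {"AttributeName": "OrderDate", "KeyType": "RANGE"},  # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",  # on-demand capacity mode
}


def key_roles(params):
    """Return (partition_key, sort_key); sort_key is None for a simple primary key."""
    roles = {entry["KeyType"]: entry["AttributeName"] for entry in params["KeySchema"]}
    return roles["HASH"], roles.get("RANGE")


partition_key, sort_key = key_roles(create_table_params)
```

If boto3 is installed and credentials are configured, these parameters could be passed as `boto3.client("dynamodb").create_table(**create_table_params)`. Dropping the `RANGE` entry (and its attribute definition) would give a simple primary key instead.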
Why do I need a partition key?
DynamoDB stores data as groups of attributes known as items. Items are similar to rows or records in other database systems. DynamoDB stores and retrieves each item based on the primary key value, which must be unique. Items are distributed across 10-GB storage units, called partitions (physical storage internal to DynamoDB). Each table has one or more partitions, as shown in the illustration. See Partitions and Data Distribution in the DynamoDB Developer Guide for more information.
DynamoDB passes the partition key’s value to an internal hash function. The output of the hash function determines the partition in which the item is stored, so items with the same partition key value always map to the same location.
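The idea of hashing a key to pick a partition can be sketched as follows. This is an illustration only: DynamoDB's hash function and partition layout are internal, so MD5, the modulo step, and the function name `partition_for` are assumptions standing in for the real mechanism.

```python
import hashlib


def partition_for(partition_key: str, num_partitions: int) -> int:
    """Map a partition key value to a partition index.

    MD5 and the modulo step are stand-ins; DynamoDB's real hash is not public.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions


# The same key always lands on the same partition; different keys may spread out.
placements = {key: partition_for(key, 4) for key in ("user#1", "user#2", "user#3")}
```

The property that matters is determinism: the placement depends only on the partition key value, which is why a key with few distinct values cannot spread load across many partitions.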
In most cases, all items with the same partition key are stored together in a collection, which we define as a group of items with the same partition key but different sort keys. For tables with composite primary keys, the sort key may be used as a partition boundary. DynamoDB splits partitions by sort key if the collection size grows larger than 10 GB.
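To make the notion of a collection concrete, here is a small in-memory sketch that groups items by partition key and orders each group by sort key. The sample items, the attribute names, and the helper `item_collections` are hypothetical; the code models the logical grouping only, not how DynamoDB stores data physically.

```python
from collections import defaultdict

# Hypothetical items keyed by CustomerId (partition key) and OrderDate (sort key).
items = [
    {"CustomerId": "c1", "OrderDate": "2024-03-01", "Total": 40},
    {"CustomerId": "c2", "OrderDate": "2024-01-15", "Total": 15},
    {"CustomerId": "c1", "OrderDate": "2024-01-09", "Total": 25},
]


def item_collections(items, partition_key, sort_key):
    """Group items by partition key value and sort each group by sort key value."""
    collections = defaultdict(list)
    for item in items:
        collections[item[partition_key]].append(item)
    for group in collections.values():
        group.sort(key=lambda item: item[sort_key])
    return dict(collections)


collections = item_collections(items, "CustomerId", "OrderDate")
```

Here `collections["c1"]` holds both orders for customer `c1`, oldest first, which mirrors how a query on one partition key returns items in sort key order.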
Partition keys and request throttling
DynamoDB automatically supports your access patterns using the throughput you have provisioned, or up to your account limits in on-demand mode. Regardless of the capacity mode you choose, if your access pattern exceeds 3,000 read capacity units (RCU) or 1,000 write capacity units (WCU) for a single partition key value, your requests might be throttled with a ProvisionedThroughputExceededException error.
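A rough way to check a single key against these limits is to estimate the capacity units its traffic consumes. The sketch below assumes the standard unit definitions (one RCU covers one strongly consistent read per second of an item up to 4 KB, eventually consistent reads cost half, and one WCU covers one write per second of an item up to 1 KB). The function names are hypothetical, and transactional requests are ignored.

```python
import math

MAX_RCU_PER_KEY = 3000  # per-partition-key read limit quoted above
MAX_WCU_PER_KEY = 1000  # per-partition-key write limit quoted above


def read_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate RCU consumed per second; sizes round up to the next 4 KB."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    if not strongly_consistent:
        units_per_read /= 2  # eventually consistent reads cost half
    return units_per_read * reads_per_second


def write_units(item_size_bytes, writes_per_second):
    """Estimate WCU consumed per second; sizes round up to the next 1 KB."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


def exceeds_per_key_limit(rcu, wcu):
    """True if traffic to one partition key value is above either limit."""
    return rcu > MAX_RCU_PER_KEY or wcu > MAX_WCU_PER_KEY


# Example: 600 writes per second of 1,500-byte items to the same key.
hot_key_wcu = write_units(1500, 600)
at_risk = exceeds_per_key_limit(0, hot_key_wcu)
```

In this example each write rounds up to 2 WCU, so a rate that looks modest in requests per second still crosses the per-key write limit.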
Reading or writing above the limit can be caused by these issues:
- Uneven distribution of data due to the wrong choice of partition key
- Frequent access to the same key in a partition (the most popular item, also known as a hot key)
- A request rate greater than the provisioned throughput or on-demand account limits
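One simple way to spot the second cause is to count how often each partition key value appears in a sample of requests. The sketch below assumes you already have such a sample as a list of key values; the helper `hottest_keys` and the sample data are hypothetical.

```python
from collections import Counter


def hottest_keys(accessed_keys, top_n=3):
    """Return the top_n keys with their share of all sampled requests."""
    counts = Counter(accessed_keys)
    total = sum(counts.values())
    return [(key, count / total) for key, count in counts.most_common(top_n)]


# Hypothetical sample: one product receives most of the traffic.
access_log = ["sku#42"] * 8 + ["sku#7", "sku#9"]
top = hottest_keys(access_log, top_n=1)
```

A key that takes a large share of requests in such a sample is a candidate hot key and a hint that the partition key does not spread the workload evenly.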
To avoid request throttling, design your DynamoDB table with the correct partition key to meet your access requirements and provide even data distribution.
This blog post covers essential considerations and strategies for choosing the right partition key for designing a schema that uses Amazon DynamoDB.