DynamoDB is a fast and flexible NoSQL database. It provides consistent, single-digit millisecond latency at any scale. Data is stored in a key-value model, and DynamoDB also supports a document model in which items hold nested, JSON-like structures. DynamoDB is serverless and integrates very well with AWS Lambda functions, so if you are developing a serverless application, DynamoDB is a good fit as a database.
When to use DynamoDB
DynamoDB is a great fit for mobile, web, gaming, ad tech, IoT and many other applications. As a guideline, DynamoDB is suitable when the following requirements apply:
- Access patterns are key-value lookups or simple queries.
- Very high read/write rate is needed.
- Auto-sharding is required.
- Auto-scaling is required.
- Low latency is required.
- Table size and throughput must be able to grow without a practical limit.
- High durability is required.
When NOT to use DynamoDB
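You can summarize the cases where DynamoDB is a poor fit as complex queries over huge data sizes and/or multiple tables. In bullet points:
- Large multi-item or cross-table transactions are central to the workload (DynamoDB does offer transactions, see the consistency models below, but they are limited in how many items one transaction can touch).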
- Complex queries and joins are required.
- Real-time analytics on historic data is required.
AWS DynamoDB Characteristics
Performance: DynamoDB stores its data on SSDs.
Resilience: Data is replicated across three geographically distinct data centers within an AWS Region.
DynamoDB Consistency Models:
- Eventually Consistent Reads (default): All copies of the data usually become consistent within one second, so a read made right after a write may still return the old value (best read performance).
- Strongly Consistent Reads: A read returns a result that reflects all writes that received a successful response before the read (best for read consistency).
- ACID Transactions: Allow you to execute multiple operations as one unit, like inserting and updating items in one or multiple tables. They have an all-or-nothing behaviour, so if one of the operations fails, none of them is applied. This is handy, for example, for payment systems.
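The read models above are chosen per request. As a rough illustration, the sketch below only builds the arguments for a GetItem call and sends nothing to AWS. `ConsistentRead` is the actual request parameter; the helper name `build_get_item_request` and the table name `Students` are assumptions made up for this example.

```python
def build_get_item_request(table_name, key, strongly_consistent=False):
    """Return keyword arguments for a DynamoDB GetItem call.

    Reads are eventually consistent unless ConsistentRead is True.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


# Hypothetical table name and key, for illustration only.
request = build_get_item_request(
    "Students", {"StudentID": {"N": "1222"}}, strongly_consistent=True
)
print(request)
```

With boto3, a dictionary like this could be passed on as `client.get_item(**request)`.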
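Primary Key/Partition Key: DynamoDB stores and retrieves data based on a primary key. In the simplest case the primary key is a single attribute, the partition key, and its value has to be unique for each item. DynamoDB feeds the partition key value into an internal hash function to decide in which partition the item is stored.
Composite Key: Partition key plus another attribute as a sort key. Several items may share the same partition key value as long as the combination of partition key and sort key is unique.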
{
"StudentID": 1222,
"FirstName": "John",
"LastName": "Doe",
"Email": "johndoe@gmail.com"
}
In the example above you can see a DynamoDB item containing some student information. The StudentID would be a great fit as a primary key, as it can identify a student uniquely.
{
"StudentID": 1223,
"FirstName": "Sarah",
"LastName": "Banks",
"CourseName": "AWS_Solutions_Architect",
"Email": "sarabanks@outlook.com"
}
However, this second example contains another attribute called CourseName. Because one student can take part in multiple courses, the StudentID alone would not be enough to identify a student in a course uniquely, so you need a key combination. In this case, it would be a good fit to use StudentID as the partition key and CourseName as the sort key. The composite key would be StudentID + CourseName.
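To make the composite key more concrete, here is a minimal sketch of how such a table definition could look as request arguments. `AttributeDefinitions`, `KeySchema`, `HASH` and `RANGE` are real CreateTable fields; the table name `StudentCourses`, the helper names and the throughput values are assumptions for illustration. Nothing is sent to AWS.

```python
def composite_key_table_definition(table_name):
    """Build CreateTable arguments for a partition key plus a sort key."""
    return {
        "TableName": table_name,
        # Only key attributes are declared; all other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": "StudentID", "AttributeType": "N"},
            {"AttributeName": "CourseName", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "StudentID", "KeyType": "HASH"},    # partition key
            {"AttributeName": "CourseName", "KeyType": "RANGE"},  # sort key
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


def item_key(student_id, course_name):
    """Build the full primary key that identifies one student in one course."""
    return {
        "StudentID": {"N": str(student_id)},
        "CourseName": {"S": course_name},
    }


definition = composite_key_table_definition("StudentCourses")  # hypothetical name
key = item_key(1223, "AWS_Solutions_Architect")
print(definition["KeySchema"])
print(key)
```

The same student can now appear once per course, because two items only collide when both StudentID and CourseName match.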
Provisioned Throughput in Capacity Units
With provisioned throughput you specify how much data you can read and write per second. The units for that are called read capacity units and write capacity units. You specify them when you create a new table, but it is also possible to change the values afterwards.
Write Capacity Units: 1 x write capacity unit = 1 x write of up to 1 KB per second.
Read Capacity Units: 1 x read capacity unit = 1 x strongly consistent read of up to 4 KB per second OR 2 x eventually consistent reads of up to 4 KB per second. Item sizes are rounded up to the next full unit, so even a very small item counts as a full 4 KB read or 1 KB write.
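The arithmetic behind these units can be sketched in a few lines. This is a simplified calculator based only on the rules stated above (1 KB per write unit, 4 KB per read unit, half cost for eventually consistent reads); the function names are made up for this example, and it ignores things like transactions or indexes.

```python
import math


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed: item size is rounded up to whole 1 KB blocks."""
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return units_per_write * writes_per_second


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed: item size is rounded up to whole 4 KB blocks."""
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        total = math.ceil(total / 2)
    return total


print(write_capacity_units(3 * 1024, 10))                            # 30
print(read_capacity_units(6 * 1024, 10))                             # 20
print(read_capacity_units(6 * 1024, 10, strongly_consistent=False))  # 10
```

So ten writes per second of a 3 KB item need 30 write capacity units, while ten strongly consistent reads per second of a 6 KB item need 20 read capacity units, or 10 if eventual consistency is acceptable.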
DynamoDB Accelerator (DAX)
DynamoDB Accelerator (DAX) is a fully managed, clustered in-memory cache for DynamoDB. It gives you a massive performance boost, where you can achieve microsecond performance for millions of requests. However, DAX only speeds up read operations! According to AWS, DAX delivers up to a 10x read performance improvement.
DAX is ideal for read-heavy applications which require a microsecond response time like auction apps, games or online shops on special promotions like Black Friday.
The performance boost is achieved by caching data. DAX works as a write-through cache: a write operation is stored in the cache cluster and in the DynamoDB table itself. Read operations are sent to the DAX cluster first, and only if the item is not in the cache does DAX fetch it from DynamoDB.
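The caching idea can be illustrated with a toy model. The class below is not how DAX is implemented and does not use any AWS API; it is a plain in-memory sketch of a write-through cache in front of a backing store, with made-up names, and it leaves out real-world concerns such as TTLs and eviction.

```python
class WriteThroughCache:
    """Toy write-through cache in front of a backing store (not DAX itself)."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the DynamoDB table
        self.cache = {}                     # stands in for the cache cluster
        self.hits = 0
        self.misses = 0

    def put(self, key, item):
        # Write-through: store the item in the table and in the cache.
        self.backing_store[key] = item
        self.cache[key] = item

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Cache miss: fall back to the table and remember the result.
        self.misses += 1
        item = self.backing_store.get(key)
        if item is not None:
            self.cache[key] = item
        return item


table = {"Amazon S3": {"Category": "Amazon Web Services"}}
dax = WriteThroughCache(table)
dax.put("Amazon DynamoDB", {"Threads": 2})
dax.get("Amazon DynamoDB")  # served from the cache
dax.get("Amazon S3")        # miss, fetched from the table
dax.get("Amazon S3")        # now served from the cache
print(dax.hits, dax.misses)  # 2 1
```

Every repeated read is answered from memory, which is why the benefit grows with the share of reads in the workload.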
DAX is not suitable for applications that require strongly consistent reads or write-intensive applications.
DynamoDB Pricing Model – On-Demand Capacity vs Provisioned Capacity
On-Demand Capacity is a pricing model for DynamoDB. It is more flexible than the alternative Provisioned Capacity model: you are charged for the reads, writes and storage you actually use. With the Provisioned Capacity model you define the read and write capacity yourself. On-Demand Capacity is therefore useful if the traffic is unpredictable and/or you prefer a pay-per-use model, whereas the Provisioned Capacity model is preferable if the read and write capacity requirements can be forecast and/or the application traffic is consistent or increases gradually.
DynamoDB Streams
DynamoDB Streams is a time-ordered sequence of item-level modifications like inserts, updates or deletes. These records are stored for 24 hours.
Streams are suitable for serverless architectures, for example to trigger a Lambda function or archive logs.
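A Lambda function triggered by a stream receives the modifications as a batch of records. The handler below is a minimal sketch that assumes the usual event shape (a `Records` list whose entries carry `eventName` and a `dynamodb` section with `Keys`); the sample event is hand-written for illustration and the returned summary format is an assumption of this example.

```python
def handler(event, context):
    """Count stream records by event type and collect keys of inserted items."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    inserted_keys = []
    for record in event.get("Records", []):
        event_name = record["eventName"]
        counts[event_name] = counts.get(event_name, 0) + 1
        if event_name == "INSERT":
            inserted_keys.append(record["dynamodb"]["Keys"])
    return {"counts": counts, "inserted_keys": inserted_keys}


# Hand-written sample event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {"Keys": {"Name": {"S": "Amazon DynamoDB"}}},
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"Name": {"S": "Amazon S3"}}},
        },
    ]
}

result = handler(sample_event, None)
print(result)
```

An archiving function would follow the same loop and write each record to its destination instead of counting it.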
Creating a DynamoDB
With all of this background information about DynamoDB, it is now time to create a table yourself! So log in to the AWS Console and follow the instructions.
Create a new IAM User
If you need instructions on how to create an IAM user, please have a look at the article:
https://medium.com/@erwinschleier/identity-and-access-management-iam-78da48f8bb17.
Username: dynamodbadmin
Enable programmatic access
Attach existing policies directly: AmazonDynamoDBFullAccess
Store Access key ID and Secret access key
Create EC2 Instance
A guide for creating an EC2 instance is provided here:
https://medium.com/aws-tip/aws-ec2-instance-b17adefba89c
Make sure Auto-assign Public IP is enabled
In the Advanced Details section text box, insert:
#!/bin/bash
yum update -y
yum install git -y
Create a new key pair and download it.
SSH into the instance. If you haven’t set up the AWS CLI yet, have a look here:
https://medium.com/@erwinschleier/aws-command-line-interface-cli-setup-c6e013813d21
Create a DynamoDB Table
Once you are logged in to your EC2 instance, start by executing the following:
aws configure
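Enter your IAM user's access key ID and secret access key and specify a region. The get-item command further down uses us-east-1, so either choose that region here or adjust the command accordingly.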
aws dynamodb create-table --table-name Forum --attribute-definitions \
AttributeName=Name,AttributeType=S --key-schema \
AttributeName=Name,KeyType=HASH \
--provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
Create a file containing some sample data.
nano items.json
Insert the following content into the file:
{
"Forum": [
{
"PutRequest": {
"Item": {
"Name": {"S":"Amazon DynamoDB"},
"Category": {"S":"Amazon Web Services"},
"Threads": {"N":"2"},
"Messages": {"N":"4"},
"Views": {"N":"1000"}
}
}
},
{
"PutRequest": {
"Item": {
"Name": {"S":"Amazon S3"},
"Category": {"S":"Amazon Web Services"}
}
}
}
]
}
Execute the following command to write the file content into DynamoDB:
aws dynamodb batch-write-item --request-items file://items.json
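A single batch-write-item call accepts at most 25 put or delete requests, so larger data sets have to be split up. The following sketch shows one way to generate request payloads in the same shape as items.json; the helper name `chunk_put_requests` and the generated sample items are assumptions for illustration, and no call to AWS is made.

```python
import json


def chunk_put_requests(table_name, items, batch_size=25):
    """Yield request-items payloads with at most batch_size PutRequests each."""
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        yield {table_name: [{"PutRequest": {"Item": item}} for item in batch]}


# 60 made-up items in DynamoDB's typed JSON format.
items = [{"Name": {"S": f"Forum {i}"}} for i in range(60)]
payloads = list(chunk_put_requests("Forum", items))

print([len(payload["Forum"]) for payload in payloads])  # [25, 25, 10]
print(json.dumps(payloads[2]["Forum"][0]))
```

Each payload could be written to its own file and passed to the command above. A response may also contain UnprocessedItems, which should be sent again.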
To test it, you can fetch the item Amazon DynamoDB from the table.
aws dynamodb get-item --table-name Forum --region us-east-1 --key '{"Name":{"S":"Amazon DynamoDB"}}'
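The item comes back in DynamoDB's typed JSON format, where every value is wrapped in a type tag such as "S" or "N". A small converter like the one below can turn that into plain values. It is a sketch that covers only a handful of type tags, and the sample response is hand-written to match the item inserted earlier rather than captured from a real call. If you use boto3, its TypeDeserializer class serves the same purpose.

```python
def to_number(text):
    """Parse a DynamoDB number, which is transported as a string."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def from_dynamodb_json(attribute):
    """Convert one typed attribute value, e.g. {"N": "2"}, to a Python value."""
    type_tag, value = next(iter(attribute.items()))
    if type_tag == "S":
        return value
    if type_tag == "N":
        return to_number(value)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb_json(element) for element in value]
    if type_tag == "M":
        return {name: from_dynamodb_json(inner) for name, inner in value.items()}
    raise ValueError(f"unsupported type tag: {type_tag}")


# Hand-written response in the shape get-item returns.
response = {
    "Item": {
        "Name": {"S": "Amazon DynamoDB"},
        "Category": {"S": "Amazon Web Services"},
        "Threads": {"N": "2"},
        "Messages": {"N": "4"},
        "Views": {"N": "1000"},
    }
}

plain = {name: from_dynamodb_json(value) for name, value in response["Item"].items()}
print(plain)
```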
You can also view the table in the AWS Console DynamoDB service.
Access Management
You can restrict access to DynamoDB with IAM: you create users within AWS with specific permissions to access and create DynamoDB tables, and you use IAM roles to grant temporary access to DynamoDB.
To restrict what a user may do, for example limiting them to a single table or, with additional policy conditions, to their own data inside a table, you have to use IAM policies. The following policy allows read and write access to the Forum table only:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowAccessSpecificTable",
"Effect": "Allow",
"Action": [
"dynamodb:BatchGet*",
"dynamodb:DescribeStream",
"dynamodb:DescribeTable",
"dynamodb:Get*",
"dynamodb:Query",
"dynamodb:Scan",
"dynamodb:BatchWrite*",
"dynamodb:CreateTable",
"dynamodb:Delete*",
"dynamodb:Update*",
"dynamodb:PutItem"
],
"Resource": "arn:aws:dynamodb:*:*:table/Forum"
}
]
}
In the code snippet you can see the Sid “AllowAccessSpecificTable”, which is the statement identifier. The Action attribute defines which actions are allowed. The important part is the “Resource” section, where you define which table is targeted. The two stars in the ARN stand for the region and the account ID. For simplicity they match every region and account here, but it is recommended to replace them with the actual values from the table's ARN. Just attach this policy to a user and it becomes active.
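To show what replacing the stars looks like, the sketch below assembles a table ARN and a narrower, read-only variant of the policy. The ARN layout and the action names are standard; the helper names, the Sid and the account ID 123456789012 (a placeholder) are assumptions for illustration only.

```python
import json


def table_arn(region, account_id, table_name):
    """Build the ARN of a DynamoDB table."""
    return f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"


def read_only_table_policy(region, account_id, table_name):
    """Build a policy document that only allows reading one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadSpecificTable",
                "Effect": "Allow",
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:BatchGetItem",
                    "dynamodb:Query",
                    "dynamodb:Scan",
                    "dynamodb:DescribeTable",
                ],
                "Resource": table_arn(region, account_id, table_name),
            }
        ],
    }


# Placeholder region and account ID.
policy = read_only_table_policy("us-east-1", "123456789012", "Forum")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM policy editor after swapping in your own region and account ID.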
Querying DynamoDB
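There are two ways to read multiple items from DynamoDB: Queries and Scans.
- Query: Finds items by partition key value, optionally narrowed down by a condition on the sort key. Results are sorted by sort key in ascending order by default; set ScanIndexForward to false to get descending order. Much more efficient than scans.
- Scan: Always reads the entire table. Filters are applied only after the items have been read, so a filtered scan still consumes capacity for the whole table.
Avoid Scans if you can! Queries are much more efficient. Another performance improvement can be to reduce the page size. This means performing more, smaller operations instead of one or a few huge ones, which can help avoid throttling.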
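Reading in small pages can be sketched as a loop that follows the continuation key. `Limit`, `ScanIndexForward`, `LastEvaluatedKey` and `ExclusiveStartKey` are the real request and response fields; everything else here is an assumption for illustration: the helper `query_all_pages`, the table name `StudentCourses`, and `fake_query`, a stand-in that pages over a hard-coded list, uses a simplified continuation key, and ignores the key condition and sort direction.

```python
def query_all_pages(query_fn, **params):
    """Call query_fn repeatedly, following LastEvaluatedKey until it is absent."""
    items = []
    while True:
        page = query_fn(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        params["ExclusiveStartKey"] = last_key


def fake_query(**params):
    """Stand-in for a DynamoDB client's query method, for local testing only."""
    names = ("AWS", "Go", "Python", "Rust", "SQL")
    data = [{"CourseName": {"S": name}} for name in names]
    limit = params.get("Limit", len(data))
    start = params.get("ExclusiveStartKey", {"index": 0})["index"]
    page = {"Items": data[start:start + limit]}
    if start + limit < len(data):
        page["LastEvaluatedKey"] = {"index": start + limit}
    return page


items = query_all_pages(
    fake_query,
    TableName="StudentCourses",  # hypothetical table
    KeyConditionExpression="StudentID = :id",
    ExpressionAttributeValues={":id": {"N": "1223"}},
    ScanIndexForward=False,  # would request descending sort key order
    Limit=2,                 # page size: at most two items per request
)
print(len(items))  # 5
```

With boto3, `client.query` could be passed as `query_fn` in place of the stand-in.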
Commonly Used Commands
Command | Description |
---|---|
create-table | Creates a new table |
put-item | Adds a new item into a table or replaces one |
get-item | Returns a set of attributes for an item with the given primary key |
update-item | Allows you to edit the attributes of an existing item (also adding or deleting attributes) |
update-table | Modifies the settings of a table |
list-tables | Returns a list of tables from your account |
describe-table | Returns information about the table |
scan | Reads every item in a table and returns all items and attributes. Use FilterExpression to return fewer items |
query | Finds items based on a partition key value and, optionally, a sort key condition |
delete-item | Deletes an item based on a primary key |
delete-table | Deletes a table including all of its items |
If you are interested in more details about DynamoDB commands, have a look at the official documentation:
https://awscli.amazonaws.com/v2/documentation/api/latest/reference/dynamodb/index.html
A lot of topics were covered here, so I guess you will need to read through it multiple times to really understand the concepts. However, if something is still unclear, feel free to ask for more details. And if you liked the article, feel free to like it and follow my blog for further upcoming articles. Cheers!