AWS DynamoDB – What is it and How to create one

DynamoDB is a fast and flexible NoSQL database. It provides consistent, single-digit millisecond latency at any scale. Data is stored in a key-value model, and document formats such as JSON, HTML and XML are supported. DynamoDB is also serverless and integrates very well with AWS Lambda functions, so if you are developing a serverless application, DynamoDB is a good fit as a database.

When to use DynamoDB

DynamoDB is a great fit for mobile, web, gaming, ad tech, IoT and many other applications. As a guideline, DynamoDB is suitable when the following requirements apply:

  • Key-value or simple queries are present.
  • Very high read/write rate is needed.
  • Auto-sharding is required.
  • Auto-scaling is required.
  • Low latency is required.
  • Table size and throughput must not be capped by a fixed limit.
  • High durability is required.

When NOT to use DynamoDB

In summary, DynamoDB is a poor fit when complex queries have to run over huge data sets and/or multiple tables. In bullet points:

  • Large multi-item or cross-table transactions are required (the ACID transactions described further down only cover a limited number of items per call).
  • Complex queries and joins are required.
  • Real-time analytics on historic data is required.

AWS DynamoDB Characteristics

Performance: DynamoDB uses SSD drives as storage.

Resilience: Data is replicated across three geographically distinct data centers.

DynamoDB Consistency Models:

  • Eventually Consistent Reads (default): Consistency across all copies of the data is usually reached within one second, so a read issued right after a write may still return the old value (best read performance).
  • Strongly Consistent Reads: The result reflects all writes that received a successful response before the read (best for read consistency).
  • ACID Transactions: Allow you to execute multiple operations at once, such as inserting and updating items in one or more tables. They have all-or-nothing behaviour: if one of the operations fails, none of them is applied. This is handy, for example, for payment systems.

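Primary Key/Partition Key: DynamoDB stores and retrieves data based on a primary key. In the simplest case the primary key is a single attribute, the partition key, and its value has to be unique for every item. DynamoDB feeds the partition key value into an internal hash function to decide where the item is stored.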

Composite Key: Partition key plus another attribute as a sort key to achieve a unique selection.

{
	"StudentID": 1222,
	"FirstName": "John",
	"LastName": "Doe",
	"Email": "johndoe@gmail.com"
}

In the example above you can see a DynamoDB item containing some student information. The StudentID would be a great fit as a primary key, as it identifies a student uniquely.

{
	"StudentID": 1223,
	"FirstName": "Sarah",
	"LastName": "Banks",
	"CourseName": "AWS_Solutions_Architect",
	"Email": "sarabanks@outlook.com"
}

In this second example, however, there is another attribute called CourseName. Because one student can take part in multiple courses, the StudentID alone is not enough to uniquely identify a student in a course, so you need a key combination. Here it makes sense to combine the partition key StudentID with the sort key CourseName. The composite key would be StudentID + CourseName.
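As a rough sketch, a table with this composite key could be created with the AWS CLI (its setup is covered further down). The table name StudentCourses and the function name are made up for illustration, and the throughput values simply mirror the ones used later in this article. The command is wrapped in a function so that nothing runs until you call it.

```shell
# Sketch: create a table whose primary key is StudentID (partition key)
# plus CourseName (sort key). "StudentCourses" is a hypothetical table name.
create_student_courses_table() {
    aws dynamodb create-table \
        --table-name StudentCourses \
        --attribute-definitions \
            AttributeName=StudentID,AttributeType=N \
            AttributeName=CourseName,AttributeType=S \
        --key-schema \
            AttributeName=StudentID,KeyType=HASH \
            AttributeName=CourseName,KeyType=RANGE \
        --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
}
```

KeyType=HASH marks the partition key and KeyType=RANGE the sort key. Only key attributes have to be declared up front; FirstName, LastName and Email can simply be written with each item.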

Provisioned Throughput in Capacity Units

With provisioned throughput you specify how much data DynamoDB can read and write per second. The units for this are called read capacity units and write capacity units. You specify them when you create a new table, but it is also possible to change the values afterwards.

Write Capacity Units: 1 x write capacity unit = 1 x 1 KB write per second.

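Read Capacity Units: 1 x read capacity unit = 1 x strongly consistent read of up to 4 KB per second OR 2 x eventually consistent reads of up to 4 KB per second. Item sizes are rounded up to the next 4 KB for reads (and to the next 1 KB for writes), so even a tiny strongly consistent read consumes a full capacity unit!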
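To make the arithmetic concrete, here is a small sketch that estimates the required capacity units from a request rate and an item size. The function names and the sample workload are invented for illustration, and the sketch assumes 1 KB = 1024 bytes and a uniform item size.

```shell
# ceil_div A B -> smallest integer greater than or equal to A / B
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

# wcu WRITES_PER_SECOND ITEM_BYTES -> write capacity units (1 KB per unit)
wcu() { echo $(( $1 * $(ceil_div "$2" 1024) )); }

# rcu_strong READS_PER_SECOND ITEM_BYTES -> units for strongly consistent reads (4 KB per unit)
rcu_strong() { echo $(( $1 * $(ceil_div "$2" 4096) )); }

# rcu_eventual READS_PER_SECOND ITEM_BYTES -> eventually consistent reads cost half as much
rcu_eventual() { ceil_div "$(rcu_strong "$1" "$2")" 2; }

wcu 10 2500           # 10 writes/s of 2500-byte items -> 30
rcu_strong 80 6000    # 80 reads/s of 6000-byte items  -> 160
rcu_eventual 80 6000  # same reads, eventually consistent -> 80
```

A 6000-byte item is rounded up to two 4 KB blocks, so each strongly consistent read costs 2 units, and the eventually consistent variant costs half of the total.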

DynamoDB Accelerator (DAX)

DynamoDB Accelerator (DAX) is a fully managed, clustered in-memory cache for DynamoDB. It gives you a massive performance boost, with microsecond response times for millions of requests. However, DAX only improves read performance! According to AWS, DAX delivers up to a 10x read performance improvement.

DAX is ideal for read-heavy applications which require a microsecond response time like auction apps, games or online shops on special promotions like Black Friday.

The performance boost is achieved by caching data. Write operations are stored both in the cache cluster and in DynamoDB itself. Read operations are sent to the DAX cluster first and only go on to DynamoDB if the requested data is not in the cache.

DAX is not suitable for applications that require strongly consistent reads or write-intensive applications.

DynamoDB Pricing Model – On-Demand Capacity vs Provisioned Capacity

On-Demand Capacity is a pricing model for DynamoDB that is more flexible than the alternative Provisioned Capacity model. You are charged for the reads and writes you actually perform and for the storage you use. With the Provisioned Capacity model you define the capacity yourself. On-Demand Capacity is therefore useful if the traffic is unpredictable and/or you prefer a pay-per-use model, whereas the Provisioned Capacity model is preferable if the read and write capacity requirements can be forecast and/or the application traffic is consistent or increases gradually.
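The pricing model is a property of the table and can be changed with update-table. The following is only a sketch: the function names are made up, and the commands are wrapped in functions so that nothing runs until you call them.

```shell
# Switch an existing table to on-demand (pay-per-request) billing.
use_on_demand() {
    aws dynamodb update-table --table-name "$1" --billing-mode PAY_PER_REQUEST
}

# Switch back to provisioned capacity; read and write units must be supplied.
use_provisioned() {
    aws dynamodb update-table --table-name "$1" --billing-mode PROVISIONED \
        --provisioned-throughput "ReadCapacityUnits=$2,WriteCapacityUnits=$3"
}
```

For the Forum table created later in this article, the calls would look like use_on_demand Forum and use_provisioned Forum 5 5.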

DynamoDB Streams

DynamoDB Streams is a time-ordered sequence of item-level modifications such as inserts, updates or deletes. These records are stored for 24 hours.

Streams are suitable for serverless architectures, for example to trigger a Lambda function or archive logs.
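A stream has to be switched on per table. A minimal sketch with the AWS CLI could look like this; the function name is hypothetical, and the view type is just one possible choice.

```shell
# Enable a stream that records each item before and after a modification.
enable_stream() {
    aws dynamodb update-table --table-name "$1" \
        --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES
}
```

Calling enable_stream Forum would turn the stream on for the Forum table created later in this article. Other view types are KEYS_ONLY, NEW_IMAGE and OLD_IMAGE, depending on how much of the item a consumer such as a Lambda function needs to see.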

Creating a DynamoDB

With all of this background information about DynamoDB, it is now time to create a table yourself! Log in to the AWS Console and follow the instructions below.

Create a new IAM User

If you need instructions on how to create an IAM user, please have a look at this article:

https://medium.com/@erwinschleier/identity-and-access-management-iam-78da48f8bb17

Username: dynamodbadmin

Enable programmatic access

Attach existing policies directly: AmazonDynamoDBFullAccess

Store Access key ID and Secret access key

Create EC2 Instance

A guide for creating an EC2 instance is provided here:

https://medium.com/aws-tip/aws-ec2-instance-b17adefba89c

Make sure Auto-assign Public IP is enabled.

In the User data text box of the Advanced Details section, insert:

#!/bin/bash
yum update -y
yum install git -y

Create a new key pair and download it.

SSH into the instance. If you haven’t already set up the AWS CLI, have a look here:

https://medium.com/@erwinschleier/aws-command-line-interface-cli-setup-c6e013813d21

Create a DynamoDB Table

Once you are logged in to your EC2 instance, start by executing the following:

aws configure

Insert your IAM User access key and secret key and specify a region.

aws dynamodb create-table --table-name Forum --attribute-definitions \
AttributeName=Name,AttributeType=S --key-schema \
AttributeName=Name,KeyType=HASH \
--provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
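Table creation is asynchronous, so the table may not be usable the moment create-table returns. As a sketch, the following helper (the function name is made up) blocks until the table exists and then prints its status.

```shell
# Wait until the table has been created, then print its status.
wait_for_table() {
    aws dynamodb wait table-exists --table-name "$1" &&
    aws dynamodb describe-table --table-name "$1" \
        --query 'Table.TableStatus' --output text
}
```

Running wait_for_table Forum before writing any data should print ACTIVE once the table is ready.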

Create a file containing some sample data.

nano items.json

Insert the following content into the file:

{
    "Forum": [
        {
            "PutRequest": {
                "Item": {
                    "Name": {"S": "Amazon DynamoDB"},
                    "Category": {"S": "Amazon Web Services"},
                    "Threads": {"N": "2"},
                    "Messages": {"N": "4"},
                    "Views": {"N": "1000"}
                }
            }
        },
        {
            "PutRequest": {
                "Item": {
                    "Name": {"S": "Amazon S3"},
                    "Category": {"S": "Amazon Web Services"}
                }
            }
        }
    ]
}

Execute the following command to write the file content into DynamoDB:

aws dynamodb batch-write-item --request-items file://items.json

To test it, you can fetch the item Amazon DynamoDB from the table. The --region value has to match the region you specified in aws configure.

aws dynamodb get-item --table-name Forum --region us-east-1 --key '{"Name":{"S":"Amazon DynamoDB"}}'
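The get-item call above performs an eventually consistent read, which is the default. If you need a strongly consistent read instead, the --consistent-read flag can be added. The following is a sketch with a hypothetical function name; it assumes simple forum names without quotes or backslashes, since the name is pasted directly into the JSON key.

```shell
# Fetch one item from the Forum table with a strongly consistent read.
get_forum_strong() {
    aws dynamodb get-item --table-name Forum \
        --key "{\"Name\":{\"S\":\"$1\"}}" \
        --consistent-read
}
```

For example, get_forum_strong "Amazon S3" returns the second sample item. Keep in mind that a strongly consistent read consumes twice the read capacity of an eventually consistent one.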

You can also view the table in the AWS Console DynamoDB service.

Access Management

You can restrict access to DynamoDB with IAM. You can create users within AWS that have specific permissions to access and create DynamoDB tables, and IAM roles that grant temporary access to DynamoDB.

To restrict user access, for example to let a user only read their own data inside a table, you have to use IAM policies. The following policy limits access to a single table:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAccessSpecificTable",
            "Effect": "Allow",
            "Action": [
                "dynamodb:BatchGet*",
                "dynamodb:DescribeStream",
                "dynamodb:DescribeTable",
                "dynamodb:Get*",
                "dynamodb:Query",
                "dynamodb:Scan",
                "dynamodb:BatchWrite*",
                "dynamodb:CreateTable",
                "dynamodb:Delete*",
                "dynamodb:Update*",
                "dynamodb:PutItem"
            ],
            "Resource": "arn:aws:dynamodb:*:*:table/Forum"
        }
    ]
}

In the code snippet you can see the Sid “AllowAccessSpecificTable”, which is the statement identifier. The Action attribute defines which actions are allowed. The important part is the “Resource” section, where you define which table is targeted. For simplicity, the region and the account ID are wildcards here (the stars), but it is recommended to replace them with the actual values from the table ARN. Just attach this policy to a user and it becomes active.
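Attaching the policy can also be done from the command line. This is only a sketch under a few assumptions: the policy has been saved to a local file, the function name and any user or file names you pass in are your own, and the credentials in use are allowed to manage IAM (the dynamodbadmin user from earlier only has DynamoDB permissions).

```shell
# Attach the policy file as an inline policy to an IAM user.
# $1 = IAM user name, $2 = path to the JSON policy file
attach_forum_policy() {
    aws iam put-user-policy \
        --user-name "$1" \
        --policy-name AllowAccessSpecificTable \
        --policy-document "file://$2"
}
```

A call could look like attach_forum_policy forum-user forum-policy.json, where both names are placeholders.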

Querying DynamoDB

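There are two ways to read multiple items from DynamoDB: Query and Scan.

  • Query: Finds items by primary key (the partition key, optionally narrowed down by the sort key). Results are always sorted by the sort key, in ascending order by default; setting ScanIndexForward to false reverses the order. Much more efficient than a scan.
  • Scan: Always reads the entire table. Filters are only applied after the items have been read, so they shrink the result but not the work.

Avoid scans if you can! Queries are much more efficient. Another way to improve performance is to reduce the page size, which means performing more, smaller operations instead of one or a few huge ones. This can help to avoid throttling.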
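The difference can be sketched against the Forum table from this article. The function names are invented for illustration, and the values are pasted directly into the JSON, so the sketch assumes they contain no quotes or backslashes. Attribute names are passed through placeholders (#n, #c) to avoid clashes with DynamoDB reserved words such as Name.

```shell
# Query: fetch the items with a given partition key value.
query_forum() {
    aws dynamodb query --table-name Forum \
        --key-condition-expression '#n = :name' \
        --expression-attribute-names '{"#n":"Name"}' \
        --expression-attribute-values "{\":name\":{\"S\":\"$1\"}}"
}

# Scan: read the whole table and filter afterwards,
# fetching at most $2 items per request to keep each call small.
scan_forum_by_category() {
    aws dynamodb scan --table-name Forum \
        --filter-expression '#c = :cat' \
        --expression-attribute-names '{"#c":"Category"}' \
        --expression-attribute-values "{\":cat\":{\"S\":\"$1\"}}" \
        --page-size "$2"
}
```

For example, query_forum "Amazon DynamoDB" touches a single item, while scan_forum_by_category "Amazon Web Services" 25 still reads every item in the table. On a table with a sort key, adding --no-scan-index-forward to the query returns the results in descending order.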

Commonly Used Commands

Command         Description
create-table    Creates a new table
put-item        Adds a new item to a table or replaces an existing one
get-item        Returns a set of attributes for the item with the given primary key
update-item     Edits the attributes of an existing item (including adding or deleting attributes)
update-table    Modifies a table
list-tables     Returns a list of the tables in your account
describe-table  Returns information about a table
scan            Reads every item in a table and returns all items and attributes. Use FilterExpression to return fewer items
query           Queries a table based on a partition key
delete-item     Deletes an item based on its primary key
delete-table    Deletes a table including all of its items

If you are interested in more details about DynamoDB commands, have a look at the official documentation:

https://awscli.amazonaws.com/v2/documentation/api/latest/reference/dynamodb/index.html

A lot of topics were covered here, so I guess you will need to read through it multiple times to really understand the concepts. However, if something is still unclear, feel free to ask for more details. And if you liked the article, feel free to like it and follow my blog for further upcoming articles. Cheers!
