DynamoDB, Explained
Monday, April 30th, 2012
What is DynamoDB?
DynamoDB is a NoSQL database service offered by Amazon Web Services. It is designed to scale seamlessly in both the amount of data and the read/write request volume. You tell it how many writes per second and how many reads per second you want to be able to handle, and it takes care of partitioning your data across as much hardware as that requires.
It is a key-value store, meaning that the primary way of putting and getting data is by the primary index. There are no secondary indexes (yet?). The primary index is the main key, which can be either a single hash key or a hash key plus a range key. The hash key is what DynamoDB uses to partition your data across machines. Because of this, you should make sure that the read/write request volume is evenly distributed across different hash keys. If you have one hash key that gets a lot of writes, all those writes will go to the same partition and use up all of the write throughput for that partition, even if you have more writes per second available in other partitions.
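To make the hot-key problem concrete, here is a toy simulation in plain Python. It is only a sketch under loud assumptions: the partition count, the throughput number, the even split of throughput across partitions, and the use of MD5 are all made up for illustration, and none of it is DynamoDB's actual internal behavior.

```python
import hashlib
from collections import Counter

# All three numbers are hypothetical, chosen only to make the arithmetic easy.
NUM_PARTITIONS = 4
TABLE_WRITES_PER_SEC = 400
PER_PARTITION_LIMIT = TABLE_WRITES_PER_SEC // NUM_PARTITIONS  # 100


def partition_for(hash_key: str) -> int:
    """Map a hash key to a partition (toy stand-in for the real hashing)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def throttled_writes(writes_by_key: dict) -> int:
    """Count the writes in one second that exceed some partition's share."""
    load = Counter()
    for key, writes in writes_by_key.items():
        load[partition_for(key)] += writes
    return sum(max(0, n - PER_PARTITION_LIMIT) for n in load.values())


# 300 writes/sec against a single hash key: all of them hit one partition.
hot = {"user-1": 300}
# The same 300 writes/sec spread over 300 different hash keys.
spread = {f"user-{i}": 1 for i in range(300)}

hot_throttled = throttled_writes(hot)        # 300 - 100 = 200
spread_throttled = throttled_writes(spread)  # far fewer, since load is spread out
```

In this model the table as a whole has room for 400 writes per second, yet the single hot key loses 200 of its 300 writes because they all land on one partition's share of 100.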
In addition to getting items out of DynamoDB by their key, there are two other ways you can get items: scan and query. Scan is like a full table scan. Every item in the table is looked at. You can filter based on attributes in the item, but the performance will still be based on the total number of items in the table, not the number of items returned. Query retrieves a subset of items from the table based on key. You specify a single hash key and a condition on the range key, such that all the items returned by the query are next to each other in the table. Query performance is based on how many items are returned, not how many are in the table.
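The difference between scan and query is easier to see with a small in-memory model. This is a sketch, not the DynamoDB API: ToyTable and its put, scan, and query methods are hypothetical names, and the "cost" it reports is just a count of items touched.

```python
import bisect


class ToyTable:
    """Hypothetical in-memory model: items grouped by hash key, sorted by range key."""

    def __init__(self):
        # hash_key -> (sorted list of range keys, items in the same order)
        self._groups = {}

    def put(self, hash_key, range_key, item):
        keys, items = self._groups.setdefault(hash_key, ([], []))
        i = bisect.bisect_left(keys, range_key)
        if i < len(keys) and keys[i] == range_key:
            items[i] = item  # same primary key: overwrite
        else:
            keys.insert(i, range_key)
            items.insert(i, item)

    def scan(self, predicate):
        """Look at every item in the table; cost is the table size."""
        examined, matches = 0, []
        for _, items in self._groups.values():
            for item in items:
                examined += 1
                if predicate(item):
                    matches.append(item)
        return matches, examined

    def query(self, hash_key, low, high):
        """Return items for one hash key with low <= range key <= high.

        The matching items sit next to each other, so only they are read.
        """
        keys, items = self._groups.get(hash_key, ([], []))
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        matches = items[start:end]
        return matches, len(matches)


table = ToyTable()
for user in range(10):
    for ts in range(100):
        table.put(f"user-{user}", ts, {"user": f"user-{user}", "ts": ts})

# Same question asked both ways: user-3's items with 10 <= ts <= 19.
scanned, scan_cost = table.scan(
    lambda it: it["user"] == "user-3" and 10 <= it["ts"] <= 19
)
queried, query_cost = table.query("user-3", 10, 19)
```

Both calls return the same 10 items, but the scan touches all 1,000 items in the table while the query touches only the 10 it returns.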
Hopefully that helps! Leave a comment if you have questions.