Amazon SimpleDB provides a simple web services interface to create and store multiple data sets, query your data easily, and return the results. Your data is automatically indexed, making it easy to quickly find the information that you need. There is no need to pre-define a schema or change a schema if new data is added later. And scale-out is as simple as creating new domains, rather than building out new servers.
To use Amazon SimpleDB you:
- Build your data set
- Choose a Region for your Domain(s) to optimize for latency, minimize costs, or address regulatory requirements. Amazon SimpleDB is currently available in the US East (Northern Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), and South America (Sao Paulo) Regions.
- Use CreateDomain, DeleteDomain, ListDomains, DomainMetadata to create and manage query domains
- Use Put, Batch Put, & Delete to create and manage the data set within each query domain
- Retrieve your data
- Use GetAttributes to retrieve a specific item
- Use Select to query your data set for items that meet specified criteria
- Pay only for the resources that you consume
The data model used by Amazon SimpleDB makes it easy to store, manage and query your structured data. Developers organize their data-set into domains and can run queries across all of the data stored in a particular domain. Domains are collections of items that are described by attribute-value pairs.
Think of these terms as analogous to concepts in a traditional spreadsheet table. For example, take the details of a customer management database shown in the table below and consider how they would be represented in Amazon SimpleDB. The whole table would be your domain named “customers.” Individual customers would be rows in the table or items in your domain. The contact information would be described by column headers (attributes). Values are in individual cells. Now imagine the records below are new customers you would like to add to your domain.
CustomerID | First name | Last name | Street address | City | State | Zip | Telephone |
---|---|---|---|---|---|---|---|
123 | Bob | Smith | 123 Main St | Springfield | MO | 65801 | 222-333-4444 |
456 | James | Johnson | 456 Front St | Seattle | WA | 98104 | 333-444-5555 |
In Amazon SimpleDB, to add the records above, you would PUT the CustomerIDs into your domain along with the attribute-value pairs for each of the customers. Without the specific syntax, it would look something like this:
PUT (item, 123), (First name, Bob), (Last name, Smith), (Street address, 123 Main St.), (City, Springfield), (State, MO), (Zip, 65801), (Telephone, 222-333-4444) PUT (item, 456), (First name, James), (Last name, Johnson), (Street address, 456 Front St.), (City, Seattle), (State, WA), (Zip, 98104), (Telephone, 333-444-5555)
Amazon SimpleDB differs from tables of traditional databases in important ways. You have the flexibility to easily go back later on and add new attributes that only apply to certain records. For example, imagine you begin to capture your customers’ email addresses to enable real-time alerts on order status. Rather than rebuilding your “customer” table, re-writing your queries, rebuilding your indices, and so on, you would simply add the new records and any additional attributes to your existing “customers” domain. The resulting domain might look something like this:
CustomerID | First name | Last name | Street address | City | State | Zip | Telephone | |
---|---|---|---|---|---|---|---|---|
123 | Bob | Smith | 123 Main St | Springfield | MO | 65801 | 222-333-4444 | |
456 | James | Johnson | 456 Front St | Seattle | WA | 98104 | 333-444-5555 | |
789 | Deborah | Thomas | 789 Garfield | New York | NY | 10001 | 444-555-6666 | dthomas@xyz.com |
Amazon SimpleDB provides a small number of simple API calls which implement writing, indexing and querying data. The interface and feature set are intentionally focused on core functionality, providing a basic API for developers to build upon and making the service easy to learn and simple to use.
- CreateDomain — Create a domain that contains your dataset.
- DeleteDomain — Delete a domain.
- ListDomains — List all domains.
- DomainMetadata — Retrieve information about creation time for the domain, storage information both as counts of item names and attributes, as well as total size in bytes.
- PutAttributes — Add or update an item and its attributes, or add attribute-value pairs to items that exist already. Items are automatically indexed as they are received.
- BatchPutAttributes — For greater overall throughput of bulk writes, perform up to 25 PutAttribute operations in a single call.
- DeleteAttributes — Delete an item, an attribute, or an attribute value.
- BatchDeleteAttributes — For greater overall throughput of bulk deletes, perform up to 25 DeleteAttributes operations in a single call.
- GetAttributes — Retrieve an item and all or a subset of its attributes and values.
- Select — Query the data set in the familiar, “select target from domain_name where query_expression” syntax. Supported value tests are: =, !=, =, like, not like, between, is null, is not null, and every (). Example: select * from mydomain where every(keyword) = ‘Book’. Order results using the SORT operator, and count items that meet the condition(s) specified by the predicate(s) in a query using the Count operator.
Note: Amazon SimpleDB has been integrated with AWS Identity and Access Management to enable fine-grained control over Amazon SimpleDB resources. Through integration with AWS Identity and Access Management, an AWS Account signed up to use SimpleDB can create multiple Users. In turn, these Users may be granted SimpleDB API level permissions to access the SimpleDB domains owned by the AWS Account. See the AWS Identity and Access Management detail page for additional details.
Amazon SimpleDB stores multiple geographically distributed copies of each domain to enable high availability and data durability. A successful write (using PutAttributes, BatchPutAttributes, DeleteAttributes, BatchDeleteAttributes, CreateDomain or DeleteDomain) means that all copies of the domain will durably persist. Amazon SimpleDB supports two read consistency options: eventually consistent reads and consistent reads.
- Eventually Consistent Reads (Default) — the eventual consistency option maximizes your read performance (in terms of low latency and high throughput). However, an eventually consistent read (using Select or GetAttributes) might not reflect the results of a recently completed write (using PutAttributes, BatchPutAttributes, DeleteAttributes, BatchDeleteAttributes). Consistency across all copies of data is usually reached within a second; repeating a read after a short time should return the updated data.
- Consistent Reads — in addition to eventual consistency, Amazon SimpleDB also gives you the flexibility and control to request a consistent read if your application, or an element of your application, requires it. A consistent read (using Select or GetAttributes with ConsistentRead=true) returns a result that reflects all writes that received a successful response prior to the read.
By default, GetAttributes and Select perform an eventually consistent read. Since a consistent read can potentially incur higher latency and lower read throughput it is best to use it only when an application scenario mandates that a read operation absolutely needs to read all writes that received a successful response prior to that read. For all other scenarios the default eventually consistent read will yield the best performance. Note also that Amazon SimpleDB allows you to specify consistency settings for each individual read request, so the same application could have disparate parts following different consistency settings.
Amazon SimpleDB is not a relational database and sacrifices complex transactions and relations (i.e., joins) in order to provide unique functionality and performance characteristics. However, Amazon SimpleDB does offer transactional semantics such as:
- Conditional Puts/Deletes — enable you to insert, replace, or delete values for one or more attributes of an item if the existing value of an attribute matches the value you specify. If the value does not match or is not present, the update is rejected. Conditional Puts/Deletes are useful for preventing lost updates when different sources write concurrently to the same item.
Conditional puts and deletes are exposed via the PutAttributes and DeleteAttributes APIs by specifying an optional condition with an expected value. For example, if your application was reserving seats or selling tickets to an event, you might allow a purchase (i.e., write update) only if the specified seat was still available (the optional condition). These semantics can also be used to implement functionality such as counters, inserting an item only if it does not already exist, and optimistic concurrency control (OCC). An application can implement OCC by maintaining a version number (or a timestamp) attribute as part of an item and by performing a conditional put/delete based on the value of this version number.
To learn more about transactional semantics or consistency with Amazon SimpleDB, please refer to the Amazon SimpleDB Developer Guide or Consistency Enhancements Whitepaper.
Amazon RDS enables you to run a fully featured relational database while offloading database administration. Amazon DynamoDB is a fully managed NoSQL database service that provides extremely fast and predictable performance with seamless scalability. Amazon SimpleDB provides a non-relational service designed for smaller datasets. Using one of the many AMIs on Amazon EC2 and Amazon EBS gives you full control over your database without the burden of provisioning and installing hardware.
There are important differences between these alternatives that may make one more appropriate for your use case.
Visit the Cloud Databases on AWS page for more detailed information on the various database alternatives for your applications.
Unlike Amazon S3, Amazon SimpleDB is not storing raw data. Rather, it takes your data as input and expands it to create multiple indices, thereby enabling you to quickly query that data. Additionally, Amazon S3 and Amazon SimpleDB use different types of physical storage. Amazon S3 uses dense storage drives that are optimized for storing larger objects inexpensively. Amazon SimpleDB stores smaller bits of data and uses less dense drives that are optimized for data access speed.
In order to optimize your costs across AWS services, large objects or files should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon SimpleDB. Because of the close integration between services and the free data transfer within the AWS environment, developers can easily take advantage of both the speed and querying capabilities of Amazon SimpleDB as well as the low cost of storing data in Amazon S3, by integrating both services into their applications.
Amazon SimpleDB currently enables individual domains to grow up to 10 GB each. If your data set is larger than 10 GB, simply take advantage of Amazon SimpleDB’s scale-out architecture and spread your data over multiple domains. Since Amazon SimpleDB is designed with parallelism in mind, spreading your data over more domains will also increase your write and read throughput potential. You are initially allocated a maximum of 250 domains; please complete this form if you require additional domains.
For more information on how many developers benefit from using Amazon SimpleDB in conjunction with Amazon S3, click here.
With Amazon SimpleDB, the best way to predict the size of your structured data storage is as follows:
Raw byte size (GB) of all item IDs + 45 bytes per item + Raw byte size (GB) of all attribute names + 45 bytes per attribute name + Raw byte size (GB) of all attribute-value pairs + 45 bytes per attribute-value pair
To calculate your estimated monthly storage cost for the US East (Northern Virginia) Region or US West (Oregon) Region, take the resulting size in GB and multiply by $0.25. For the EU (Ireland) Region, Asia Pacific (Singapore) Region, Asia Pacific (Sydney) Region, or the US West (Northern California) Region, take the resulting size in GB and multiply by $.275. For the Asia Pacific (Tokyo) Region, take the resulting size in GB and multiply by $0.276. For the South America (Sao Paulo) Region, take the resulting size in GB and multiply by $0.34.
Amazon SimpleDB measures the machine utilization of each request and charges based on the amount of machine capacity used to complete the particular request (SELECT, GET, PUT, etc.), normalized to the hourly capacity of a circa 2007 1.7 GHz Xeon processor. Machine utilization is driven by the amount of data (# of attributes, length of attributes) processed by each request. A GET operation that retrieves 256 attributes will use more resources than a GET that retrieves only 1 attribute. A multi-predicate SELECT that examines 100,000 attributes will cost more than a single predicate query that examines 250.
In the response message for each request, Amazon SimpleDB returns a field called Box Usage. Box Usage is the measure of machine resources consumed by each request. It does not include bandwidth or storage. Box usage is reported as the portion of a machine hour used to complete a particular request. For the US East (Northern Virginia) Region and US West (Oregon) Region, the cost of an individual request is Box Usage (expressed in hours) * $0.14 per Amazon SimpleDB Machine Hour. The cost of all your requests is the sum of Box Usage (expressed in hours) * $0.14.
For example, if over the course of a month, the sum of the Box Usage for your requests uses the equivalent of one 1.7 GHz Xeon processor for 9 hours, your charge will be:
9 hours * $0.14 per Amazon SimpleDB Machine Hour = $1.26.
If your query domains are located in the EU (Ireland) Region, Asia Pacific (Tokyo), Asia Pacific (Singapore) Region, Asia Pacific (Sydney) Region, or US West (Northern California) Region, Amazon SimpleDB Machine Hours are priced at $.154 per Machine hour. If your query domains are located in the South America (Sao Paulo) Region, Amazon SimpleDB Machine Hours are priced at $0.19 per Machine Hour. All cost calculations should be adjusted to reflect pricing in the relevant region.
Your use of this service is subject to the Amazon Web Services Customer Agreement.