Overview
Data Transfer from Amazon S3 Glacier Vaults to Amazon S3 restores, copies, and transfers archives stored in an Amazon Simple Storage Service Glacier (Amazon S3 Glacier) vault to an S3 bucket and storage class of your choice, including the S3 Glacier storage classes. This AWS Solution simplifies the use of your data by automating the transfer process, making archived data more accessible and cost-effective.
Note:
Amazon S3 Glacier storage classes, including Glacier Deep Archive, Glacier Flexible Retrieval, and Glacier Instant Retrieval, are different from the S3 storage classes. Visit this webpage to learn more about these storage classes.
Benefits
Automation saves time and minimizes the likelihood of human error during the data transfer process, helping ensure a more reliable and consistent operation.
Transferring data from Amazon S3 Glacier vaults to S3 buckets facilitates easier data analysis and utilization. Data is more readily accessible for applications and analytics tools, without extended restore times.
Amazon S3 storage classes allow for tagging and quicker access to your data. Tagging benefits include data classification, fine-grained access control, lifecycle management, and cost allocation.
For data that is rarely accessed, the Amazon S3 Glacier Deep Archive storage class can save almost 75% on storage costs in the AWS US East (Ohio) Region compared to an S3 Glacier vault.
Technical details
You can automatically deploy this architecture using the implementation guide and the accompanying AWS CloudFormation template.
Step 1
Invoke a transfer workflow by using an AWS Systems Manager document (SSM document).
Step 2
The SSM document starts an AWS Step Functions Orchestrator workflow.
Step 3
The Step Functions Orchestrator workflow initiates a nested Step Functions Get Inventory workflow to retrieve the inventory file.
Step 4
Upon completion of the inventory retrieval, the solution invokes the Initiate Retrieval nested Step Functions workflow.
Step 5
When a job is ready, Amazon S3 Glacier sends a notification to an Amazon Simple Notification Service (Amazon SNS) topic indicating job completion.
Step 6
The solution stores all job completion notifications in the Amazon Simple Queue Service (Amazon SQS) Notifications queue.
Step 7
When an archive job is ready, the Amazon SQS Notifications queue invokes the AWS Lambda Notifications Processor function. This Lambda function prepares the initial steps for archive retrieval.
Step 8
The Lambda Notifications Processor function places chunk retrieval messages in the Amazon SQS Chunks Retrieval queue for chunk processing.
Step 9
The Amazon SQS Chunks Retrieval queue invokes the Lambda Chunk Retrieval function to process each chunk.
Step 10
The Lambda Chunk Retrieval function downloads the chunk from the Amazon S3 Glacier vault.
Step 11
The Lambda Chunk Retrieval function uploads a multipart upload part to Amazon Simple Storage Service (Amazon S3).
Step 12
After a new chunk is downloaded, the solution stores the chunk metadata (etag, checksum_sha_256, tree_checksum) in Amazon DynamoDB.
Step 13
The Lambda Chunk Retrieval function checks whether all chunks for that archive have been processed. If so, it places a message in the Amazon SQS Validation queue to invoke the Lambda Validate function.
Step 14
The Lambda Validate function performs an integrity check and then closes the Amazon S3 multipart upload.
Step 15
A DynamoDB stream invokes the Lambda Metrics Processor function to update the transfer process metrics in DynamoDB.
Step 16
The Step Functions Orchestrator workflow enters an asynchronous wait, pausing until the archive retrieval workflow concludes before initiating the Step Functions Cleanup workflow.
Step 17
The DynamoDB stream invokes the Lambda Async Facilitator function, which unlocks asynchronous waits in Step Functions.
Step 18
Amazon EventBridge rules periodically initiate the Step Functions Extend Download Window and Update CloudWatch Dashboard workflows.
Step 19
Monitor the transfer progress by using the CloudWatch dashboard.
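The chunking and validation steps (Steps 8 through 14) can be illustrated with a minimal Python sketch. It is not the solution's actual code. The function names chunk_ranges, tree_hash, and all_chunks_done are hypothetical, and the sketch assumes that the tree_checksum metadata corresponds to the SHA-256 tree hash that Amazon S3 Glacier documents for archives (hash each 1 MiB block, then hash adjacent pairs of digests until one remains). Chunk size and in-memory data are simplifications for illustration.

```python
import hashlib

MIB = 1024 * 1024  # Glacier tree hashes are built from 1 MiB leaf blocks


def chunk_ranges(archive_size, chunk_size):
    """Split an archive into inclusive (start, end) byte ranges, one per chunk."""
    return [
        (start, min(start + chunk_size, archive_size) - 1)
        for start in range(0, archive_size, chunk_size)
    ]


def tree_hash(data):
    """Compute a SHA-256 tree hash over data and return it as a hex string."""
    # Leaf level: one SHA-256 digest per 1 MiB block (empty input gets one leaf).
    level = [
        hashlib.sha256(data[i:i + MIB]).digest()
        for i in range(0, len(data), MIB)
    ] or [hashlib.sha256(b"").digest()]
    # Combine adjacent digests pairwise; an unpaired digest moves up unchanged.
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                parents.append(hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                parents.append(level[i])
        level = parents
    return level[0].hex()


def all_chunks_done(processed_ranges, archive_size, chunk_size):
    """Return True when every expected chunk range has been recorded."""
    return set(processed_ranges) == set(chunk_ranges(archive_size, chunk_size))


if __name__ == "__main__":
    # Hypothetical 2.5 MiB archive split into 1 MiB chunks.
    archive = bytes(2 * MIB + MIB // 2)
    ranges = chunk_ranges(len(archive), MIB)
    print(ranges)
    print(all_chunks_done(ranges[:2], len(archive), MIB))  # one chunk missing
    print(all_chunks_done(ranges, len(archive), MIB))
    print(tree_hash(archive))
```

In this sketch, a validation step would compare the computed tree hash with the checksum reported for the archive before completing the multipart upload. How the solution itself performs the integrity check is described in the implementation guide.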
Related content
S3 Glacier is a secure and durable service for low-cost data archiving and long-term backup using vaults.
This self-paced workshop provides a step-by-step guide for launching the AWS Solution, Data Transfer from Amazon S3 Glacier Vaults to Amazon S3, in your AWS account.