AWS Resilience Hub

Prepare and protect your applications from disruptions

Benefits

Continuously validate and track application resilience to reduce outages.
Evaluate resilience targets (Recovery Time Objective and Recovery Point Objective).
Identify and resolve issues before they occur in production.
Optimize business continuity while reducing recovery costs.

How it works

AWS Resilience Hub is a central location in the AWS Management Console for you to manage and improve the resilience posture of your applications on AWS. AWS Resilience Hub enables you to define your resilience goals, assess your resilience posture against those goals, and implement recommendations for improvement based on the AWS Well-Architected Framework. Within AWS Resilience Hub, you can also create and run AWS Fault Injection Service (AWS FIS) experiments, which mimic real-life disruptions to your application to help you better understand dependencies and uncover potential weaknesses.

AWS Resilience Hub provides you with the services and tooling you need to continuously strengthen your resilience posture, all in a single place.

Features

Describe your applications as resource collections, such as CloudFormation stacks, Terraform state files, myApplications, or resource groups, or define applications for Kubernetes workloads that are managed on Amazon EKS. Applications can also be described using both resource collections and Amazon EKS clusters.

Define the resilience policies for your applications. These policies include Recovery Time Objective (RTO) and Recovery Point Objective (RPO) targets for application, infrastructure, Availability Zone, and Region disruptions.
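As a hedged illustration of such a policy, the sketch below builds RTO and RPO targets per disruption type in Python. It assumes the boto3 `resiliencehub` client and its `create_resiliency_policy` call, and it assumes the API names the four disruption types `Software`, `Hardware`, `AZ`, and `Region`. The `Target` class, the helper functions, and all target values are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping

# Disruption type names as assumed for the Resilience Hub API, mapped from the
# categories in the text: application, infrastructure, Availability Zone, Region.
DISRUPTION_TYPES = ("Software", "Hardware", "AZ", "Region")


@dataclass(frozen=True)
class Target:
    """Hypothetical helper holding one RTO/RPO pair, in seconds."""
    rto_secs: int
    rpo_secs: int


def build_policy(targets: Mapping[str, Target]) -> Dict[str, Dict[str, int]]:
    """Convert targets into the nested dict shape assumed for the `policy` parameter."""
    policy: Dict[str, Dict[str, int]] = {}
    for disruption, target in targets.items():
        if disruption not in DISRUPTION_TYPES:
            raise ValueError(f"unknown disruption type: {disruption!r}")
        if target.rto_secs < 0 or target.rpo_secs < 0:
            raise ValueError(f"negative target for {disruption}")
        policy[disruption] = {
            "rtoInSecs": target.rto_secs,
            "rpoInSecs": target.rpo_secs,
        }
    return policy


def create_policy(client: Any, name: str, tier: str,
                  targets: Mapping[str, Target]) -> Any:
    """Submit the policy through a client such as boto3.client("resiliencehub")."""
    return client.create_resiliency_policy(
        policyName=name, tier=tier, policy=build_policy(targets)
    )


# Illustrative values only: tighter targets for local failures,
# looser targets for a full Region outage.
EXAMPLE_TARGETS = {
    "Software": Target(rto_secs=3600, rpo_secs=900),
    "Hardware": Target(rto_secs=3600, rpo_secs=900),
    "AZ": Target(rto_secs=3600, rpo_secs=900),
    "Region": Target(rto_secs=14400, rpo_secs=3600),
}
```

With AWS credentials configured, a call such as `create_policy(boto3.client("resiliencehub"), "demo-policy", "Critical", EXAMPLE_TARGETS)` would be expected to submit the policy; the policy name and tier here are placeholders. Passing the client in as an argument keeps the payload logic testable without an AWS account.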

AWS Resilience Hub’s assessment uses best practices from the AWS Well-Architected Framework to analyze the components of an application and uncover potential resilience weaknesses. These can be caused by incomplete infrastructure setup, misconfigurations, or situations where additional configuration improvements are needed.

AWS Resilience Hub provides actionable recommendations to improve resilience. The resilience assessment also generates code snippets that help you create recovery procedures, referred to as standard operating procedures (SOPs), as AWS Systems Manager documents for your applications. In addition, AWS Resilience Hub generates a list of recommended Amazon CloudWatch monitors and alarms to help operators quickly identify any change to the application's resilience posture once it is deployed.

After the application and SOPs have been updated to incorporate recommendations from the resilience assessment, you can use AWS Resilience Hub to test and verify that your application can meet its resilience targets before releasing it into production. AWS Resilience Hub is integrated with AWS Fault Injection Service (AWS FIS), a chaos engineering service, to provide fault-injection simulations of real-world failures, such as network errors or too many open connections to a database, and to validate that the application recovers within the defined resilience targets. AWS Resilience Hub also provides APIs so you can integrate its resilience assessment and testing into your CI/CD pipelines for ongoing resilience validation. Integrating resilience validation into CI/CD pipelines helps ensure that changes to the application's underlying infrastructure do not compromise resilience.
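The CI/CD integration described above could look roughly like the following Python sketch, which polls an assessment until it finishes and then decides whether the pipeline stage should pass. It assumes the boto3 `resiliencehub` client's `describe_app_assessment` call and assumes the response carries `assessmentStatus` and `complianceStatus` fields with the values shown; the function names, polling interval, and timeout are hypothetical.

```python
import time
from typing import Any, Callable, Dict

# Assessment states assumed to mean "no further progress will be made".
TERMINAL_STATUSES = {"Success", "Failed"}


def wait_for_assessment(client: Any, assessment_arn: str,
                        poll_secs: float = 15.0,
                        timeout_secs: float = 1800.0,
                        sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the assessment reaches a terminal status, then return it."""
    waited = 0.0
    while True:
        response = client.describe_app_assessment(assessmentArn=assessment_arn)
        assessment = response["assessment"]
        if assessment["assessmentStatus"] in TERMINAL_STATUSES:
            return assessment
        if waited >= timeout_secs:
            raise TimeoutError(f"assessment still running after {waited:.0f}s")
        sleep(poll_secs)
        waited += poll_secs


def passes_gate(assessment: Dict[str, Any]) -> bool:
    """Pass only if the assessment succeeded and the policy targets were met."""
    return (assessment.get("assessmentStatus") == "Success"
            and assessment.get("complianceStatus") == "PolicyMet")
```

In a pipeline, a stage might first start an assessment (for example with the client's `start_app_assessment` call, assumed here), pass the returned assessment ARN to `wait_for_assessment`, and exit with a non-zero status when `passes_gate` returns `False`. The injectable `sleep` parameter is a design choice that lets the polling loop be tested without real delays.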

Use cases

Use fault-injection simulations of real-world failures to help validate the effectiveness of recovery standard operating procedures (SOPs) and alarms.

Get actionable recommendations to improve resilience, along with help creating recovery procedures.

Keep an audit trail of events during planned and unplanned outages to help meet compliance and regulatory requirements.

  • Pearson

    Pearson taps AWS Resilience Hub to improve application resilience.

    "With AWS Resilience Hub, we could take a look at what our applications do...and ask ourselves 'is this a mission-critical application or can it go down for a little while and not impact our operations?' AWS Resilience Hub was critical to that because we were able to input values and very quickly understand what applications actually are important to Pearson."

    Ronnie Kendrick, Senior SRE Manager, Infrastructure and Operations at Pearson
  • Aval Digital Labs

    Aval Digital Labs (ADL) was founded in 2017 and is today one of the best platforms for boosting digital products for the financial services industry in Latin America. Recognizing the importance of delivering highly reliable solutions to its customers, ADL has incorporated AWS Resilience Hub to verify and track the resilience posture of its applications while maintaining visibility into policy compliance and availability targets. Integrating AWS Resilience Hub into ADL's business continuity framework allowed the company to validate the resilience and business continuity posture of eight transactional channels, which serve around 4 million clients across four of Colombia's major financial entities.

    Alexander Chaparro, Head of Architecture, Aval Digital Labs
