Amazon Bedrock Guardrails
Implement safeguards customized to your application requirements and responsible AI policies
Build responsible AI applications with Amazon Bedrock Guardrails
Amazon Bedrock Guardrails provides additional customizable safeguards on top of the native protections of FMs, delivering safety protections that are among the best in the industry by:
- Blocking as much as 85% more harmful content
- Filtering over 75% of hallucinated responses for RAG and summarization workloads
- Enabling customers to customize and apply safety, privacy, and truthfulness protections within a single solution
Bring a consistent level of AI safety across all your applications
Amazon Bedrock Guardrails helps evaluate user inputs and FM responses based on use case-specific policies, and provides an additional layer of safeguards regardless of the underlying FM. Amazon Bedrock Guardrails is the only responsible AI capability offered by a major cloud provider that helps customers build and customize safety, privacy, and truthfulness protections for their generative AI applications in a single solution, and it works with all large language models (LLMs) in Amazon Bedrock, as well as fine-tuned models. Customers can create multiple guardrails, each configured with a different combination of controls, and use these guardrails across different applications and use cases. Amazon Bedrock Guardrails can also be integrated with Amazon Bedrock Agents and Amazon Bedrock Knowledge Bases to build generative AI applications aligned with your responsible AI policies. In addition, Amazon Bedrock Guardrails offers an ApplyGuardrail API to help evaluate user inputs and model responses generated by any custom or third-party FM outside of Amazon Bedrock.
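The snippet below is a minimal sketch of that pattern using the AWS SDK for Python (boto3): it sends a piece of text to an existing guardrail through the ApplyGuardrail API and checks whether the guardrail intervened. The guardrail identifier, version, and sample text are placeholders, not values from this page.

```python
# Minimal sketch: evaluating text with a pre-created guardrail via the
# ApplyGuardrail API. Identifier, version, and sample text are placeholders.
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier="your-guardrail-id",  # placeholder
    guardrailVersion="1",                     # placeholder
    source="INPUT",                           # evaluate a user input; use "OUTPUT" for model responses
    content=[{"text": {"text": "Should I move my savings into cryptocurrency?"}}],
)

# "GUARDRAIL_INTERVENED" indicates that one or more configured policies were triggered.
print(response["action"])
```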
Block undesirable topics in your generative AI applications
Organizations recognize the need to manage interactions within generative AI applications for a relevant and safe user experience. They want to further customize interactions to remain on topics relevant to their business and align with company policies. Using a short natural language description, Amazon Bedrock Guardrails helps you to define a set of topics to avoid within the context of your application. Amazon Bedrock Guardrails helps detect and block user inputs and FM responses that fall into the restricted topics. For example, a banking assistant can be designed to avoid topics related to investment advice.
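As a rough illustration of the banking example, a denied topic could be configured when creating a guardrail with boto3; the guardrail name, topic definition, example phrase, and blocked messages below are example values, not a prescribed setup.

```python
# Illustrative sketch: defining a denied topic with a short natural language description.
import boto3

bedrock = boto3.client("bedrock")

guardrail = bedrock.create_guardrail(
    name="banking-assistant-guardrail",  # example name
    topicPolicyConfig={
        "topicsConfig": [
            {
                "name": "Investment advice",
                "definition": "Recommendations or guidance on investing money, "
                              "such as which stocks, funds, or assets to buy or sell.",
                "examples": ["Which stocks should I invest in right now?"],
                "type": "DENY",
            }
        ]
    },
    blockedInputMessaging="Sorry, I can't help with that topic.",
    blockedOutputsMessaging="Sorry, I can't help with that topic.",
)
```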
Filter harmful content based on your responsible AI policies
Amazon Bedrock Guardrails provides content filters with configurable thresholds to filter harmful content across hate, insults, sexual, violence, and misconduct (including criminal activity) categories, and to safeguard against prompt attacks (prompt injection and jailbreaks). Most FMs already provide built-in protections to prevent the generation of harmful responses. In addition to these protections, Amazon Bedrock Guardrails lets you configure thresholds across the different content categories to filter out harmful interactions; increasing the strength of a filter makes the filtering more aggressive. These filters automatically evaluate both user inputs and model responses to detect and help prevent content that falls into restricted categories. For example, an ecommerce site can design its online assistant to avoid using inappropriate language, such as hate speech or insults.
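The sketch below shows what such a content filter configuration might look like when creating or updating a guardrail; the per-category strengths are example choices, not recommendations.

```python
# Illustrative sketch: per-category filter strengths (NONE | LOW | MEDIUM | HIGH).
# Higher strength means more aggressive filtering.
content_policy = {
    "filtersConfig": [
        {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
        {"type": "INSULTS", "inputStrength": "HIGH", "outputStrength": "HIGH"},
        {"type": "SEXUAL", "inputStrength": "MEDIUM", "outputStrength": "MEDIUM"},
        {"type": "VIOLENCE", "inputStrength": "MEDIUM", "outputStrength": "MEDIUM"},
        {"type": "MISCONDUCT", "inputStrength": "MEDIUM", "outputStrength": "MEDIUM"},
        # Prompt attack detection applies to user inputs only.
        {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"},
    ]
}
# Passed as contentPolicyConfig=content_policy to create_guardrail / update_guardrail.
```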
Redact sensitive information (PII) to protect privacy
Amazon Bedrock Guardrails can help you detect sensitive content such as personally identifiable information (PII) in user inputs and FM responses. You can select from a list of predefined PII types or define custom sensitive information types using regular expressions (RegEx). Based on your use case, you can selectively reject inputs containing sensitive information or redact it in FM responses. For example, you can redact users’ personal information while generating summaries from customer and agent conversation transcripts in a call center.
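For illustration, a sensitive information configuration along these lines could combine predefined PII entity types with a custom RegEx-defined type; the entity choices, actions, and the booking_id pattern below are hypothetical.

```python
# Illustrative sketch: redacting predefined PII types and a custom RegEx-defined
# sensitive information type.
sensitive_info_policy = {
    "piiEntitiesConfig": [
        {"type": "EMAIL", "action": "ANONYMIZE"},                   # redact in responses
        {"type": "PHONE", "action": "ANONYMIZE"},
        {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},   # reject the input entirely
    ],
    "regexesConfig": [
        {
            "name": "booking_id",            # hypothetical custom sensitive information type
            "pattern": r"BK-\d{8}",
            "description": "Internal booking reference",
            "action": "ANONYMIZE",
        }
    ],
}
# Passed as sensitiveInformationPolicyConfig=sensitive_info_policy to create_guardrail.
```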
Block inappropriate content with a custom word filter
Amazon Bedrock Guardrails helps you configure a set of custom words or phrases that you want to detect and block in the interactions between your users and your generative AI applications. This also helps you detect and block profanity as well as specific custom words, such as competitor names or other offensive terms.
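A word filter configuration of this kind might look like the following sketch; the custom entries are placeholders standing in for competitor names or other terms you want to block.

```python
# Illustrative sketch: blocking profanity via the managed word list plus a few
# custom words or phrases.
word_policy = {
    "managedWordListsConfig": [{"type": "PROFANITY"}],
    "wordsConfig": [
        {"text": "ExampleCompetitor"},      # hypothetical custom word
        {"text": "internal project titan"}, # hypothetical custom phrase
    ],
}
# Passed as wordPolicyConfig=word_policy to create_guardrail / update_guardrail.
```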
Detect hallucinations in model responses using contextual grounding checks
Organizations need to deploy truthful and trustworthy generative AI applications to maintain and grow users’ trust. However, applications built using FMs can generate incorrect information due to hallucinations. For example, FMs can generate responses that deviate from the source information, conflate multiple pieces of information, or invent new information. Amazon Bedrock Guardrails supports contextual grounding checks to help detect and filter hallucinations when responses are not grounded in the source information (for example, factually inaccurate or new information) or are irrelevant to the user’s query or instruction. Contextual grounding checks can help detect hallucinations for RAG, summarization, and conversational applications, where source information can be used as a reference to validate the model response.
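As a rough sketch, contextual grounding checks could be enabled with grounding and relevance thresholds, and the source text, query, and model response tagged with qualifiers when calling ApplyGuardrail; the threshold values and sample texts below are illustrative only.

```python
# Illustrative sketch: enabling contextual grounding checks. Responses scoring
# below a threshold (0-1) on a check are filtered.
grounding_policy = {
    "filtersConfig": [
        {"type": "GROUNDING", "threshold": 0.75},  # is the response supported by the source?
        {"type": "RELEVANCE", "threshold": 0.75},  # is the response relevant to the query?
    ]
}
# Passed as contextualGroundingPolicyConfig=grounding_policy to create_guardrail.

# At evaluation time, tag the reference text, the query, and the content to guard
# so the check can compare the model response against the source information.
content = [
    {"text": {"text": "Our refund window is 30 days from delivery.",
              "qualifiers": ["grounding_source"]}},
    {"text": {"text": "How long do I have to request a refund?",
              "qualifiers": ["query"]}},
    {"text": {"text": "You can request a refund within 30 days of delivery.",
              "qualifiers": ["guard_content"]}},
]
```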