Amazon SageMaker Data and AI Governance FAQs
What is Data and AI Governance in Amazon SageMaker?
The next generation of Amazon SageMaker simplifies discovery, governance, and collaboration for data and AI across your lakehouse, AI models, and applications. With Amazon SageMaker Catalog, built on Amazon DataZone, users can securely discover and access approved data and models using semantic search over metadata created by generative AI, or ask Amazon Q Developer in natural language to find their data. Users can define and enforce access policies consistently using a single permission model with fine-grained access controls, managed centrally in SageMaker Unified Studio (preview). They can share and collaborate on data and AI assets through publishing and subscribing workflows. With Amazon SageMaker, you can safeguard your AI models using Amazon Bedrock guardrails and implement responsible AI policies. Build trust throughout your organization with data quality monitoring and automation, sensitive data detection, and data and ML lineage.
How can I interact with Amazon SageMaker Catalog?
You can access SageMaker Catalog through Amazon SageMaker Unified Studio (preview), which is a single environment for data and AI development. To set up, configure, or integrate SageMaker Catalog with existing processes programmatically, you can use its published APIs, which come with guidelines on how to use the existing Amazon DataZone APIs.
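As a rough illustration of programmatic access, the sketch below searches published asset listings through the Amazon DataZone `SearchListings` operation using the AWS SDK for Python (boto3). The helper names `search_catalog` and `make_datazone_client`, the domain ID, and the search text are assumptions made for this example and do not come from the FAQ. Check the current API reference for the exact request and response fields before relying on it.

```python
from typing import Any, List


def search_catalog(client: Any, domain_id: str, text: str,
                   max_results: int = 10) -> List[str]:
    """Return the names of published asset listings that match `text`.

    `client` is expected to be a boto3 "datazone" client (or any object
    exposing a compatible `search_listings` method).
    """
    response = client.search_listings(
        domainIdentifier=domain_id,  # ID of the governing domain
        searchText=text,             # free-text query
        maxResults=max_results,      # page size for this single call
    )
    names = []
    for item in response.get("items", []):
        # Each result is a union; this sketch only reads asset listings.
        listing = item.get("assetListing")
        if listing is not None:
            names.append(listing["name"])
    return names


def make_datazone_client(region: str) -> Any:
    """Create a boto3 DataZone client (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above has no hard dependency

    return boto3.client("datazone", region_name=region)
```

A call might look like `search_catalog(make_datazone_client("us-east-1"), "dzd_example123", "sales")`, where the Region and domain ID are placeholders. The sketch reads only the first page of results; a fuller version would follow `nextToken` to page through all matches.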
What are the top challenges it solves?
- Difficulty in finding and sharing data across teams: Data producers and consumers often face challenges in quickly locating and sharing relevant datasets across the organization. This inefficiency leads to wasted time searching for data and limits collaboration.
- Lack of trust in data quality and AI model outputs: Organizations struggle to trust the quality of their data and the accuracy of AI model outputs due to a lack of visibility into data origins, quality, and access patterns.
- Inconsistent data access and privacy violations: Organizations struggle to enforce consistent data access policies, leading to potential unauthorized access to sensitive information.
- Difficulty maintaining compliance with regulations and internal policies: Organizations find it challenging to maintain regulatory compliance and adhere to internal policies due to a lack of comprehensive auditing and monitoring tools.
What are the top benefits?
Data and AI governance in Amazon SageMaker helps data teams with:
- Faster data discovery and collaboration: Users can quickly find and share relevant data across the organization, reducing time spent searching for information and promoting teamwork.
- Improved trust through lineage and quality: Tracking data origin and improving data quality increases confidence in data-driven decisions and AI model outputs.
- Enhanced data and AI model security: Data and models are accessible only through projects, so only users authorized to see a project's assets can access them, maintaining security and privacy standards.
- Reduced business risk and better regulatory compliance: Activity logging helps organizations align with industry regulations and internal policies, reducing organizational risk.
What are the key use cases?
- Unlock business productivity with asset search and discovery: Search and discover data and AI assets to empower teams, reduce time spent finding critical assets, and drive faster, data-driven decision-making.
- Centralized data access policy management: Define and manage data access rules from a single point, leading to consistent application across various AWS services and third-party environments.
- Data enrichment with business context and classification: Add metadata and categorization to datasets, making it easier for users to understand data relevance and applicability to specific business needs.
- Log activities for users and systems: Monitor and record interactions with data and AI systems, providing visibility into usage patterns and potential security issues.
- AI/ML data governance implementation: Extend data governance principles to AI and machine learning processes, ensuring that only approved data is used in model training and that AI systems adhere to defined permissions and ethical guidelines.
What is the relationship between Amazon SageMaker Catalog and Amazon DataZone?
Amazon SageMaker Catalog is built on Amazon DataZone, offering the same governance capabilities in a unified user experience. The Amazon DataZone experience remains as is, so existing Amazon DataZone customers can continue using the familiar interface if they prefer.
What is the pricing model for Amazon SageMaker Data and AI Governance?
The pricing details can be found here: https://aws.amazon.com/datazone/pricing/.