General
Amazon Managed Streaming for Apache Kafka (Amazon MSK) is an AWS streaming data service that manages Apache Kafka infrastructure and operations, making it easy for developers and DevOps managers to run Apache Kafka applications and Apache Kafka Connect connectors on AWS, without the need to become experts in operating Apache Kafka. Amazon MSK operates, maintains, and scales Apache Kafka clusters, provides enterprise-grade security features out of the box, and has built-in AWS integrations that accelerate development of streaming data applications. To get started, you can migrate existing Apache Kafka workloads and Apache Kafka Connect connectors into Amazon MSK, or with a few clicks, you can build new ones from scratch. There are no data transfer charges for in-cluster traffic, and no commitments or upfront payments required. You only pay for the resources that you use.
Apache Kafka is an open-source, high-performance, fault-tolerant, and scalable platform for building real-time streaming data pipelines and applications. Apache Kafka is a streaming data store that decouples the applications that produce streaming data (producers) into its data store from the applications that consume streaming data (consumers) from its data store. Organizations use Apache Kafka as a data source for applications that continuously analyze and react to streaming data. Learn more about Apache Kafka.
Apache Kafka Connect, an open-source component of Apache Kafka, is a framework for connecting Apache Kafka with external systems, such as databases, key-value stores, search indexes, and file systems.
Streaming data is a continuous stream of small records or events (a record or event is typically a few kilobytes) generated by thousands of machines, devices, websites, and applications. Streaming data includes a wide variety of data such as log files generated by customers using your mobile or web applications, ecommerce purchases, in-game player activity, information from social networks, financial trading floors, geospatial services, security logs, metrics, and telemetry from connected devices or instrumentation in data centers. Streaming data services like Amazon MSK and Amazon Kinesis Data Streams make it easy for you to continuously collect, process, and deliver streaming data. Learn more about streaming data.
- Apache Kafka stores streaming data in a fault-tolerant way, providing a buffer between producers and consumers. It stores events as a continuous series of records and preserves the order in which the records were produced.
- Apache Kafka allows many data producers—e.g. websites, internet of things (IoT) devices, Amazon Elastic Compute Cloud (Amazon EC2) instances—to continuously publish streaming data and categorize it using Apache Kafka topics. Multiple data consumers (e.g. machine learning applications, AWS Lambda functions, or microservices) read from these topics at their own rate, similar to a message queue or enterprise messaging system.
- Data consumers process data from Apache Kafka topics on a first-in-first-out basis, preserving the order data was produced.
Apache Kafka supports real-time applications that transform, deliver, and react to streaming data, and can be used to build real-time streaming data pipelines that reliably send data between multiple systems or applications.
With a few clicks in the console, you can create an Amazon MSK cluster. From there, Amazon MSK replaces unhealthy brokers, automatically replicates data for high availability, manages Apache ZooKeeper nodes, automatically deploys hardware patches as needed, manages the integrations with AWS services, makes important metrics visible through the console, and supports Apache Kafka version upgrades so you can take advantage of improvements to the open-source version of Apache Kafka.
For supported Apache Kafka versions, see the Amazon MSK documentation.
Yes, all data plane and admin APIs are natively supported by Amazon MSK.
Yes.
Yes, Apache Kafka clients can use the AWS Glue Schema Registry, a serverless feature of AWS Glue, at no additional charge. Visit the Schema Registry user documentation to get started and to learn more.
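For example, a producer-side application might register its schema before use. The following is a minimal sketch using the AWS Glue CreateSchema API via boto3; the registry, schema name, and Avro definition are illustrative placeholders:

```python
import json
import boto3

glue = boto3.client("glue", region_name="us-east-1")

# Register a hypothetical Avro schema in the default registry.
response = glue.create_schema(
    RegistryId={"RegistryName": "default-registry"},
    SchemaName="clickstream-events",   # placeholder schema name
    DataFormat="AVRO",
    Compatibility="BACKWARD",          # reject incompatible schema changes
    SchemaDefinition=json.dumps({
        "type": "record",
        "name": "ClickEvent",
        "fields": [
            {"name": "user_id", "type": "string"},
            {"name": "url", "type": "string"},
        ],
    }),
)
print(response["SchemaArn"])
```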
Amazon MSK now supports Graviton3-based M7g instances, from “large” through “16xlarge” sizes, to run all Apache Kafka workloads. Graviton instances come with the same availability and durability benefits of MSK, with up to 24% lower costs compared to corresponding M5 instances. Graviton instances provide up to 29% higher throughput per instance compared to MSK’s M5 instances, which enables customers to run MSK clusters with fewer brokers or smaller instances.
MSK Serverless
Q: What is MSK Serverless?
MSK Serverless is a cluster type for Amazon MSK that makes it easy for you to run Apache Kafka clusters without having to manage compute and storage capacity. With MSK Serverless, you can run your applications without having to provision, configure, or optimize clusters, and you pay for the data volume you stream and retain.
Q: Does MSK Serverless automatically balance partitions within a cluster?
Yes. MSK Serverless fully manages partitions, including monitoring them and moving them to distribute load evenly across a cluster.
Q: How much data throughput capacity does MSK Serverless support?
MSK Serverless provides up to 200 MBps of write capacity and 400 MBps of read capacity per cluster. Additionally, to ensure sufficient throughput availability for all partitions in a cluster, MSK Serverless allocates up to 5 MBps of instant write capacity and 10 MBps of instant read capacity per partition.
Q: What security features does MSK Serverless offer?
MSK Serverless encrypts all traffic in transit and all data at rest using service-managed keys issued through AWS Key Management Service (KMS). Clients connect to MSK Serverless over a private connection using AWS PrivateLink without exposing your traffic to the public internet. Additionally, MSK Serverless offers IAM Access Control, which you can use to manage client authentication and client authorization to Apache Kafka resources such as topics.
Q: How can producers and consumers access my MSK Serverless clusters?
When you create an MSK Serverless cluster, you provide subnets of one or more Amazon Virtual Private Clouds (VPCs) that host the clients of the cluster. Clients hosted in any of these VPCs can connect to the MSK Serverless cluster using its broker bootstrap string.
Q: Which regions is MSK Serverless available in?
Please refer to the MSK pricing page for up-to-date regional availability.
Q: Which authentication types does MSK Serverless support?
MSK Serverless currently supports AWS IAM for client authentication and authorization. Your clients can assume an AWS IAM role for authentication, and you can enforce access control using an associated IAM policy.
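As an illustration, a Python client can fetch short-lived IAM auth tokens using the aws-msk-iam-sasl-signer-python helper published by AWS. This is a minimal sketch, not a definitive setup: the bootstrap string, region, and topic are placeholders, and it assumes the kafka-python client's OAUTHBEARER support.

```python
from kafka import KafkaProducer  # pip install kafka-python
from aws_msk_iam_sasl_signer import MSKAuthTokenProvider  # pip install aws-msk-iam-sasl-signer-python


class MSKTokenProvider:
    """Supplies a short-lived IAM auth token to the Kafka client."""

    def token(self):
        token, _expiry_ms = MSKAuthTokenProvider.generate_auth_token("us-east-1")
        return token


producer = KafkaProducer(
    # Placeholder serverless bootstrap string; 9098 is the IAM auth port.
    bootstrap_servers="boot-xxxxxxxx.c1.kafka-serverless.us-east-1.amazonaws.com:9098",
    security_protocol="SASL_SSL",
    sasl_mechanism="OAUTHBEARER",
    sasl_oauth_token_provider=MSKTokenProvider(),
)
producer.send("example-topic", b"hello from an IAM-authenticated client")
producer.flush()
```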
Q: How do I process data in my MSK Serverless cluster?
You can use any Apache Kafka compatible tools to process data in your MSK Serverless cluster topics. MSK Serverless integrates with Amazon Managed Service for Apache Flink for stateful stream processing and AWS Lambda for event processing. You can also use Apache Kafka Connect sink connectors to send data to any desired destination.
Q: How does MSK Serverless ensure high availability?
When you create a partition, MSK Serverless creates two replicas of it and places them in different Availability Zones. Additionally, MSK Serverless automatically detects and recovers failed backend resources to maintain high availability.
Data production and consumption
Q: Can I use Apache Kafka APIs to get data in and out of Apache Kafka?
Yes, Amazon MSK supports the native Apache Kafka producer and consumer APIs. Your application code does not need to change when clients begin to work with clusters within Amazon MSK.
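For instance, a minimal kafka-python producer and consumer work against an MSK cluster exactly as they would against self-managed Apache Kafka. This sketch assumes a plaintext listener and a placeholder bootstrap string; TLS or IAM settings would be added for secured clusters.

```python
from kafka import KafkaProducer, KafkaConsumer  # pip install kafka-python

BOOTSTRAP = "b-1.mycluster.xxxxxx.kafka.us-east-1.amazonaws.com:9092"  # placeholder

# Produce a record using the native Apache Kafka producer API.
producer = KafkaProducer(bootstrap_servers=BOOTSTRAP)
producer.send("example-topic", key=b"sensor-42", value=b'{"temp": 21.5}')
producer.flush()

# Consume from the same topic; offsets are tracked per consumer group.
consumer = KafkaConsumer(
    "example-topic",
    bootstrap_servers=BOOTSTRAP,
    group_id="example-group",
    auto_offset_reset="earliest",
)
for record in consumer:
    print(record.key, record.value)
    break  # stop after the first record for this demo
```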
Q: Can I use Apache Kafka Connect, Apache Kafka Streams, or any other ecosystem component of Apache Kafka with Amazon MSK?
Yes, you can use any component that leverages the Apache Kafka producer and consumer APIs, and the Apache Kafka Admin Client. Tools that upload .jar files into Apache Kafka clusters are currently not compatible with Amazon MSK, including Confluent Control Center, Confluent Auto Data Balancer, and Uber uReplicator.
Migrating to Amazon MSK
Yes, you can use third-party tools or open-source tools like MirrorMaker, supported by Apache Kafka, to replicate data from clusters into an Amazon MSK cluster. Here is an Amazon MSK migration lab to help you complete a migration.
Version upgrades
Q: Are Apache Kafka version upgrades supported?
Yes, Amazon MSK supports fully managed in-place Apache Kafka version upgrades for provisioned clusters. To learn more about upgrading your Apache Kafka version and high availability best practices, see the version upgrades documentation.
Clusters
You can create your first cluster with a few clicks in the AWS Management Console or using the AWS SDKs. First, in the Amazon MSK console, select an AWS Region in which to create an Amazon MSK cluster. Choose a name for your cluster, the Amazon Virtual Private Cloud (VPC) you want to run the cluster in, and the subnets for each Availability Zone (AZ). If you are creating a provisioned cluster, you can also pick a broker instance type, the quantity of brokers per AZ, and the storage per broker.
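Programmatically, the same can be done with the CreateCluster service API. The following boto3 sketch shows a minimal provisioned-cluster request; the subnets, security group, instance type, and version are placeholder assumptions for your environment:

```python
import boto3

kafka = boto3.client("kafka", region_name="us-east-1")

response = kafka.create_cluster(
    ClusterName="my-msk-cluster",
    KafkaVersion="3.6.0",            # any MSK-supported Apache Kafka version
    NumberOfBrokerNodes=3,           # one broker per subnet/AZ below
    BrokerNodeGroupInfo={
        "InstanceType": "kafka.m7g.large",
        "ClientSubnets": ["subnet-aaaa", "subnet-bbbb", "subnet-cccc"],
        "SecurityGroups": ["sg-0123456789abcdef0"],
        "StorageInfo": {"EbsStorageInfo": {"VolumeSize": 1000}},  # GiB per broker
    },
)
print(response["ClusterArn"])
```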
Provisioned clusters contain broker instances, provisioned storage, and abstracted Apache ZooKeeper nodes. Serverless clusters are a resource in and of themselves; they abstract away all underlying resources.
For provisioned clusters, you can choose EC2 T3.small instances or instances within the EC2 M7g and M5 instance families. For serverless clusters, brokers are completely abstracted.
No, not at this time.
No, each broker you provision includes boot volume storage managed by the Amazon MSK service.
Some resources, like elastic network interfaces (ENIs), will show up in your Amazon EC2 account. Other Amazon MSK resources will not show up in your EC2 account as these are managed by the Amazon MSK service.
For provisioned clusters, you need to provision broker instances and broker storage with every cluster you create. You may optionally provision storage throughput for storage volumes, which can be used to seamlessly scale I/O without having to provision additional brokers. You do not need to provision Apache ZooKeeper nodes as these resources are included at no additional charge with each cluster you create. For serverless clusters, you just create a cluster as a resource.
Unless otherwise specified, Amazon MSK uses the same defaults specified by the open-source version of Apache Kafka. The default settings for both cluster types are documented here.
Q: Can I change the default broker configurations or upload a cluster configuration to Amazon MSK?
Yes, Amazon MSK allows you to create custom configurations and apply them to new and existing clusters. For more information on custom configurations, see the configuration documentation.
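As a sketch, a custom configuration can be created with the CreateConfiguration API via boto3 and then referenced from a cluster. The configuration name and property values shown are illustrative:

```python
import boto3

kafka = boto3.client("kafka", region_name="us-east-1")

# ServerProperties uses the same key=value format as server.properties.
server_properties = b"""
auto.create.topics.enable=true
log.retention.hours=168
"""

config = kafka.create_configuration(
    Name="my-custom-config",      # hypothetical configuration name
    KafkaVersions=["3.6.0"],
    ServerProperties=server_properties,
)
# Reference the configuration (ARN plus revision) when creating or
# updating a cluster, e.g. via the ConfigurationInfo parameter.
print(config["Arn"], config["LatestRevision"]["Revision"])
```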
Q: What configuration properties am I able to customize?
The configuration properties that you can customize are documented here.
Q: What is the default configuration of a new topic?
Amazon MSK uses Apache Kafka’s default configuration unless otherwise specified here.
Topics
Once your Apache Kafka cluster has been created, you can create topics using the Apache Kafka APIs. All topic and partition level actions and configurations are performed using Apache Kafka APIs. The following command is an example of creating a topic using Apache Kafka APIs and the configuration details available for your cluster:
bin/kafka-topics.sh --create --bootstrap-server <BootstrapBrokerString> --replication-factor 3 --partitions 1 --topic TopicName
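Equivalently, topics can be created programmatically through the Apache Kafka Admin APIs. A minimal kafka-python sketch, assuming the same bootstrap string placeholder:

```python
from kafka.admin import KafkaAdminClient, NewTopic  # pip install kafka-python

# Connect the admin client using the cluster's bootstrap broker string.
admin = KafkaAdminClient(bootstrap_servers="<BootstrapBrokerString>")

# Create the same topic as the CLI example above: 1 partition, 3 replicas.
admin.create_topics([
    NewTopic(name="TopicName", num_partitions=1, replication_factor=3),
])
```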
Networking
Q: Does Amazon MSK run in an Amazon VPC?
Yes, Amazon MSK always runs within an Amazon VPC managed by the Amazon MSK service. Amazon MSK resources are made available to the Amazon VPC, subnets, and security group you select when the cluster is set up. IP addresses from your VPC are attached to your Amazon MSK resources through elastic network interfaces (ENIs), and all network traffic stays within the AWS network and is not accessible from the internet by default.
Q: How will the brokers in my Amazon MSK cluster be made accessible to clients within my VPC?
The brokers in your cluster will be made accessible to clients in your VPC through ENIs appearing in your account. The Security Groups on the ENIs will dictate the source and type of ingress and egress traffic allowed on your brokers.
Q: Is it possible to connect to my cluster over the public Internet?
Yes, Amazon MSK offers an option to securely connect to the brokers of Amazon MSK clusters running Apache Kafka 2.6.0 or later versions over the internet. By enabling public access, authorized clients external to a private Amazon Virtual Private Cloud (VPC) can stream encrypted data in and out of specific Amazon MSK clusters. You can enable public access for MSK clusters after a cluster has been created at no additional cost, but standard AWS data transfer costs for cluster ingress and egress apply. To learn more about turning on public access, see the public access documentation.
Q: Is the connection between my clients and an Amazon MSK cluster private?
By default, the only way data can be produced and consumed from an Amazon MSK cluster is over a private connection between the clients in your VPC and the Amazon MSK cluster. However, if you turn on public access for your Amazon MSK cluster and connect to it using the public bootstrap-brokers string, the connection, though authenticated, authorized, and encrypted, is no longer considered private. If you turn on public access, we recommend that you configure the cluster's security groups with inbound TCP rules that allow public access only from your trusted IP addresses, and that you make these rules as restrictive as possible.
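As a sketch, a restrictive ingress rule can be added with boto3. The security group ID, port (9198, the public IAM-auth port per the MSK docs), and CIDR range below are assumptions to adapt:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Allow inbound Kafka traffic only from a trusted network range.
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",   # the cluster's security group (placeholder)
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 9198,             # public IAM access port; varies by auth scheme
        "ToPort": 9198,
        "IpRanges": [{
            "CidrIp": "203.0.113.0/24",  # your trusted office/VPN range
            "Description": "Public MSK access from trusted network",
        }],
    }],
)
```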
Connecting to the VPC
Q: How do I connect to my Amazon MSK cluster over the internet?
The easiest way is to turn on public connectivity over the internet to the brokers of MSK clusters running Apache Kafka 2.6.0 or later versions. For security reasons, you can't turn on public access while creating an MSK cluster. However, you can update an existing cluster to make it publicly accessible. You can also create a new cluster and then update it to make it publicly accessible. To learn more about turning on public access, see the public access documentation.
Q: How do I connect to my Amazon MSK cluster from inside the AWS network but outside the cluster’s Amazon VPC?
You can connect to your MSK cluster from any VPC or AWS account different from your MSK cluster’s by turning on multi-VPC private connectivity for MSK clusters running Apache Kafka version 2.7.1 or later. You can only turn on private connectivity after cluster creation for any of the supported authentication schemes (IAM authentication, SASL/SCRAM, and mTLS authentication). You should configure your clients to connect privately to the cluster using Amazon MSK managed VPC connections, which use AWS PrivateLink technology to enable private connectivity. To learn more about setting up private connectivity, see the Access from within AWS documentation.
Encryption
Q: Can I encrypt data in my Amazon MSK cluster?
Yes, Amazon MSK uses Amazon Elastic Block Store (Amazon EBS) server-side encryption and AWS Key Management Service (AWS KMS) keys to encrypt storage volumes.
Q: Is data encrypted in transit between brokers within an Amazon MSK cluster?
Yes, by default new clusters have encryption in-transit enabled via TLS for inter-broker communication. For provisioned clusters, you can opt out of using encryption in transit when a cluster is created.
Q: Is data encrypted in transit between my Apache Kafka clients and the Amazon MSK service?
Yes, by default in-transit encryption is set to TLS only for clusters created from the CLI or AWS Management Console. Additional configuration is required for clients to communicate with clusters using TLS encryption. For provisioned clusters, you can change the default encryption setting by selecting the TLS/plaintext or plaintext settings. Read more about MSK Encryption.
Q: Is data encrypted in transit as it moves between brokers and Apache ZooKeeper nodes in an Amazon MSK cluster?
Yes, Amazon MSK clusters running Apache Kafka version 2.5.1 or greater support TLS in-transit encryption between Apache Kafka brokers and ZooKeeper nodes.
Access Management
Q: How do I control cluster authentication and Apache Kafka API authorization?
For serverless clusters, you can use IAM Access Control for both authentication and authorization. For provisioned clusters, you have three options: 1) AWS Identity and Access Management (IAM) Access Control for both AuthN/Z (recommended), 2) TLS certificate authentication for AuthN and access control lists for AuthZ, and 3) SASL/SCRAM for AuthN and access control lists for AuthZ. Amazon MSK recommends using IAM Access Control. It is the easiest to use and, because it defaults to least-privilege access, the most secure option.
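To illustrate IAM Access Control, the following hedged sketch attaches a least-privilege policy to a hypothetical producer role, allowing it to connect to the cluster and write to a single topic. All ARNs and names are placeholders; the kafka-cluster actions follow the Amazon MSK IAM access control action naming.

```python
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Permission to connect to the cluster at all.
            "Effect": "Allow",
            "Action": ["kafka-cluster:Connect"],
            "Resource": "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/*",
        },
        {   # Permission to describe and produce to one topic.
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:DescribeTopic",
                "kafka-cluster:WriteData",
            ],
            "Resource": "arn:aws:kafka:us-east-1:123456789012:topic/my-cluster/*/orders",
        },
    ],
}

boto3.client("iam").put_role_policy(
    RoleName="order-producer-role",   # hypothetical client role
    PolicyName="msk-write-orders",
    PolicyDocument=json.dumps(policy),
)
```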
Q: How does authorization work in Amazon MSK?
If you are using IAM Access Control, Amazon MSK uses the policies you write and its own authorizer to authorize actions. If you are using TLS certificate authentication or SASL/SCRAM, Apache Kafka uses access control lists (ACLs) for authorization. To enable ACLs you must enable client authentication using either TLS certificates or SASL/SCRAM.
Q: How can I authenticate and authorize a client at the same time?
If you are using IAM Access Control, Amazon MSK will authenticate and authorize for you without any additional setup. If you are using TLS authentication, you can use the distinguished name (DN) of a client's TLS certificate as the principal of the ACL to authorize client requests. If you are using SASL/SCRAM, you can use the username as the principal of the ACL to authorize client requests.
Q: How do I control service API actions?
You can control service API actions using AWS Identity and Access Management (IAM).
Q: Can I enable IAM Access Control for an existing cluster?
Yes, you can enable IAM Access Control for an existing cluster from the AWS console or by using the UpdateSecurity API.
Q: Can I use IAM Access Control outside of Amazon MSK?
No, IAM Access Control is only available for Amazon MSK clusters.
Q: How do I provide cross-account access permissions to an Apache Kafka client in a different AWS account so that it can connect privately to my Amazon MSK cluster?
You can attach a cluster policy to your Amazon MSK cluster to provide your cross-account Apache Kafka client permissions to set up private connectivity to your Amazon MSK cluster. When using IAM client authentication, you can also use the cluster policy to granularly define the Apache Kafka data plane permissions for the connecting client. To learn more about cluster policies, see the cluster policy documentation.
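The following is a hedged boto3 sketch of attaching such a cluster policy; the account ID, cluster ARN, and the exact action list are placeholder assumptions modeled on the multi-VPC connectivity documentation:

```python
import json
import boto3

kafka = boto3.client("kafka", region_name="us-east-1")
cluster_arn = "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc-123"  # placeholder

# Let a client in another account (111111111111, a placeholder) create a
# PrivateLink connection to this cluster and look up its brokers.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111111111111:root"},
        "Action": [
            "kafka:CreateVpcConnection",
            "kafka:GetBootstrapBrokers",
            "kafka:DescribeCluster",
        ],
        "Resource": cluster_arn,
    }],
}

kafka.put_cluster_policy(ClusterArn=cluster_arn, Policy=json.dumps(policy))
```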
Monitoring, metrics, logging, and tagging
You can monitor the performance of your clusters using the Amazon MSK console, Amazon CloudWatch console, or via JMX and host metrics using Open Monitoring with Prometheus, an open-source monitoring solution.
Q: What is the cost for the different CloudWatch monitoring levels?
The cost of monitoring your cluster using Amazon CloudWatch is dependent on the monitoring level and the size of your Apache Kafka cluster. Amazon CloudWatch charges per metric per month and includes a free tier; see Amazon CloudWatch pricing for more information. For details on the number of metrics exposed for each monitoring level, see Amazon MSK monitoring documentation.
Q: What monitoring tools are compatible with Open Monitoring with Prometheus?
Tools that are designed to read from Prometheus exporters are compatible with Open Monitoring, such as Datadog, Lenses, New Relic, Sumo Logic, or a Prometheus server. For details on Open Monitoring, see the Amazon MSK Open Monitoring documentation.
Q: How do I monitor the health and performance of clients?
You can use any client-side monitoring supported by the Apache Kafka version you are using.
Q: Can I tag Amazon MSK resources?
Yes, you can tag Amazon MSK clusters from the AWS Command Line Interface (AWS CLI) or Console.
Q: How do I monitor consumer lag?
Topic-level consumer lag metrics are available as part of the default set of metrics that Amazon MSK publishes to Amazon CloudWatch for all clusters; no additional setup is required to get them. For provisioned clusters, you can also get partition-level consumer lag metrics (partition dimension). To do so, enable enhanced monitoring (PER_TOPIC_PER_PARTITION) on your cluster. Alternatively, you can enable Open Monitoring on your cluster and use a Prometheus server to capture partition-level metrics from the brokers in your cluster. Consumer lag metrics are available on port 11001, the same port as other Apache Kafka metrics.
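For example, partition-level metrics can be enabled on an existing provisioned cluster with the UpdateMonitoring API. A minimal boto3 sketch, with a placeholder cluster ARN:

```python
import boto3

kafka = boto3.client("kafka", region_name="us-east-1")
cluster_arn = "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc-123"  # placeholder

# Each update must reference the cluster's latest CurrentVersion.
current = kafka.describe_cluster(ClusterArn=cluster_arn)["ClusterInfo"]["CurrentVersion"]

kafka.update_monitoring(
    ClusterArn=cluster_arn,
    CurrentVersion=current,
    EnhancedMonitoring="PER_TOPIC_PER_PARTITION",  # partition-level consumer lag
)
```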
Q: How much does it cost to publish the consumer lag metric to Amazon CloudWatch?
Topic level metrics are included in the default set of Amazon MSK metrics, which are free of charge. Partition level metrics are charged as per Amazon CloudWatch pricing.
Q: How do I access Apache Kafka broker Logs?
You can enable broker log delivery for provisioned clusters and deliver broker logs to Amazon CloudWatch Logs, Amazon Simple Storage Service (Amazon S3), and Amazon Kinesis Data Firehose. Kinesis Data Firehose supports Amazon OpenSearch Service among other destinations. To learn how to enable this feature, see the Amazon MSK logging documentation. To learn about pricing, refer to the CloudWatch Logs and Kinesis Data Firehose pricing pages.
Q: What is the logging level for broker logs?
Amazon MSK provides INFO level logs for all brokers within a provisioned cluster.
Q: How do I access Apache ZooKeeper Logs?
You can request Apache ZooKeeper logs through a support ticket.
Q: Can I log the use of Apache Kafka resource APIs, like create topic?
Yes, if you use IAM Access Control, the use of Apache Kafka resource APIs is logged to AWS CloudTrail.
Metadata Management
Q: What is Apache ZooKeeper?
From https://zookeeper.apache.org: “Apache ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications,” including Apache Kafka.
Q: Does Amazon MSK use Apache ZooKeeper?
Yes, Amazon MSK uses Apache ZooKeeper for metadata management. Additionally, starting from Apache Kafka version 3.7, you can create clusters in either ZooKeeper mode or in KRaft mode. A cluster created with KRaft mode uses KRaft controllers for metadata management instead of ZooKeeper nodes.
Q: What is KRaft?
KRaft (Apache Kafka Raft) is the consensus protocol that shifts metadata management in Apache Kafka clusters from external Apache ZooKeeper nodes to a group of controllers within Apache Kafka. This change allows metadata to be stored and replicated as topics within Apache Kafka brokers, resulting in faster propagation of metadata. Refer to our developer guide.
Q: Are there any API changes required to use KRaft mode on Amazon MSK vs ZooKeeper mode?
There are no API changes required to use KRaft mode on Amazon MSK. However, if your clients still use the --zookeeper connection string today, you should update them to use the --bootstrap-server connection string to connect to your cluster and perform admin actions. The --zookeeper flag is deprecated in Apache Kafka version 2.5 and is removed starting with Apache Kafka 3.0. We therefore recommend you use recent Apache Kafka client versions and the --bootstrap-server connection string.
Q: I have tools that connect to ZooKeeper; how will these work for KRaft clusters without ZooKeeper?
You should check that any tools you use are capable of using Apache Kafka Admin APIs without ZooKeeper connections. For example, see our updated documentation on using Cruise Control for KRaft mode clusters. Cruise Control has also published steps to follow to run Apache Kafka without a ZooKeeper connection.
Q: Can I host more partitions per broker on KRaft based clusters than ZooKeeper based clusters?
The number of partitions per broker is the same on KRaft and ZooKeeper based clusters. However, KRaft allows you to host more partitions per cluster by provisioning more brokers in a cluster.
Integrations
Amazon MSK provisioned clusters integrate with:
- Amazon S3, using Kinesis Data Firehose to deliver data from MSK to Amazon S3 in an easy, no-code manner
- Amazon Virtual Private Cloud (Amazon VPC) for network isolation and security
- Amazon CloudWatch for metrics
- AWS Key Management Service (AWS KMS) for storage volume encryption
- AWS IAM for authentication and authorization of Apache Kafka and service APIs
- AWS Lambda for MSK event sourcing
- AWS IoT for IoT event sourcing
- AWS Glue Schema Registry for controlling the evolution of schemas used by Apache Kafka applications
- AWS CloudTrail for AWS API logs
- AWS Certificate Manager for Private CAs used for client TLS authentication
- AWS CloudFormation for describing and provisioning Amazon MSK clusters using code
- Amazon Managed Service for Apache Flink for fully managed Apache Flink applications that process streaming data
- Amazon Managed Service for Apache Flink Studio for interactive streaming SQL on Apache Kafka
- AWS Secrets Manager for client credentials used for SASL/SCRAM authentication
MSK Serverless clusters integrate with:
- Amazon S3, using Kinesis Data Firehose to deliver data from MSK to Amazon S3 in an easy, no-code manner
- Amazon VPC for network isolation and security
- Amazon CloudWatch for metrics
- AWS IAM for authentication and authorization of Apache Kafka and service APIs
- AWS Glue Schema Registry for controlling the evolution of schemas used by Apache Kafka applications
- AWS CloudTrail for AWS API logs
- AWS PrivateLink for private connectivity
Scaling
You can scale up storage in your provisioned clusters using the AWS Management Console or the AWS CLI. You can also use tiered storage to virtually store unlimited data on your cluster without having to add brokers for storage. In serverless clusters, storage is scaled seamlessly based on your usage.
Apache Kafka stores data in files called log segments. As each segment is completed, based on the size configured at the cluster or topic level, it is copied to the low-cost storage tier. Data is held in performance-optimized storage for a specified retention time, or size, and then deleted. There is a separate time and size limit setting for the low-cost storage, which will be longer than that of the primary storage tier. If clients request data from segments stored in the low-cost tier, the broker reads the data from that tier and serves it in the same way as if it were being served from the primary storage.
Q: How can I automatically expand storage in my cluster?
You can create an auto-scaling storage policy using the AWS Management Console or by creating an AWS Application Auto Scaling policy using the AWS CLI or APIs.
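A sketch of that Application Auto Scaling setup via boto3 follows; the capacities and the 60% storage-utilization target are illustrative choices, not defaults:

```python
import boto3

autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")
cluster_arn = "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc-123"  # placeholder

# Register broker storage as a scalable target for this cluster.
autoscaling.register_scalable_target(
    ServiceNamespace="kafka",
    ResourceId=cluster_arn,
    ScalableDimension="kafka:broker-storage:VolumeSize",
    MinCapacity=1000,   # current per-broker volume size in GiB
    MaxCapacity=4000,   # ceiling for automatic expansion
)

# Expand storage automatically when disk utilization crosses the target.
autoscaling.put_scaling_policy(
    PolicyName="broker-storage-scaling",
    ServiceNamespace="kafka",
    ResourceId=cluster_arn,
    ScalableDimension="kafka:broker-storage:VolumeSize",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,  # illustrative 60% utilization target
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "KafkaBrokerStorageUtilization",
        },
    },
)
```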
Q: Can I scale the number of brokers in an existing cluster?
Yes. You can choose to increase or decrease the number of brokers for provisioned Amazon MSK clusters.
Q: Can I scale the broker size in an existing cluster?
Yes. You can choose to scale to a smaller or larger broker type on your provisioned Amazon MSK clusters.
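Both operations are available through the service APIs as well. A boto3 sketch with placeholder values; note that each update must reference the cluster's latest CurrentVersion, so they are fetched fresh here:

```python
import boto3

kafka = boto3.client("kafka", region_name="us-east-1")
cluster_arn = "arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc-123"  # placeholder


def latest_version():
    # Each update call must pass the cluster's latest CurrentVersion.
    return kafka.describe_cluster(ClusterArn=cluster_arn)["ClusterInfo"]["CurrentVersion"]


# Expand from 3 to 6 brokers (broker counts should be a multiple of the
# number of AZs the cluster spans).
kafka.update_broker_count(
    ClusterArn=cluster_arn,
    CurrentVersion=latest_version(),
    TargetNumberOfBrokerNodes=6,
)

# Or move all brokers to a larger instance type in a rolling fashion
# (wait for the previous update to finish before issuing this one).
kafka.update_broker_type(
    ClusterArn=cluster_arn,
    CurrentVersion=latest_version(),
    TargetInstanceType="kafka.m7g.xlarge",
)
```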
Q: How do I balance partitions across brokers?
You can use Cruise Control for automatically rebalancing partitions to manage I/O heat. See the Cruise Control documentation for more information. Alternatively, you can use the Apache Kafka Admin API kafka-reassign-partitions.sh to reassign partitions across brokers. In serverless clusters, Amazon MSK automatically balances partitions.
Pricing and availability
Q: How does Amazon MSK pricing work?
Pricing depends on the resources you create. You can learn more by visiting our pricing page.
Q: Do I pay for data transfer as a result of data replication?
No, all in-cluster data transfer is included with the service at no additional charge.
Q: What AWS regions offer Amazon MSK?
Amazon MSK region availability is documented here.
Q: How does data transfer pricing work?
With provisioned clusters, you will pay standard AWS data transfer charges for data transferred in and out of an Amazon MSK cluster. You will not be charged for data transfer within the cluster in a region, including data transfer between brokers and data transfer between brokers and Apache ZooKeeper nodes.
With serverless clusters, you will pay standard AWS data transfer charges for data transferred to or from another region and for data transferred out to the public internet.
Compliance
- HIPAA eligible
- PCI
- ISO
- SOC 1, 2, 3
For a complete list of AWS services and compliance programs, please see AWS Services in Scope by Compliance Program.
Service Level Agreement
Replication
Q: What is Amazon MSK Replicator?
Amazon MSK Replicator is a feature of Amazon MSK that enables customers to reliably replicate data across Amazon MSK clusters in different AWS regions (cross-region replication, or CRR) or within the same AWS region (same-region replication, or SRR), without writing code or managing infrastructure. You can use CRR to build highly available and fault-tolerant multi-region streaming applications for increased resiliency. You can also use CRR to provide lower-latency access to consumers in different geographic regions. You can use SRR to distribute data from one cluster to many clusters for sharing data with your partners and teams. You can also use SRR or CRR to aggregate data from multiple clusters into one for analytics.
Q: How do I use MSK Replicator?
To set up replication between a pair of source and target MSK clusters, you need to create a Replicator in the destination AWS region. To create a Replicator, you specify details that include the Amazon Resource Name (ARN) of the source and target MSK clusters and an AWS Identity and Access Management (IAM) role that MSK Replicator can use to access the clusters. You will need to create the target MSK cluster if it does not already exist. You also have the option to configure additional settings, including the topic name configuration and the starting position of the Replicator.
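As a heavily hedged sketch, a Replicator can be created with the CreateReplicator API via boto3. All ARNs, subnets, security groups, and the IAM role below are placeholders, and the field names should be checked against the current API reference:

```python
import boto3

# Create the Replicator in the destination region.
kafka = boto3.client("kafka", region_name="us-west-2")

SOURCE_ARN = "arn:aws:kafka:us-east-1:123456789012:cluster/source-cluster/uuid-1"  # placeholder
TARGET_ARN = "arn:aws:kafka:us-west-2:123456789012:cluster/target-cluster/uuid-2"  # placeholder

kafka.create_replicator(
    ReplicatorName="orders-crr",
    ServiceExecutionRoleArn="arn:aws:iam::123456789012:role/msk-replicator-role",
    KafkaClusters=[
        {"AmazonMskCluster": {"MskClusterArn": SOURCE_ARN},
         "VpcConfig": {"SubnetIds": ["subnet-aaaa"], "SecurityGroupIds": ["sg-src"]}},
        {"AmazonMskCluster": {"MskClusterArn": TARGET_ARN},
         "VpcConfig": {"SubnetIds": ["subnet-bbbb"], "SecurityGroupIds": ["sg-dst"]}},
    ],
    ReplicationInfoList=[{
        "SourceKafkaClusterArn": SOURCE_ARN,
        "TargetKafkaClusterArn": TARGET_ARN,
        "TargetCompressionType": "NONE",
        # Allow-lists of regexes for topics and consumer groups to replicate.
        "TopicReplication": {"TopicsToReplicate": [".*"]},
        "ConsumerGroupReplication": {"ConsumerGroupsToReplicate": [".*"]},
    }],
)
```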
Q: Which type of Kafka clusters are supported by MSK Replicator?
MSK Replicator supports replication across MSK clusters only. Both provisioned and serverless MSK cluster types are supported. You can also use MSK Replicator to move from provisioned to serverless or vice versa. Other Kafka clusters are not supported.
Q: Can I specify which topics I want to replicate?
Yes, you can specify which topics you want to replicate using allow and deny lists while creating the Replicator.
Q: Does MSK Replicator replicate topic settings and consumer group offsets?
Yes. MSK Replicator automatically replicates the necessary Kafka metadata like topic configuration, Access Control Lists (ACLs), and consumer group offsets so that consuming applications can resume processing seamlessly after failover. You can choose to turn off one or more of these settings if you only want to replicate the data. You can also specify which consumer groups you want to replicate using allow or deny lists while creating the Replicator.
Q: Do I need to scale the replication when my ingress throughput changes?
No, MSK Replicator automatically deploys, provisions and scales the underlying replication infrastructure to support changes in your ingress throughput.
Q: Can I replicate data across MSK clusters in different AWS accounts?
No, MSK Replicator only supports replication across MSK clusters in the same AWS account.
Q: How can I monitor the replication?
You can use Amazon CloudWatch in the destination region to view the ReplicationLatency, MessageLag, and ReplicatorThroughput metrics at a topic and aggregate level for each Replicator at no additional charge. Metrics are visible under the ReplicatorName dimension in the AWS/Kafka namespace. You can also check the ReplicatorFailure, AuthError, and ThrottleTime metrics to see whether your Replicator is running into any issues.
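For example, ReplicationLatency can be queried from CloudWatch in the destination region. The Replicator name in this boto3 sketch is a placeholder:

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

now = datetime.now(timezone.utc)
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/Kafka",
    MetricName="ReplicationLatency",
    Dimensions=[{"Name": "ReplicatorName", "Value": "orders-crr"}],  # placeholder
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,                  # 5-minute datapoints
    Statistics=["Average"],
)
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])
```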
Q: Can I use MSK Replicator to replicate data from one cluster to multiple clusters or replicate data from many clusters to one?
Yes. You simply need to create a different Replicator for each source and target cluster pair.
Q: How does MSK Replicator connect to the source and target MSK clusters?
MSK Replicator uses IAM Access Control to connect to your source and target clusters. You need to turn on IAM Access Control on your source and target MSK clusters to create a Replicator. You can continue to use other authentication methods, including SASL/SCRAM and mTLS, at the same time for your clients, since Amazon MSK supports multiple authentication methods simultaneously.
Q: How much replication latency should I expect with MSK Replicator?
MSK Replicator replicates data asynchronously. Replication latency varies based on many factors including the network distance between the AWS regions of your MSK clusters, your source and target clusters’ throughput capacity and the number of partitions on your source and target clusters. If you are experiencing high latency, follow our Troubleshooting guide.
Q: Can I keep topic names the same with MSK Replicator?
Yes. MSK Replicator allows you to choose between the “Prefix” and “Identical” topic name configurations while creating a new Replicator. By default, MSK Replicator creates new topics in the target cluster with an auto-generated prefix added to the topic name. For instance, MSK Replicator will replicate data in “topic” from the source cluster to a new topic in the target cluster called “<sourceKafkaClusterAlias>.topic”. You can find the prefix that will be added to the topic names in the target cluster under the “sourceKafkaClusterAlias” field using the DescribeReplicator API or the Replicator details page on the MSK console. If you want to replicate topics to the target cluster while preserving the original names, simply set the topic name configuration to “Identical”.
Q: Why should I use “Identical” topic name configuration?
You should use the Identical topic name configuration to simplify building highly available, resilient streaming data architectures across AWS regions. By replicating Kafka topics to other Amazon MSK clusters while preserving the original topic names, this configuration reduces the need to reconfigure client applications during replication setup or failover events. This makes it easier to establish active-passive cluster configurations spanning multiple regions for increased resiliency and availability. It also streamlines use cases like data aggregation across clusters, migrations between MSK clusters, and data distribution to multiple geographies. You should not use this configuration if your clients cannot process data when additional metadata is added to your Kafka record headers.
Q: Is there a risk of infinite replication loops with “Identical” topic name configuration?
No. MSK Replicator automatically prevents records from being replicated back to the Kafka topic they originated from, thus avoiding infinite replication loops. To achieve this, as part of the replication, MSK Replicator adds metadata to the headers of your records.
Q: Can I update my existing Replicator to use Identical topic name configuration?
No. Topic name configuration cannot be changed after a Replicator has been created.
Q: How can I use replication to increase the resiliency of my streaming application across regions?
You can use MSK Replicator to set up active-active or active-passive cluster topologies to increase the resiliency of your Kafka application across regions. In an active-active setup, both MSK clusters actively serve reads and writes. Comparatively, in an active-passive setup, only one MSK cluster at a time actively serves streaming data, while the other cluster is on standby. We recommend that you use the “Identical” topic name configuration for active-passive setups and the “Prefix” configuration for active-active setups. However, using the “Prefix” configuration will require you to reconfigure your consumers to read the replicated topics. If you want to avoid reconfiguring your clients, you can use the “Identical” configuration for active-active setups as well. However, you will pay additional data processing and data transfer charges for each Replicator, as each Replicator will need to process twice the usual amount of data: once for replication and again to prevent infinite loops.
Q: Which Kafka versions support Identical topic name configuration?
It is supported for all MSK clusters running on Kafka versions 2.8.1 and above.
Q: Can I replicate existing data on the source cluster?
Yes. By default, when you create a new Replicator, it starts replicating data from the tip of the stream (latest offset) on the source cluster. Alternatively, if you want to replicate existing data, you can configure a new Replicator to start replicating data from the earliest offset in the source cluster topic partitions.
Q: Can replication result in throttling consumers on the source cluster?
Since MSK Replicator acts as a consumer for your source cluster, it is possible that replication causes other consumers to be throttled on your source cluster. This depends on how much read capacity you have on your source cluster and the throughput of the data you are replicating. We recommend that you provision identical capacity for your source and target clusters and account for the replication throughput while calculating how much capacity you need. You can also set Kafka quotas for the Replicator on your source and target clusters to control how much capacity the Replicator can use.
Q: Can I compress data before writing to the target cluster?
Yes. While creating the Replicator, you can specify a compression codec of your choice from None, GZIP, Snappy, LZ4, and ZSTD.
Get started with Amazon MSK
Visit the Amazon MSK pricing page.
Learn how to set up your Apache Kafka cluster on Amazon MSK in this step-by-step guide.
Start running your Apache Kafka cluster on Amazon MSK. Log in to the Amazon MSK console.