Amazon MSK Connect

Serverless and auto-scaling Kafka Connect workloads with Amazon MSK

Why MSK Connect?

With Amazon MSK Connect, a feature of Amazon MSK, you can run fully managed Apache Kafka Connect workloads on AWS. This feature makes it easy to deploy, monitor, and automatically scale connectors that move data between Apache Kafka clusters and external systems such as databases, file systems, and search indices. MSK Connect is fully compatible with Kafka Connect, enabling you to lift and shift your Kafka Connect applications with zero code changes. With MSK Connect, you only pay for connectors you are running, without the need for cluster infrastructure.

Benefits

With MSK Connect, there is no need to provision infrastructure or Apache Kafka Connect clusters. The service handles the lifecycle of clusters, workers, and connectors. Monitoring, automatic restarts of failed connectors, and patching of the underlying infrastructure are all transparent to the end user, so you can focus on building your streaming data flows to and from Apache Kafka clusters.

Data flows frequently change, with data volumes from different sources going up and down. MSK Connect provides a serverless experience and scales the number of workers up and down automatically, so you don’t have to provision servers or clusters, and you pay only for what you need to move your streaming data to and from your Apache Kafka cluster.
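As a rough sketch of how this scaling behavior might be expressed, the snippet below builds a structure shaped like the `capacity` parameter of the MSK Connect CreateConnector API as exposed by boto3. The helper name `build_autoscaling_capacity`, the worker counts, and the CPU thresholds are illustrative assumptions, not recommended values.

```python
def build_autoscaling_capacity(min_workers, max_workers, mcu_count=1,
                               scale_in_cpu=20, scale_out_cpu=80):
    """Return a dict shaped like the `capacity` argument of CreateConnector.

    Hypothetical helper: all default values are placeholders, not tuning advice.
    """
    if min_workers < 1 or max_workers < min_workers:
        raise ValueError("need 1 <= min_workers <= max_workers")
    if not 0 < scale_in_cpu < scale_out_cpu <= 100:
        raise ValueError("need 0 < scale_in_cpu < scale_out_cpu <= 100")
    return {
        "autoScaling": {
            "minWorkerCount": min_workers,
            "maxWorkerCount": max_workers,
            "mcuCount": mcu_count,
            # Remove workers when average CPU drops below this percentage.
            "scaleInPolicy": {"cpuUtilizationPercentage": scale_in_cpu},
            # Add workers when average CPU rises above this percentage.
            "scaleOutPolicy": {"cpuUtilizationPercentage": scale_out_cpu},
        }
    }


# Assumed example values: scale between 1 and 4 workers.
capacity = build_autoscaling_capacity(min_workers=1, max_workers=4)
```

A dictionary like this could be passed as the `capacity` argument when creating a connector; the rest of the request (cluster, plugin, and role settings) is omitted here.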

MSK Connect is fully compatible with Apache Kafka Connect. This means you can run any connector compatible with Apache Kafka Connect 2.7.1 and above, whether it was developed by one of our partners, by the broad open-source community, or within your own organization. MSK Connect allows you to upload any third-party connector plugin and run it on AWS.

FAQs for Kafka Connect

Kafka Connect, an open source component of Apache Kafka, is a framework for connecting Apache Kafka with external systems such as databases, key-value stores, search indexes, and file systems.

A connector integrates external systems including AWS services with Apache Kafka by continuously copying streaming data from a data source into an Apache Kafka topic, or continuously copying data from an Apache Kafka topic into a data sink. A connector may perform lightweight logic such as transformation, format conversion, or filtering data before the data is delivered to a destination. Source connectors pull data from a data source and push this data into Apache Kafka, while sink connectors pull data from Apache Kafka and push this data into a data sink.
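The source-to-topic-to-sink flow described above can be illustrated with a toy in-memory model. This is not the Kafka Connect API: the `Topic` class, the `run_source` and `run_sink` functions, and the sample rows are all hypothetical names invented for this sketch.

```python
from collections import deque


class Topic:
    """In-memory stand-in for a Kafka topic (illustration only)."""

    def __init__(self):
        self._records = deque()

    def append(self, record):
        self._records.append(record)

    def poll(self):
        # Return the oldest record, or None when the topic is empty.
        return self._records.popleft() if self._records else None


def run_source(rows, topic, transform=None):
    """Source side: pull rows from a data source and push them into the topic."""
    for row in rows:
        record = transform(row) if transform else row
        if record is not None:  # a transform returning None filters the row out
            topic.append(record)


def run_sink(topic, sink):
    """Sink side: pull records from the topic and push them into a data sink."""
    record = topic.poll()
    while record is not None:
        sink.append(record)
        record = topic.poll()


def active_only(row):
    """Lightweight logic: filter inactive rows and convert the record format."""
    if not row["active"]:
        return None
    return {"id": row["id"], "name": row["name"].upper()}


database_rows = [
    {"id": 1, "name": "ada", "active": True},
    {"id": 2, "name": "bob", "active": False},
    {"id": 3, "name": "cy", "active": True},
]
topic = Topic()
search_index = []
run_source(database_rows, topic, transform=active_only)
run_sink(topic, search_index)
print(search_index)  # [{'id': 1, 'name': 'ADA'}, {'id': 3, 'name': 'CY'}]
```

In a real deployment the source and sink sides are separate connectors running continuously, and the topic lives in an Apache Kafka cluster rather than in process memory.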

MSK Connect runs any connector that implements the Kafka Connect interfaces. Connectors are available from many sources, including our partners, open source projects like Debezium, and commercial connector vendors like Confluent and lenses.io.

MSK Connect works with Amazon MSK clusters as well as other Apache Kafka and Kafka-compatible clusters, including self-managed clusters on Amazon EC2 or in non-AWS environments, subject to Amazon VPC connectivity.