Kafka

Production ready Kafka Connect

Posted by Tim te Beek

Kafka Connect is a great tool for streaming data between your Apache Kafka cluster and other data systems. Getting started with Kafka Connect is fairly easy; there are hundreds of connectors available to integrate with data stores, cloud platforms, other messaging systems and monitoring tools. Setting up a production-grade installation is slightly more involved, however, with documentation at times scattered across the web.

In this post we’ll set up a complete production-grade Kafka Connect installation, highlighting some of the choices and configuration quirks along the way.

For illustrative purposes we will set up a JDBC source connector that publishes any new table rows onto a Kafka topic, but you can substitute any other source or sink connector. All configuration is available on GitHub.
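As a preview, here is a minimal sketch of registering such a JDBC source connector through the Kafka Connect REST API. The connection details, table name (orders) and topic prefix below are illustrative placeholders, not the configuration from the repository.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterJdbcSource {
    public static void main(String[] args) throws Exception {
        // Connector configuration as JSON; the connection details,
        // table and topic prefix are illustrative placeholders.
        String config = """
                {
                  "name": "jdbc-source-orders",
                  "config": {
                    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                    "connection.url": "jdbc:postgresql://localhost:5432/shop",
                    "connection.user": "connect",
                    "connection.password": "secret",
                    "mode": "incrementing",
                    "incrementing.column.name": "id",
                    "table.whitelist": "orders",
                    "topic.prefix": "jdbc-"
                  }
                }
                """;

        // POST the configuration to the Kafka Connect REST API.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

With `mode=incrementing`, the connector polls the table and publishes only rows whose `id` column exceeds the last seen value, so each new row appears once on the `jdbc-orders` topic.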

Continue reading →

Replication of a single Avro serialized Kafka topic from one cluster to another

Posted by Tim te Beek

As more and more teams and companies adopt Apache Kafka, you can find yourself wanting to share data by replicating one or more topics from one cluster to another. While replication of an entire cluster with all of its topics as a means of failover can be achieved with tools such as Mirror Maker and Confluent Replicator, for replication of a single topic there are fewer examples. Even more so when the source topic is serialized with Avro, with the schema stored in Confluent Schema Registry.

Here we present a minimal consumer that replicates a single Avro serialized Kafka topic from one cluster to another, while ensuring (only) the necessary Avro schema is registered in the target cluster’s Schema Registry.
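The core of such a replicator can be sketched as a consume-and-produce loop using Confluent’s Avro (de)serializers: deserializing against the source registry and serializing against the target registry means only the schema of the replicated topic gets registered on the target side. The cluster addresses, registry URLs and topic name below are placeholders, not the exact code from the post.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TopicReplicator {
    public static void main(String[] args) {
        String topic = "orders"; // illustrative topic name

        // Consume from the source cluster, deserializing records
        // against the source cluster's Schema Registry.
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "source-kafka:9092");
        consumerProps.put("group.id", "topic-replicator");
        consumerProps.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer",
                "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        consumerProps.put("schema.registry.url", "http://source-registry:8081");

        // Produce to the target cluster; the Avro serializer registers
        // the record's schema in the target Schema Registry on first use.
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "target-kafka:9092");
        producerProps.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer",
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        producerProps.put("schema.registry.url", "http://target-registry:8081");

        try (KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(List.of(topic));
            while (true) {
                for (ConsumerRecord<String, GenericRecord> record :
                        consumer.poll(Duration.ofSeconds(1))) {
                    producer.send(new ProducerRecord<>(topic, record.key(), record.value()));
                }
            }
        }
    }
}
```

Reading values as `GenericRecord` avoids generating classes for the topic’s schema, so the same replicator works for any Avro serialized topic.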

Continue reading →
