To understand the importance of In-Sync Replicas (ISR) in Apache Kafka, let’s take a closer look at how replication works in a Kafka cluster. Replication means maintaining multiple copies of each partition’s data on different brokers. Because identical copies live on different brokers, a multi-node cluster can keep serving client requests when a broker fails or becomes unreachable. When creating a topic in a multi-node Kafka cluster, we therefore specify the replication factor, which determines how many copies of each partition are kept. The replication factor cannot exceed the number of brokers, so on a single-node Kafka cluster it must be one. It can be increased later, through a partition reassignment, once more brokers are available in the cluster.
Single-Node Kafka Cluster
Even in a single-node Kafka cluster, a broker can host multiple partitions, because each topic can be divided into one or more partitions. Partitions are subdivisions of a topic, spread across the brokers in the cluster, and each partition holds the actual data in the form of messages. Internally, each partition is an append-only log: new records are always written to the end. During topic creation, the topic is divided into the number of partitions specified. In a multi-node cluster, partitioning lets messages be distributed in parallel across brokers, which is how Kafka scales to handle many producers and consumers at once. More partitions generally mean higher throughput, but they come at a cost. Each partition maps to a directory in the broker’s file system, so increasing the partition count increases the number of open file handles the broker needs.
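To make the idea of a partition as an append-only log concrete, here is a minimal sketch in Python. It is a toy model, not Kafka code: the `Partition` and `Topic` classes are hypothetical, records are routed by a message key (an assumption, since keys are not discussed above), and CRC32 merely stands in for whatever hash a real Kafka client applies.

```python
from dataclasses import dataclass, field
from zlib import crc32


@dataclass
class Partition:
    """Toy model of one partition: an append-only list of records."""
    records: list = field(default_factory=list)

    def append(self, value: str) -> int:
        """Append a record and return its offset (its position in the log)."""
        self.records.append(value)
        return len(self.records) - 1


@dataclass
class Topic:
    """Toy topic split into a fixed number of partitions at creation time."""
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        self.partitions = [Partition() for _ in range(self.num_partitions)]

    def send(self, key: str, value: str) -> tuple:
        """Route a record to a partition by key and return (partition, offset)."""
        # Assumption: CRC32 is only a stand-in for the client's real key hash.
        index = crc32(key.encode("utf-8")) % self.num_partitions
        offset = self.partitions[index].append(value)
        return index, offset


topic = Topic(num_partitions=3)
print(topic.send("user-1", "logged in"))
print(topic.send("user-1", "logged out"))
```

In this model, records with the same key always land in the same partition, and offsets within a partition only ever grow.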
Replication in Apache Kafka
In short, replication in Apache Kafka refers to the practice of having multiple copies of data spread across multiple brokers. This ensures high availability in case of broker failures or unavailability within a multi-node Kafka cluster.
Now that we have discussed replication and partitions in Apache Kafka, let’s turn to In-Sync Replicas (ISR). Each partition has one leader replica; by default, all producer and consumer requests for that partition go to the leader. The other replicas are followers, which fetch records from the leader. The ISR is the set of replicas that are “in sync” with the leader: the leader itself plus every follower that has kept up with it. For instance, if the replication factor for a topic is set to 3, Kafka stores each partition’s log on three different brokers. A record is considered committed once every replica currently in the ISR has replicated it; if all three replicas are in sync, that means all three.
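The relationship between the leader, its followers, and the ISR can be sketched as a small Python model. This is an illustration under stated assumptions, not Kafka's implementation: the `Replica` and `PartitionState` classes are hypothetical, and "in sync" is judged here by a record-count threshold (`max_lag`), whereas Kafka judges it by how long a follower has been behind (the `replica.lag.time.max.ms` broker setting).

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of a partition's log, hosted on a given broker."""
    broker_id: int
    log: list = field(default_factory=list)

    @property
    def log_end_offset(self) -> int:
        """Offset that the next appended record would receive."""
        return len(self.log)


@dataclass
class PartitionState:
    leader: Replica
    followers: list
    max_lag: int = 2  # hypothetical threshold, counted in records

    def in_sync_replicas(self) -> list:
        """Leader plus every follower within max_lag records of the leader."""
        isr = [self.leader]
        for follower in self.followers:
            lag = self.leader.log_end_offset - follower.log_end_offset
            if lag <= self.max_lag:
                isr.append(follower)
        return isr

    def committed_offset(self) -> int:
        """Records below this offset exist on every in-sync replica."""
        return min(r.log_end_offset for r in self.in_sync_replicas())


leader = Replica(1, ["a", "b", "c", "d", "e"])
fast = Replica(2, ["a", "b", "c", "d", "e"])
slow = Replica(3, ["a"])
state = PartitionState(leader, [fast, slow])
print([r.broker_id for r in state.in_sync_replicas()])
print(state.committed_offset())
```

In this model the slow follower on broker 3 falls outside the ISR, so it no longer holds back the committed offset. Once it catches up to within `max_lag`, it rejoins the ISR and counts toward commitment again.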
Multi-Node Kafka Cluster
In a multi-node Kafka cluster, leadership is assigned per partition rather than per cluster: for each partition, the broker that hosts the leader replica handles the read and write requests, while the brokers hosting the followers passively replicate the leader to keep their copies consistent. Each partition has only one leader at a time, and different partitions can have their leaders on different brokers, which spreads the load. If a leader fails, one of the followers takes over. The election is carried out by the cluster controller, which in ZooKeeper-based deployments relies on Apache ZooKeeper for coordination (newer Kafka versions can run without ZooKeeper, in KRaft mode). By default, the new leader is chosen from the replicas in the ISR.
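A leader failover of this kind can be sketched in a few lines of Python. The function below is a hypothetical illustration, not Kafka's controller code; it assumes the simple rule of picking the first replica in the partition's assignment list that is both in the ISR and on a live broker.

```python
def elect_leader(replicas, isr, live_brokers):
    """Return the broker id of the new leader, or None if no candidate exists.

    replicas:     broker ids assigned to the partition, in assignment order
    isr:          set of broker ids currently in sync
    live_brokers: set of broker ids that are currently reachable
    """
    for broker_id in replicas:
        if broker_id in isr and broker_id in live_brokers:
            return broker_id
    # No in-sync replica is alive: in this sketch the partition has no leader.
    return None


# Broker 1 led the partition and has just failed; brokers 2 and 3 were in sync.
print(elect_leader(replicas=[1, 2, 3], isr={2, 3}, live_brokers={2, 3}))
```

Restricting candidates to the ISR is what keeps committed records safe across a failover: any replica in the ISR already holds every committed record.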
When all replicas in the ISR for a partition have written a record to their logs, the record is considered “committed,” and consumers can only read committed records. The minimum in-sync replica count (the min.insync.replicas setting) specifies how many replicas must be in sync for a write to succeed when the producer requests acknowledgment from all in-sync replicas (acks=all).
Understanding Partitions
- Partitioning in Apache Kafka allows a topic to be divided into multiple sub-divisions called partitions.
- Each partition holds a portion of the topic’s data and is stored as an append-only log.
- Partitioning enables parallel message distribution among several brokers in the cluster, facilitating scalability for consumers and producers.
Although a higher minimum in-sync replica count gives stronger durability guarantees, it can hurt availability. If fewer replicas than the configured minimum are in sync when a producer using acks=all publishes, the broker rejects the write, so the partition stays unavailable for writes until enough replicas are back in sync.
For example, take a three-node Kafka cluster with a topic whose replication factor is three and whose minimum in-sync replica count is also three. If one node goes down or becomes unreachable, the replica on that broker can no longer fetch from the leader and is removed from the ISR. Only two in-sync replicas remain, one fewer than the configured minimum, so the two surviving brokers reject writes to that partition from producers using acks=all until the third replica rejoins the ISR. Consumers can still read records that were already committed.
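The three-node scenario can be replayed with a minimal Python sketch. It models only the acceptance check, under the assumption that the minimum applies when the producer requests acknowledgment from all in-sync replicas (acks=all); the `NotEnoughReplicas` exception and both functions are hypothetical names for this illustration.

```python
class NotEnoughReplicas(Exception):
    """Hypothetical error for this sketch: the ISR is below the configured minimum."""


def try_append(isr_size: int, min_insync_replicas: int, acks: str = "all") -> str:
    """Accept a write, or raise if too few replicas are in sync.

    Assumption: the minimum is enforced only for acks="all".
    """
    if acks == "all" and isr_size < min_insync_replicas:
        raise NotEnoughReplicas(
            f"ISR size {isr_size} is below the minimum of {min_insync_replicas}"
        )
    return "accepted"


def scenario(isr_size: int, min_insync_replicas: int) -> str:
    """Attempt one write and report the outcome as a string."""
    try:
        return try_append(isr_size, min_insync_replicas)
    except NotEnoughReplicas as err:
        return f"rejected: {err}"


# Three replicas with a minimum of three: healthy cluster, then one broker lost.
print(scenario(3, 3))
print(scenario(2, 3))
# The same failure with a minimum of two: writes keep flowing.
print(scenario(2, 2))
```

The last case shows the usual trade-off: with three replicas, a minimum of two tolerates the loss of one broker without blocking producers, while a minimum of three blocks writes as soon as any single replica falls out of sync.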
The post Understanding In-Sync Replicas (ISR) in Apache Kafka appeared first on Datafloq.