Kafka Introduction

Harsh Chauhan
2 min read · Oct 6, 2021

Introduction

What is Kafka?

Kafka is a distributed messaging and streaming platform that uses a publish-subscribe mechanism to stream records between applications.

What is a messaging system?

A messaging system transfers data from one application to another, so the applications do not need to depend on each other's data schema or on how the data is delivered.

There are two types of messaging systems:

  1. Point-to-Point System
  • Once the sender sends a message, the receiver gets a notification.
  • The message is deleted once the receiver reads it.
  • Each message can be consumed by at most one receiver.

  2. Publish-Subscribe Messaging System

  • A message can be consumed by any number of subscribers.
  • Messages are available to read only for a limited time (the retention period).
  • Kafka follows the publish-subscribe model.
  • When a subscriber reads a message, the publisher does not get a notification.
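
The difference between the two models can be sketched in a few lines of plain Python. This is only an in-memory illustration, not Kafka code: the class names `PointToPointQueue` and `PubSubTopic` and all method names are made up for this example.

```python
from collections import deque


class PointToPointQueue:
    """Each message is delivered to at most one receiver, then removed."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Reading removes the message, so no other receiver can see it.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Messages are retained; every subscriber reads at its own position."""

    def __init__(self):
        self._log = []
        self._positions = {}  # subscriber name -> index of next message to read

    def publish(self, message):
        # The publisher only appends; it is never told who reads the message.
        self._log.append(message)

    def subscribe(self, name):
        self._positions[name] = 0

    def poll(self, name):
        pos = self._positions[name]
        if pos >= len(self._log):
            return None
        self._positions[name] = pos + 1
        return self._log[pos]


queue = PointToPointQueue()
queue.send("order-1")
first = queue.receive()   # the only receiver that gets "order-1"
second = queue.receive()  # nothing left: the message was deleted on read

topic = PubSubTopic()
topic.subscribe("billing")
topic.subscribe("shipping")
topic.publish("order-1")
billing_msg = topic.poll("billing")
shipping_msg = topic.poll("shipping")  # the same message is still available
```

In the pub-sub sketch the message stays in the log after it is read, which is why a second subscriber can still consume it.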

Terminology

Topics:

  1. A topic is a named stream of related records stored on the brokers.
  2. Each topic name must be unique within the cluster.
  3. You can create as many topics as you need.

Partitions

  1. Each topic is divided into one or more partitions.
  2. Each message within a partition has a unique, sequential ID known as its offset.
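
As a rough mental model, a partition behaves like an append-only list, and the offset is the position of a message in that list. The sketch below is an in-memory illustration only; `Partition` and `Topic` are hypothetical classes for this example, not part of any Kafka client.

```python
class Partition:
    """An append-only log; the offset is the index of a record in the log."""

    def __init__(self):
        self._records = []

    def append(self, value):
        offset = len(self._records)  # next free position
        self._records.append(value)
        return offset

    def read(self, offset):
        return self._records[offset]


class Topic:
    """A named collection of partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]


topic = Topic("orders", num_partitions=2)
offset_a = topic.partitions[0].append("a")
offset_b = topic.partitions[0].append("b")
offset_c = topic.partitions[1].append("c")
```

Offsets are counted per partition, so `offset_a` and `offset_c` are both 0 here even though they belong to different messages.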

Replication

  1. Replicas are backup copies of a partition.
  2. Clients do not write to backup replicas directly; by default, reads and writes go through the partition's leader replica.
  3. The replicas of each partition are placed on different brokers.
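
To illustrate the last point, here is a simplified round-robin placement that puts every replica of a partition on a different broker. It is a sketch under assumptions, not Kafka's actual assignment algorithm; the function `assign_replicas` and the broker names are invented for this example.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Return {partition: [broker, ...]} with no broker repeated per partition."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        # Start at a different broker for each partition and walk forward,
        # so the copies of one partition never share a broker.
        assignment[partition] = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
    return assignment


assignment = assign_replicas(
    num_partitions=3,
    brokers=["broker-0", "broker-1", "broker-2"],
    replication_factor=2,
)
```

In this sketch the first broker in each list can be read as the leader for that partition and the rest as its backups.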

Producers

Producers are applications that publish data to topics within the cluster using the producer APIs.

  1. Depending on its configuration, a producer can spread data across all partitions of a topic or write to specific partitions.
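
One way to picture how a producer picks a partition is the small selector below. It is a hedged sketch: `SimplePartitioner` is a hypothetical class, and `zlib.crc32` is only a stand-in hash (Kafka's own clients use a different hash function and may distribute unkeyed records differently).

```python
import itertools
import zlib


class SimplePartitioner:
    """Chooses a partition: explicit partition, else key hash, else round-robin."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def choose(self, key=None, partition=None):
        if partition is not None:
            # The caller asked for a specific partition.
            return partition
        if key is not None:
            # Same key -> same partition (stand-in hash, not Kafka's).
            return zlib.crc32(key.encode("utf-8")) % self.num_partitions
        # No key: spread records over all partitions in turn.
        return next(self._round_robin)


partitioner = SimplePartitioner(num_partitions=3)
keyed_first = partitioner.choose(key="user-42")
keyed_second = partitioner.choose(key="user-42")
explicit = partitioner.choose(partition=2)
spread = [partitioner.choose() for _ in range(3)]
```

Records with the same key always land in the same partition in this sketch, which keeps them in order relative to each other.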

Consumers

Consumers are applications that read data from topics within the cluster using the consumer APIs.

  1. A consumer can read from a whole topic or from specific partitions.

Brokers

  1. Brokers are software processes that manage topics and the messages published to them.
  2. They are also known as Kafka servers.
  3. Brokers also manage metadata and consumer offsets, which are used to deliver the right messages to each consumer.
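
The role of consumer offsets can be sketched as a small lookup table that remembers where each consumer stopped reading. Everything here is illustrative: `OffsetStore`, `consume`, and the group and topic names are assumptions for this example, not Kafka APIs.

```python
class OffsetStore:
    """Remembers, per consumer group and partition, the next offset to read."""

    def __init__(self):
        self._committed = {}  # (group, topic, partition) -> next offset

    def commit(self, group, topic, partition, next_offset):
        self._committed[(group, topic, partition)] = next_offset

    def fetch(self, group, topic, partition):
        # A group with no committed offset starts from the beginning here.
        return self._committed.get((group, topic, partition), 0)


def consume(records, store, group, topic, partition, max_records):
    """Read up to max_records from the stored position, then save the new one."""
    start = store.fetch(group, topic, partition)
    batch = records[start:start + max_records]
    store.commit(group, topic, partition, start + len(batch))
    return batch


partition_records = ["a", "b", "c", "d", "e"]
store = OffsetStore()
first_batch = consume(partition_records, store, "billing", "orders", 0, 2)
second_batch = consume(partition_records, store, "billing", "orders", 0, 2)
other_group = consume(partition_records, store, "shipping", "orders", 0, 2)
```

Because the position is stored per group, the `billing` group continues where it left off while the `shipping` group starts from the first record.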

ZooKeeper

  1. Monitors the Kafka cluster and coordinates the brokers.
  2. Keeps all metadata related to the Kafka cluster as key-value pairs.
  3. Is used for controller (master broker) election within the Kafka cluster.
  4. The metadata includes:
  • Configuration information
  • Health status
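
Controller election on top of a key-value store can be pictured as "the first broker to create the controller entry wins". The sketch below is an in-memory simplification; `MetadataStore` and `elect_controller` are hypothetical names, and the real coordination service adds sessions, watches, and failure detection that are left out here.

```python
class MetadataStore:
    """A tiny key-value store standing in for the coordination service."""

    def __init__(self):
        self._data = {}

    def create_if_absent(self, path, value):
        # Only the first writer succeeds; later attempts are rejected.
        if path in self._data:
            return False
        self._data[path] = value
        return True

    def get(self, path):
        return self._data.get(path)

    def delete(self, path):
        self._data.pop(path, None)


def elect_controller(store, broker_ids):
    """Every broker tries to claim the controller entry; the first one wins."""
    for broker_id in broker_ids:
        store.create_if_absent("/controller", broker_id)
    return store.get("/controller")


store = MetadataStore()
controller = elect_controller(store, [2, 0, 1])

# Simulate the controller going away, then let the remaining brokers retry.
store.delete("/controller")
new_controller = elect_controller(store, [0, 1])
```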
