Kafka Introduction
Oct 6, 2021
Introduction
What is Kafka?
Kafka is a distributed streaming platform that uses a publish-subscribe mechanism to stream records.
What is a Messaging System?
A messaging system is responsible for transferring data from one application to another, without the applications having to depend on each other's data schema.
There are 2 types of messaging systems:
1. Point-to-Point Messaging System
- Once the sender sends a message, the receiver gets a notification
- The message is deleted once the receiver reads it
- Each message can be consumed by at most one receiver
2. Publish-Subscribe Messaging System
- A message can be consumed by any number of subscribers
- There is a time limit (the retention period) within which a message can be read
- Kafka follows the pub-sub messaging model
- When a subscriber reads a message, the publisher does not get any notification
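The difference between the two models can be sketched with a small in-memory example. This is a toy illustration only, not Kafka code: the class names `PointToPointQueue` and `PubSubLog` and the subscriber names are made up for this sketch, and retention time limits are left out.

```python
from collections import deque


class PointToPointQueue:
    """Toy queue: each message goes to one receiver and is then removed."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Reading removes the message, so no other receiver can see it.
        return self._messages.popleft() if self._messages else None


class PubSubLog:
    """Toy pub-sub log: messages are kept and every subscriber can read them."""

    def __init__(self):
        self._log = []
        self._positions = {}  # subscriber name -> next index to read

    def publish(self, message):
        # The publisher is never told who reads the message.
        self._log.append(message)

    def poll(self, subscriber):
        # Each subscriber tracks its own position; reading deletes nothing.
        start = self._positions.get(subscriber, 0)
        self._positions[subscriber] = len(self._log)
        return self._log[start:]


queue = PointToPointQueue()
queue.send("order-1")
first_read = queue.receive()    # the single receiver gets the message
second_read = queue.receive()   # nothing is left for anyone else

log = PubSubLog()
log.publish("order-1")
billing_view = log.poll("billing")     # both subscribers see the same message
shipping_view = log.poll("shipping")
```

In the queue, the second read finds nothing because the first read consumed the message. In the log, both subscribers receive the message independently.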
Terminology
Topics
- A topic is a named collection of related records stored on the brokers
- Each topic name must be unique within the cluster
- You can create as many topics as you need
Partitions
- Each topic is divided into one or more partitions
- Each message within a partition has a unique ID known as its offset
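To make the idea of offsets concrete, here is a minimal sketch that models a partition as an append-only list whose index acts as the offset. The `Partition` class is hypothetical and exists only for this illustration; it is not how Kafka stores data on disk.

```python
class Partition:
    """Toy append-only partition; the list index plays the role of the offset."""

    def __init__(self):
        self._records = []

    def append(self, value):
        offset = len(self._records)  # next free position in this partition
        self._records.append(value)
        return offset

    def read(self, offset):
        return self._records[offset]


# A toy topic with two partitions, keyed by partition number.
topic = {0: Partition(), 1: Partition()}

offsets = [
    topic[0].append("a"),
    topic[0].append("b"),
    topic[1].append("c"),
]
```

Note that offsets are unique only within a partition: the first record in partition 1 gets offset 0 again, so a message is identified by its partition together with its offset.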
Replication
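- Replicas are backup copies of a partition
- By default, clients do not read from or write to the backup replicas directly; reads and writes go through the partition's leader
- The replicas of each partition are placed on different brokers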
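The placement rule can be sketched as a simple round-robin assignment. This is a simplified assumption for illustration: the function `assign_replicas` and the broker IDs are invented here, and Kafka's real assignment logic takes more factors into account.

```python
def assign_replicas(num_partitions, replication_factor, broker_ids):
    """Place each partition's replicas on distinct brokers, round-robin."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        # Start at a different broker for each partition and take the next
        # `replication_factor` brokers, wrapping around the broker list.
        assignment[partition] = [
            broker_ids[(partition + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
    return assignment


# Hypothetical cluster of three brokers, three partitions, two copies each.
assignment = assign_replicas(3, 2, [101, 102, 103])
```

Because the replication factor never exceeds the number of brokers, no broker holds two copies of the same partition, so losing one broker leaves at least one copy of every partition available.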
Producers
Producers are applications that publish data to topics within the cluster using the producer APIs.
- Producers can write to all partitions of a topic or to specific partitions, depending on configuration.
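One way to picture how a producer picks a partition is the sketch below. It is an illustrative stand-in, not Kafka's partitioner: `ToyPartitioner` is a made-up class, and CRC32 is used here only as a convenient stable hash, which is an assumption of this example.

```python
import itertools
import zlib


class ToyPartitioner:
    """Toy partition chooser: explicit partition, then key hash, then round-robin."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def choose(self, key=None, partition=None):
        if partition is not None:
            # The caller asked for a specific partition.
            return partition
        if key is not None:
            # The same key always maps to the same partition.
            return zlib.crc32(key.encode("utf-8")) % self.num_partitions
        # No key and no partition: spread records across all partitions.
        return next(self._round_robin)


partitioner = ToyPartitioner(3)
keyless = [partitioner.choose() for _ in range(4)]   # cycles through partitions
keyed_a = partitioner.choose(key="user-42")
keyed_b = partitioner.choose(key="user-42")          # same key, same partition
pinned = partitioner.choose(partition=2)             # explicit partition wins
```

Sending records with the same key to the same partition keeps them in order relative to each other, since ordering is tracked per partition.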
Consumers
Consumers are applications that read data from topics within the cluster using the consumer APIs.
- Consumers can read either at the topic level or from specific partitions.
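The two ways of reading can be sketched with a plain dictionary standing in for a topic. The helper names `read_topic` and `read_partition` are hypothetical and are not part of any Kafka client.

```python
# Toy topic: partition number -> records in offset order.
topic = {0: ["a", "b"], 1: ["c"]}


def read_topic(topic):
    """Topic-level read: return (partition, offset, value) for every partition."""
    return [
        (partition, offset, value)
        for partition in sorted(topic)
        for offset, value in enumerate(topic[partition])
    ]


def read_partition(topic, partition, start_offset=0):
    """Partition-level read: return records of one partition from start_offset on."""
    records = topic[partition][start_offset:]
    return [
        (partition, offset, value)
        for offset, value in enumerate(records, start=start_offset)
    ]


all_records = read_topic(topic)
only_partition_0 = read_partition(topic, 0, start_offset=1)
```

Reading at the topic level covers every partition, while reading a specific partition lets the consumer choose exactly where to start.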
Brokers
- Brokers are software processes that manage topics and the messages published to them.
- They are also known as Kafka servers.
- Brokers also manage metadata and consumer offsets, which are used to deliver the right messages to the right consumer.
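A rough sketch of why stored consumer offsets matter: they let a consumer resume where it left off. `ToyOffsetStore` is an invented in-memory stand-in, and the group name "billing" and topic name "orders" are assumptions made for this example.

```python
class ToyOffsetStore:
    """Toy stand-in for the broker-side record of committed consumer offsets."""

    def __init__(self):
        self._committed = {}  # (group, topic, partition) -> next offset to read

    def commit(self, group, topic, partition, next_offset):
        self._committed[(group, topic, partition)] = next_offset

    def fetch(self, group, topic, partition):
        # A consumer with no committed offset starts from the beginning here.
        return self._committed.get((group, topic, partition), 0)


records = ["a", "b", "c"]  # contents of one partition, in offset order
store = ToyOffsetStore()

# First session: read two records, then commit the position reached.
start = store.fetch("billing", "orders", 0)
first_batch = records[start:start + 2]
store.commit("billing", "orders", 0, start + len(first_batch))

# Later session: resume from the committed offset instead of re-reading.
resume = store.fetch("billing", "orders", 0)
second_batch = records[resume:]
```

Because the position is stored outside the consumer, a restarted consumer picks up at the third record rather than starting over.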
ZooKeeper
- Used to monitor the Kafka cluster and coordinate the brokers
- Keeps all metadata related to the Kafka cluster in the form of key-value pairs
- Used for controller (master broker) election within the Kafka cluster
- Metadata includes:
  - Configuration information
  - Health status
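Controller election on top of a key-value store can be sketched as "the first broker to create the controller key wins". This is a toy model under stated assumptions: `ToyCoordinator` is an in-memory dictionary and not a ZooKeeper client, and deleting the key by hand stands in for the controller failing.

```python
class ToyCoordinator:
    """Toy key-value store standing in for the coordination service."""

    def __init__(self):
        self._store = {}

    def create_if_absent(self, key, value):
        # Only the first writer succeeds; later attempts are rejected.
        if key in self._store:
            return False
        self._store[key] = value
        return True

    def get(self, key):
        return self._store.get(key)

    def delete(self, key):
        self._store.pop(key, None)


def elect_controller(coordinator, broker_ids):
    """Each broker tries to claim the controller key; return the winner."""
    for broker_id in broker_ids:
        if coordinator.create_if_absent("/controller", broker_id):
            break
    return coordinator.get("/controller")


coordinator = ToyCoordinator()
first_controller = elect_controller(coordinator, [1, 2, 3])

# Simulate the controller going away, then let the remaining brokers retry.
coordinator.delete("/controller")
second_controller = elect_controller(coordinator, [2, 3])
```

At any moment only one broker holds the key, and when it disappears the remaining brokers compete for it again.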