Linux Format

Basic concepts


A messaging system such as Kafka enables you to send messages between processes, applications and servers. Applications connect to Kafka to send or get data. Strictly speaking, a Kafka ‘topic’ is a unit of storage in Kafka: data in Kafka is stored in topics. A topic is sharded into partitions, hence the term topic partition. Therefore, you can think of a Kafka topic-partition tuple as an append-only log. Simplistically and without considering partitions, you can think of a Kafka topic as a table in a relational database.
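To make the append-log idea concrete, here is a minimal in-memory sketch in Python. It does not talk to Kafka or use any Kafka client library: the `PartitionLog` class, its methods and the topic name are hypothetical names invented for illustration, and a real broker keeps its partitions on disk rather than in a Python list.

```python
class PartitionLog:
    """Toy model of one topic-partition: an append-only list of records."""

    def __init__(self, topic, partition):
        self.topic = topic
        self.partition = partition
        self._records = []  # records are only ever appended, never edited

    def append(self, record):
        """Add a record at the end of the log and return its position."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, position):
        """Return the record stored at the given position."""
        return self._records[position]

    def __len__(self):
        return len(self._records)


log = PartitionLog("linux-format", 0)  # hypothetical topic name
first = log.append(b"hello")
second = log.append(b"world")
```

The only write operation is `append()`, which is what makes this a log rather than a table: existing records keep their position, and new data always goes at the end.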

A Kafka ‘broker’ is a Kafka server. Several Kafka brokers working together form a Kafka cluster. A Kafka ‘producer’ is a program that writes data to a Kafka topic, whereas a Kafka ‘consumer’ is a program that reads data from a Kafka topic. A ‘partition’ is a smaller part of a Kafka topic. A Kafka topic consists of one or more partitions – how many mainly depends on the amount of data that you have to deal with. It is better for topics with huge amounts of data to have multiple partitions. Among other things, this enables multiple consumers to read from the same topic in parallel. Additionally, a topic can be split across multiple brokers at the partition level. Replication in Kafka also happens at the partition level. Each record in a partition is identified by an offset, which is its position in that partition. As Kafka knows nothing about the format of a record, the offset is really important information. Last, Zookeeper (https://zookeeper.apache.org) is used for doing the housekeeping of Kafka by storing configuration, naming and synchronisation information. This means that Kafka as described here cannot run without Zookeeper, although newer Kafka releases can use the built-in KRaft mode instead.
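The relationship between producers, partitions, offsets and consumers can be sketched with a small in-memory model. This is an illustration under stated assumptions, not how Kafka is implemented: the `Topic` and `Consumer` classes and the topic name are hypothetical, there is no broker or network involved, and CRC32 merely stands in for whatever hash a real client uses to map a record key to a partition.

```python
import zlib


class Topic:
    """Toy in-memory topic made of several append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        """Append a record and return its (partition, offset) pair."""
        # Assumption: CRC32 stands in for the hash a real client would use.
        partition = zlib.crc32(key) % len(self.partitions)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset


class Consumer:
    """Toy consumer that reads a single partition and remembers its offset."""

    def __init__(self, topic, partition):
        self.topic = topic
        self.partition = partition
        self.offset = 0  # offset of the next record to read

    def poll(self):
        """Return all unread records and advance the stored offset."""
        log = self.topic.partitions[self.partition]
        records = log[self.offset:]
        self.offset = len(log)
        return records


topic = Topic("readings", num_partitions=3)  # hypothetical topic name
placed = [topic.produce(key, b"v%d" % i)
          for i, key in enumerate([b"a", b"b", b"a", b"c"])]

# One consumer per partition, so the partitions can be read independently.
consumers = [Consumer(topic, p) for p in range(3)]
consumed = [record for c in consumers for record in c.poll()]
```

Because each consumer only stores an offset, it can stop and later resume from where it left off without the log needing to know anything about the contents of the records.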
