Friday, March 7, 2025

Kafka

  • What is Apache Kafka?

    • A distributed streaming platform for building real-time data pipelines and applications.

  • How is Kafka's messaging system different from other messaging frameworks?

    • It provides high throughput, fault tolerance, and durability with a distributed architecture.

  • Describe Kafka's multiple components.

    • Broker, Zookeeper, Producer, Consumer, Topic, and Partition.

  • What is an offset in Kafka?

    • A unique identifier assigned to messages within a partition.

  • Define a consumer group in Kafka.

    • A set of consumers that share a group.id; Kafka assigns each partition of a subscribed topic to exactly one consumer in the group, so members divide the work and track their own offsets (see the sketch below).
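
    • A minimal consumer-group sketch in Java (the group id "inventory-readers", topic "orders", and broker address are placeholder assumptions, not values from this post):

        import java.time.Duration;
        import java.util.Collections;
        import java.util.Properties;
        import org.apache.kafka.clients.consumer.ConsumerRecord;
        import org.apache.kafka.clients.consumer.ConsumerRecords;
        import org.apache.kafka.clients.consumer.KafkaConsumer;

        public class ConsumerGroupSketch {
            public static void main(String[] args) {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
                props.put("group.id", "inventory-readers");         // all consumers sharing this id form one group
                props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
                props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

                try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                    consumer.subscribe(Collections.singletonList("orders"));   // partitions are split among group members
                    while (true) {
                        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                        for (ConsumerRecord<String, String> r : records) {
                            System.out.printf("partition=%d offset=%d value=%s%n", r.partition(), r.offset(), r.value());
                        }
                    }
                }
            }
        }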

  • What is the importance of Zookeeper in Kafka?

    • In ZooKeeper-based clusters, it coordinates brokers, handles controller election, and stores cluster metadata.

  • Can Kafka be used without Zookeeper?

    • Yes, in newer versions: Kafka 2.8 introduced KRaft mode, which replaces ZooKeeper with a built-in Raft controller quorum; older, ZooKeeper-based deployments still depend on it.

  • What are the advantages of Kafka?

    • High throughput, scalability, fault tolerance, and real-time processing.

  • What is a Kafka topic?

    • A logical channel to which producers publish messages and from which consumers read them; each topic is split into partitions.
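
    • Topics can be created programmatically; a minimal AdminClient sketch in Java (the topic name, partition count, and replication factor are illustrative assumptions):

        import java.util.Collections;
        import java.util.Properties;
        import org.apache.kafka.clients.admin.AdminClient;
        import org.apache.kafka.clients.admin.NewTopic;

        public class CreateTopicSketch {
            public static void main(String[] args) throws Exception {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");   // assumed broker address

                try (AdminClient admin = AdminClient.create(props)) {
                    // 6 partitions, replication factor 3 (illustrative values)
                    NewTopic topic = new NewTopic("orders", 6, (short) 3);
                    admin.createTopics(Collections.singletonList(topic)).all().get();
                }
            }
        }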

  • Explain the role of the Kafka Producer API.

    • It allows applications to publish messages to Kafka topics.
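
    • A minimal producer sketch in Java (the broker address and topic name are placeholder assumptions):

        import java.util.Properties;
        import org.apache.kafka.clients.producer.KafkaProducer;
        import org.apache.kafka.clients.producer.ProducerRecord;

        public class ProducerSketch {
            public static void main(String[] args) {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
                props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
                props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

                try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                    // publish one record to the "orders" topic; the key also determines the partition
                    producer.send(new ProducerRecord<>("orders", "order-42", "{\"amount\": 10}"));
                }
            }
        }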

  • What is a Kafka broker?

    • A server that receives messages from producers, stores them on disk, and serves them to consumers; a Kafka cluster is made up of one or more brokers.

  • Describe the function of the offset.

    • Tracks the position of a consumer in a partition.

  • What is a Queue-Full Exception in Kafka?

    • An exception raised when the producer generates messages faster than they can be sent and its internal queue/buffer fills up; in the modern Java client, send() instead blocks for up to max.block.ms before failing.

  • How does Kafka define the terms "leader" and "follower"?

    • The leader of a partition handles all reads and writes; followers replicate the leader's log, and one of them takes over if the leader fails.

  • What is an In-Sync Replica (ISR)?

    • Replicas of a partition that are up-to-date with the leader.

  • How does Kafka handle message retention?

    • It retains messages based on configurable time or size policies.
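
    • Retention is a per-topic (or broker-wide) setting; a sketch of changing it with the AdminClient in Java (the topic name and the 7-day value are assumptions):

        import java.util.Collections;
        import java.util.Map;
        import java.util.Properties;
        import org.apache.kafka.clients.admin.AdminClient;
        import org.apache.kafka.clients.admin.AlterConfigOp;
        import org.apache.kafka.clients.admin.ConfigEntry;
        import org.apache.kafka.common.config.ConfigResource;

        public class RetentionSketch {
            public static void main(String[] args) throws Exception {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");   // assumed broker address

                try (AdminClient admin = AdminClient.create(props)) {
                    ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders");
                    // keep messages for 7 days (retention.ms); retention.bytes would cap by size instead
                    AlterConfigOp setRetention = new AlterConfigOp(
                            new ConfigEntry("retention.ms", "604800000"), AlterConfigOp.OpType.SET);
                    admin.incrementalAlterConfigs(Map.of(topic, Collections.singletonList(setRetention))).all().get();
                }
            }
        }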

  • What is log compaction in Kafka?

    • A cleanup policy that retains only the latest value for each record key, discarding older updates (see the sketch below).
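
    • Compaction is enabled per topic with cleanup.policy=compact; a sketch creating such a topic in Java (the topic name and sizing are illustrative assumptions):

        import java.util.Collections;
        import java.util.Map;
        import java.util.Properties;
        import org.apache.kafka.clients.admin.AdminClient;
        import org.apache.kafka.clients.admin.NewTopic;

        public class CompactedTopicSketch {
            public static void main(String[] args) throws Exception {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");   // assumed broker address

                try (AdminClient admin = AdminClient.create(props)) {
                    // a compacted topic keeps only the newest value per key (e.g. the latest user profile)
                    NewTopic topic = new NewTopic("user-profiles", 3, (short) 3)
                            .configs(Map.of("cleanup.policy", "compact"));
                    admin.createTopics(Collections.singletonList(topic)).all().get();
                }
            }
        }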

  • Explain Kafka's partitioning strategy.

    • Messages with a key are hashed to a partition, so the same key always lands in the same partition; keyless messages are distributed across partitions by the default partitioner (see the sketch below).
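
    • A short sketch of the routing options on the producer side (topic, key, and partition values are illustrative):

        import java.util.Properties;
        import org.apache.kafka.clients.producer.KafkaProducer;
        import org.apache.kafka.clients.producer.ProducerRecord;

        public class PartitioningSketch {
            public static void main(String[] args) {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
                props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
                props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

                try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                    // keyed record: the key is hashed, so "customer-7" always maps to the same partition
                    producer.send(new ProducerRecord<>("orders", "customer-7", "event-1"));
                    // keyless record: the partitioner spreads these across partitions over time
                    producer.send(new ProducerRecord<>("orders", null, "event-2"));
                    // explicit partition: bypass the partitioner and write to partition 0
                    producer.send(new ProducerRecord<>("orders", 0, "customer-7", "event-3"));
                }
            }
        }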

  • How does Kafka ensure data durability?

    • By replicating each partition across multiple brokers and letting producers require acknowledgement from the in-sync replicas before a write is considered successful (see the sketch below).
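
    • On the producer side this is tightened with acks=all and idempotence; a hedged configuration sketch (the broker address, topic name, and replication values mentioned in the comments are assumptions):

        import java.util.Properties;
        import org.apache.kafka.clients.producer.KafkaProducer;
        import org.apache.kafka.clients.producer.ProducerConfig;
        import org.apache.kafka.clients.producer.ProducerRecord;

        public class DurableProducerSketch {
            public static void main(String[] args) {
                Properties props = new Properties();
                props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
                props.put(ProducerConfig.ACKS_CONFIG, "all");                 // wait until all in-sync replicas have the write
                props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");  // avoid duplicates on retries
                props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                        "org.apache.kafka.common.serialization.StringSerializer");
                props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                        "org.apache.kafka.common.serialization.StringSerializer");

                // with a topic replication factor of 3 and min.insync.replicas=2, an acknowledged
                // write survives the loss of one broker
                try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                    producer.send(new ProducerRecord<>("payments", "tx-1", "committed"));
                }
            }
        }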

  • What is the role of Kafka Connect?

    • A framework to stream data between Kafka and other systems.

  • Describe Kafka Streams.

    • A library for building real-time streaming applications.
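
    • A minimal Streams topology sketch in Java that copies records from one topic to another with uppercased values (the application id and topic names are placeholder assumptions):

        import java.util.Properties;
        import org.apache.kafka.common.serialization.Serdes;
        import org.apache.kafka.streams.KafkaStreams;
        import org.apache.kafka.streams.StreamsBuilder;
        import org.apache.kafka.streams.StreamsConfig;
        import org.apache.kafka.streams.kstream.KStream;

        public class StreamsSketch {
            public static void main(String[] args) {
                Properties props = new Properties();
                props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");      // hypothetical app id
                props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
                props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
                props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

                StreamsBuilder builder = new StreamsBuilder();
                KStream<String, String> input = builder.stream("input-topic");  // hypothetical topic names
                input.mapValues(v -> v.toUpperCase()).to("output-topic");

                KafkaStreams streams = new KafkaStreams(builder.build(), props);
                streams.start();
                Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
            }
        }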

  • What is the difference between Kafka and traditional message queues?

    • Kafka persists messages in a replicated log that multiple consumer groups can re-read independently, whereas traditional queues typically remove messages once consumed; it also offers higher throughput and built-in stream processing.

  • How does Kafka handle fault tolerance?

    • Through data replication and distributed architecture.

  • What is the role of Kafka's replication factor?

    • Determines how many copies of data are maintained.

  • Explain the concept of Kafka's consumer lag.

    • The per-partition difference between the latest offset written (log-end offset) and the offset the consumer group has committed; growing lag means consumers are falling behind producers (see the sketch below).
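
    • Lag can be computed by comparing committed offsets with the latest (log-end) offsets; a sketch using the AdminClient (the group id "billing-service" and broker address are assumptions):

        import java.util.Map;
        import java.util.Properties;
        import java.util.stream.Collectors;
        import org.apache.kafka.clients.admin.AdminClient;
        import org.apache.kafka.clients.admin.ListOffsetsResult;
        import org.apache.kafka.clients.admin.OffsetSpec;
        import org.apache.kafka.clients.consumer.OffsetAndMetadata;
        import org.apache.kafka.common.TopicPartition;

        public class ConsumerLagSketch {
            public static void main(String[] args) throws Exception {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");   // assumed broker address

                try (AdminClient admin = AdminClient.create(props)) {
                    // offsets the group has committed, per partition
                    Map<TopicPartition, OffsetAndMetadata> committed =
                            admin.listConsumerGroupOffsets("billing-service")
                                 .partitionsToOffsetAndMetadata().get();
                    // latest offset written to each of those partitions
                    Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
                            admin.listOffsets(committed.keySet().stream()
                                    .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest())))
                                 .all().get();
                    committed.forEach((tp, meta) ->
                            System.out.println(tp + " lag=" + (latest.get(tp).offset() - meta.offset())));
                }
            }
        }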

  • How does Kafka achieve high throughput?

    • By batching and compressing messages, writing sequentially to disk, and using zero-copy transfer to minimize network and I/O overhead (see the sketch below).
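
    • Batching is tuned on the producer; a hedged sketch of throughput-oriented settings (the specific values and names are illustrative, not recommendations):

        import java.util.Properties;
        import org.apache.kafka.clients.producer.KafkaProducer;
        import org.apache.kafka.clients.producer.ProducerConfig;
        import org.apache.kafka.clients.producer.ProducerRecord;

        public class ThroughputSketch {
            public static void main(String[] args) {
                Properties props = new Properties();
                props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
                props.put(ProducerConfig.LINGER_MS_CONFIG, "20");          // wait up to 20 ms to fill a batch
                props.put(ProducerConfig.BATCH_SIZE_CONFIG, "65536");      // 64 KB batches
                props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");  // compress whole batches on the wire
                props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                        "org.apache.kafka.common.serialization.StringSerializer");
                props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                        "org.apache.kafka.common.serialization.StringSerializer");

                try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                    for (int i = 0; i < 1_000; i++) {
                        producer.send(new ProducerRecord<>("metrics", "sensor-" + (i % 10), "value-" + i));
                    }
                }
            }
        }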

  • What are Kafka's key use cases?

    • Real-time analytics, log aggregation, event sourcing, and stream processing.

  • How does Kafka handle backpressure?

    • Consumers pull data rather than having it pushed, so each consumer reads at its own pace and can pause/resume partitions or limit max.poll.records.

  • What is the role of Kafka's log segments?

    • A partition's log is divided into segment files; only the newest (active) segment is written to, and retention and compaction operate on older, closed segments.

  • How does Kafka integrate with other big data tools?

    • Through connectors and APIs for seamless data transfer.
