    How to Monitor Apache Kafka

    Overview

    Apache Kafka is a distributed event streaming platform capable of handling trillions of events per day. It powers high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Xitoring's Kafka integration provides comprehensive monitoring of broker health, message throughput, consumer groups, and partition states.

    What Can It Monitor?

    Broker Metrics

    • Active Controllers — Number of active controller brokers
    • Under-Replicated Partitions — Partitions with insufficient replicas
    • Offline Partitions — Partitions without an active leader
    • Broker Count — Total brokers in the cluster
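
    To make the partition definitions above concrete, here is a minimal sketch of how under-replicated and offline partitions can be counted from partition metadata. The `PartitionState` type and its field names are hypothetical and chosen for illustration; this is not Xitoring's implementation. The sketch assumes a partition without a leader is counted as offline only, not as under-replicated.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PartitionState:
    """Hypothetical snapshot of one partition's replication state."""
    topic: str
    partition: int
    leader: Optional[int]  # broker id of the leader, or None if there is none
    replicas: List[int]    # broker ids assigned to hold a replica
    isr: List[int]         # broker ids currently in sync with the leader


def is_offline(p: PartitionState) -> bool:
    # A partition with no active leader cannot serve reads or writes.
    return p.leader is None


def is_under_replicated(p: PartitionState) -> bool:
    # Fewer in-sync replicas than assigned replicas. Leaderless partitions
    # are excluded here and reported as offline instead (an assumption).
    return p.leader is not None and len(p.isr) < len(p.replicas)


def summarize(partitions: List[PartitionState]) -> Dict[str, int]:
    return {
        "under_replicated": sum(1 for p in partitions if is_under_replicated(p)),
        "offline": sum(1 for p in partitions if is_offline(p)),
    }


example = [
    PartitionState("orders", 0, leader=1, replicas=[1, 2, 3], isr=[1, 2, 3]),
    PartitionState("orders", 1, leader=2, replicas=[1, 2, 3], isr=[2]),
    PartitionState("orders", 2, leader=None, replicas=[1, 2, 3], isr=[]),
]
print(summarize(example))  # {'under_replicated': 1, 'offline': 1}
```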

    Throughput Metrics

    • Messages In per Second — Rate of incoming messages
    • Bytes In / Out per Second — Network throughput
    • Requests per Second — Produce and fetch request rates
    • Failed Produce / Fetch Requests — Error rates
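
    Per-second rates such as the ones above are commonly derived from two samples of a cumulative counter. The following sketch shows that arithmetic under the assumption that the counter only grows and restarts from zero when the broker restarts; the function name and parameters are illustrative and not part of Xitogent.

```python
def rate_per_second(prev_count: int, prev_time: float,
                    curr_count: int, curr_time: float) -> float:
    """Average rate between two samples of a cumulative counter."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    delta = curr_count - prev_count
    if delta < 0:
        # The counter went backwards, so assume it was reset (for example
        # by a broker restart) and count only what accumulated since then.
        delta = curr_count
    return delta / elapsed


# 3,000 new messages over 60 seconds
print(rate_per_second(1000, 0.0, 4000, 60.0))  # 50.0
```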

    Consumer Metrics

    • Consumer Lag — Messages behind the latest offset per consumer group
    • Consumer Group Count — Active consumer groups
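
    Consumer lag for a partition is the latest (log end) offset minus the offset the consumer group has committed. The sketch below works that out per partition and in total from plain dictionaries; the data shapes are assumptions for illustration. Partitions without a committed offset are skipped, because their lag depends on how the consumer is configured to start.

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]  # (topic name, partition number)


def consumer_lag(end_offsets: Dict[TopicPartition, int],
                 committed_offsets: Dict[TopicPartition, int]
                 ) -> Dict[TopicPartition, int]:
    """Per-partition lag for one consumer group."""
    lag = {}
    for tp, end_offset in end_offsets.items():
        committed = committed_offsets.get(tp)
        if committed is None:
            continue  # nothing committed yet, so lag is undefined here
        # Clamp at zero in case the two snapshots were taken at slightly
        # different moments.
        lag[tp] = max(end_offset - committed, 0)
    return lag


end = {("orders", 0): 150, ("orders", 1): 90}
committed = {("orders", 0): 120, ("orders", 1): 90}

per_partition = consumer_lag(end, committed)
print(per_partition)                # {('orders', 0): 30, ('orders', 1): 0}
print(sum(per_partition.values()))  # 30
```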

    Resource Metrics

    • CPU Usage — Broker CPU utilization
    • Memory Usage — JVM heap and non-heap memory
    • Disk Usage — Log segment storage consumption
    • Network I/O — Broker network throughput
    • Open File Descriptors — File handles in use
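
    Disk usage on a Kafka cluster grows with ingest rate, retention time, and replication factor. As a rough worked example, the sketch below estimates the steady-state log volume across the cluster. It is a back-of-the-envelope estimate that ignores index files, compaction, and size-based retention; the function name is illustrative.

```python
def estimated_retained_bytes(bytes_in_per_sec: int,
                             retention_hours: int,
                             replication_factor: int) -> int:
    """Approximate bytes held on disk cluster-wide once retention is reached."""
    retention_seconds = retention_hours * 3600
    return bytes_in_per_sec * retention_seconds * replication_factor


# 5 MB/s of producer traffic, 168 hours of retention (Kafka's default for
# log.retention.hours), replication factor 3
total = estimated_retained_bytes(5_000_000, 168, 3)
print(total)          # 9072000000000
print(total / 1e12)   # 9.072 (terabytes)
```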

    Prerequisites

    None. The integration needs no additional software dependencies; you only need Xitogent installed on the server and the connection details for your Kafka broker.

    How to Activate the Integration

    Run the Xitogent CLI:

    xitogent integrate
    

    Select Kafka from the list of available integrations. When prompted, provide connection details for your Kafka broker.

    Xitogent tests the connection and completes setup automatically. Within moments, real-time graphs and data appear on your server page.
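
    If the connection test fails, it can help to confirm that the broker's listener is reachable from the server before running the integration again. The sketch below is a plain TCP reachability check using Python's standard library. It is not how Xitogent performs its own test, and port 9092 is only Kafka's conventional default, so substitute the host and port of your own listener.

```python
import socket


def broker_reachable(host: str, port: int = 9092, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example call (adjust host and port to your environment):
# broker_reachable("localhost", 9092)
```

    A successful TCP connection only shows that something is listening on that port; it does not verify authentication or TLS settings.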

    Setting Up Triggers

    Available trigger parameters include:

    • Active Controllers / Under-Replicated Partitions / Offline Partitions
    • Messages In per Second / Bytes In / Bytes Out
    • Requests per Second / Failed Requests
    • Consumer Lag
    • CPU / Memory / Disk Usage

    Navigate to Triggers on your server page, select Kafka, choose a metric, set your threshold, and configure notification channels.

    Tips

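    • Monitor Under-Replicated Partitions: a sustained non-zero value means replicas are falling behind or a broker is down, which makes it one of the most important Kafka health indicators
    • Set alerts on Consumer Lag to detect consumers falling behind producers
    • Track Offline Partitions to catch leader election failures; any value above zero means those partitions cannot be read or written
    • Watch Disk Usage: Kafka keeps log segments until the retention policy removes them, so high-throughput topics can fill disks quickly
    • Monitor Active Controllers: there should always be exactly one across the cluster
    • Use Failed Produce Requests alerts to catch producer issues before data loss occurs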
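
    The tips above amount to a small set of threshold rules. The following sketch expresses them as a function over a metrics snapshot, which may be useful when deciding which thresholds to enter in your triggers. The metric key names and the default lag threshold are hypothetical and do not reflect Xitoring's internal naming.

```python
from typing import Dict, List


def check_health(metrics: Dict[str, float],
                 max_consumer_lag: float = 10_000) -> List[str]:
    """Return a list of human-readable problems found in a metrics snapshot."""
    problems = []
    if metrics["active_controllers"] != 1:
        problems.append("active controller count is not exactly 1")
    if metrics["under_replicated_partitions"] > 0:
        problems.append("some partitions are under-replicated")
    if metrics["offline_partitions"] > 0:
        problems.append("some partitions have no active leader")
    if metrics["consumer_lag"] > max_consumer_lag:
        problems.append("consumer lag exceeds the configured threshold")
    if metrics["failed_produce_requests_per_sec"] > 0:
        problems.append("produce requests are failing")
    return problems


healthy = {
    "active_controllers": 1,
    "under_replicated_partitions": 0,
    "offline_partitions": 0,
    "consumer_lag": 250,
    "failed_produce_requests_per_sec": 0,
}
degraded = dict(healthy, under_replicated_partitions=4, consumer_lag=50_000)

print(check_health(healthy))        # []
print(len(check_health(degraded)))  # 2
```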