Apache BookKeeper

Apache BookKeeper is a scalable, fault-tolerant, low-latency storage service developed by the Apache Software Foundation and optimized for real-time workloads. Its log-oriented storage abstraction, the ledger, provides reliable, replicated storage of sequential data. BookKeeper serves as the durable log storage layer in Apache Pulsar and other distributed messaging and stream processing systems. It offers a Java client API for working with ledgers and an HTTP Admin REST API for cluster management, bookie monitoring, and auto-recovery operations.

2 APIs · 5 Capabilities · 8 Features · 51.3 / 100 (developing)
Tags: Apache, Distributed Systems, Log Storage, Open Source, Storage, Streaming

API Rating

51.3 / 100 (developing)
Scored 2026-05-20 · rubric v0.3
Discoverability: 80.0
Contract Quality: 64.1
Governance: 47.4
Operational Transparency: 52.6
Developer Ergonomics: 23.9
Commercial Clarity: 50.0

APIs

Apache BookKeeper Admin API

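The Apache BookKeeper HTTP Admin API provides REST endpoints for managing and monitoring BookKeeper clusters, bookies, ledgers, and auto-recovery operations. It enables programmatic cluster administration, ledger inspection, bookie health monitoring, and garbage collection management.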

Apache BookKeeper Java Client API

The BookKeeper Java client API provides programmatic access for creating, writing, reading, and managing ledgers. It supports both the legacy LedgerHandle API and the newer Ledger API with explicit durability guarantees.

Capabilities

Apache BookKeeper Admin API — Auto Recovery

4 operations. Lead operation: Apache BookKeeper Decommission Bookie. Self-contained Naftiko capability covering one Apache Bookkeepe...

Apache BookKeeper Admin API — Bookies

7 operations. Lead operation: Apache BookKeeper Get Cluster Info. Self-contained Naftiko capability covering one Apache BookKeeper busines...

Apache BookKeeper Admin API — Configuration

2 operations. Lead operation: Apache BookKeeper Get Server Configuration. Self-contained Naftiko capability covering one Apache Book...

Apache BookKeeper Admin API — Ledgers

4 operations. Lead operation: Apache BookKeeper Delete Ledger. Self-contained Naftiko capability covering one Apache BookKeeper business s...

Apache BookKeeper Admin API — Monitoring

2 operations. Lead operation: Apache BookKeeper Get Heartbeat Status. Self-contained Naftiko capability covering one Apache BookKeeper ...

Features

Ledger Storage

Append-only log segments called ledgers provide the foundational storage primitive for reliable sequential data storage.

Ensemble Replication

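Each ledger's entries are written across a configurable ensemble of bookies. The write quorum sets how many bookies store each entry, and the ack quorum sets how many of them must acknowledge the entry before the write is confirmed, which together provide fault tolerance.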

Auto-Recovery

Built-in under-replication detection and automatic ledger re-replication when bookie nodes fail.

HTTP Admin API

RESTful HTTP Admin API for managing ledgers, bookies, cluster configuration, and triggering recovery operations.

Metrics Export

Prometheus-format metrics endpoint for monitoring bookie performance and storage utilization.

Auditor Election

ZooKeeper-based leader election for the auditor role responsible for detecting under-replicated ledgers.

Garbage Collection

Configurable garbage collection for reclaiming storage from deleted or expired ledger data.

Journal and Ledger Storage

Journal and ledger data are stored on separate paths, keeping journal writes sequential for throughput while ledger storage serves random reads.
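
The quorum parameters named under Ensemble Replication can be illustrated with a small model. This is a minimal sketch, not BookKeeper code: `QuorumConfig` and its methods are hypothetical names, and it assumes round-robin striping of entries across the ensemble along with the usual ordering ensemble size >= write quorum >= ack quorum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical helper type; not part of the BookKeeper client API."""

    ensemble_size: int  # E: bookies that hold the ledger's entries
    write_quorum: int   # Qw: bookies each entry is written to
    ack_quorum: int     # Qa: acknowledgements needed to confirm a write

    def __post_init__(self) -> None:
        # Assumed constraint: E >= Qw >= Qa >= 1.
        if not (self.ensemble_size >= self.write_quorum >= self.ack_quorum >= 1):
            raise ValueError(
                "expected ensemble_size >= write_quorum >= ack_quorum >= 1"
            )

    def write_set(self, entry_id: int) -> list[int]:
        """Ensemble positions that store an entry, assuming round-robin striping."""
        start = entry_id % self.ensemble_size
        return [(start + i) % self.ensemble_size for i in range(self.write_quorum)]

    def tolerated_write_failures(self) -> int:
        """Bookies in a write set that may not respond while the write still confirms."""
        return self.write_quorum - self.ack_quorum


cfg = QuorumConfig(ensemble_size=5, write_quorum=3, ack_quorum=2)
print(cfg.write_set(4), cfg.tolerated_write_failures())
```

With an ensemble of 5, a write quorum of 3, and an ack quorum of 2, entry 4 lands on ensemble positions 4, 0, and 1, and the write can be confirmed while one of those three bookies has not yet responded.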

Use Cases

Durable Log Storage

Serve as the replicated, durable write-ahead log for Apache Pulsar topics and distributed streaming systems.

Distributed Transaction Logs

Store distributed transaction log segments for systems requiring exactly-once semantics and durable commit records.

Metadata Store

Persist metadata and configuration data for distributed systems requiring consistent, replicated storage.

Stream Processing Storage

Provide low-latency, high-throughput sequential storage for real-time stream processing pipelines.

Cluster Administration

Monitor and manage BookKeeper clusters using the HTTP Admin API for operational visibility and recovery.
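
As a rough illustration of cluster administration over HTTP, the sketch below builds request URLs against the `baseURL` given in this listing (`http://localhost:8080`). The endpoint paths and the `type` query parameter are assumptions recalled from the BookKeeper HTTP admin documentation and may differ by version; `admin_url` and `fetch` are hypothetical helper names. It also assumes the bookie's HTTP server is enabled and reachable.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

BASE_URL = "http://localhost:8080"  # baseURL from the listing

# Assumed endpoint paths; verify them against the docs for your version.
ENDPOINTS = {
    "heartbeat": "/heartbeat",
    "server_config": "/api/v1/config/server_config",
    "list_bookies": "/api/v1/bookie/list_bookies",
}


def admin_url(name, base_url=BASE_URL, **params):
    """Build the URL for a named admin endpoint, with optional query parameters."""
    url = urljoin(base_url, ENDPOINTS[name])
    if params:
        url += "?" + urlencode(params)
    return url


def fetch(name, timeout=5.0, **params):
    """GET a named endpoint and return the response body as text."""
    with urlopen(admin_url(name, **params), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building URLs needs no running cluster; fetch() does.
print(admin_url("heartbeat"))
print(admin_url("list_bookies", type="rw"))
```

Keeping URL construction separate from the network call makes the paths easy to check or adjust before pointing the helper at a live bookie.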

Integrations

Apache Pulsar

BookKeeper serves as the durable log storage layer for Apache Pulsar messaging topics.

Apache ZooKeeper

ZooKeeper is used for bookie coordination, auditor election, and cluster metadata management.

Apache Hadoop

BookKeeper can be used with Hadoop ecosystem tools for reliable log storage alongside HDFS.

Prometheus

BookKeeper exports Prometheus-format metrics for cluster monitoring and alerting.

Grafana

Grafana dashboards consume BookKeeper Prometheus metrics for operational visibility.
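
To show what consuming Prometheus-format metrics involves, here is a minimal parser for the text exposition format. It is a sketch under simplifying assumptions: sample lines carry no trailing timestamp, and the metric names in `SAMPLE` are invented for illustration, not actual BookKeeper metric names. `parse_prometheus_text` is a hypothetical helper; a real deployment would normally let Prometheus scrape the endpoint directly.

```python
SAMPLE = """\
# HELP example_bookie_write_bytes_total Hypothetical counter
# TYPE example_bookie_write_bytes_total counter
example_bookie_write_bytes_total 1024
example_ledger_dir_usage{dir="/data/ledgers"} 0.42
"""


def parse_prometheus_text(text):
    """Parse exposition lines into a {(name, labels): value} mapping."""
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token (no timestamps assumed).
        series, _, value = line.rpartition(" ")
        # Split "name{labels}" into the metric name and the raw label string.
        name, _, labels = series.partition("{")
        samples[(name, labels.rstrip("}"))] = float(value)
    return samples


metrics = parse_prometheus_text(SAMPLE)
for (name, labels), value in metrics.items():
    print(name, labels, value)
```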

Semantic Vocabularies

Apache BookKeeper Context

9 classes · 23 properties

JSON-LD

API Governance Rules

Apache BookKeeper API Rules

12 rules · 3 errors · 8 warnings · 1 info

SPECTRAL

Resources

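GitHubOrganization: https://github.com/apache
GitHubRepository: https://github.com/apache/bookkeeper
Documentation: https://bookkeeper.apache.org/
GettingStarted: https://bookkeeper.apache.org/docs/getting-started/installation
Support: https://bookkeeper.apache.org/community/mailing-lists
TermsOfService: https://www.apache.org/licenses/
ChangeLog: https://github.com/apache/bookkeeper/releases
SpectralRules: rules/apache-bookkeeper-spectral-rules.yml
Vocabulary: vocabulary/apache-bookkeeper-vocabulary.yaml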

Sources

aid: apache-bookkeeper
name: Apache BookKeeper
description: Apache BookKeeper is a scalable, fault-tolerant, and low-latency storage service optimized for real-time workloads
  developed by the Apache Software Foundation. It provides a simple log-oriented storage abstraction called ledgers for reliable,
  replicated storage of sequential data. BookKeeper is used as the durable log storage layer in Apache Pulsar and other distributed
  messaging and stream processing systems. It provides a Java client API and an HTTP Admin REST API for cluster management,
  bookie monitoring, and auto-recovery operations.
type: Index
position: Consumer
access: 3rd-Party
image: https://kinlane-productions2.s3.amazonaws.com/apis-json/apis-json-logo.jpg
tags:
- Apache
- Distributed Systems
- Log Storage
- Open Source
- Storage
- Streaming
created: '2026-03-16'
modified: '2026-05-19'
url: https://raw.githubusercontent.com/api-evangelist/apache-bookkeeper/refs/heads/main/apis.yml
specificationVersion: '0.19'
apis:
- aid: apache-bookkeeper:apache-bookkeeper-admin-api
  name: Apache BookKeeper Admin API
  description: The Apache BookKeeper HTTP Admin API provides REST endpoints for managing and monitoring BookKeeper clusters,
    bookies, ledgers, and auto-recovery operations. It enables programmatic cluster administration, ledger inspection, bookie
    health monitoring, and garbage collection management.
  humanURL: https://bookkeeper.apache.org/docs/admin/http
  baseURL: http://localhost:8080
  tags:
  - Administration
  - Cluster Management
  - Monitoring
  properties:
  - type: Documentation
    url: https://bookkeeper.apache.org/docs/admin/http
  - type: OpenAPI
    url: openapi/apache-bookkeeper-admin-openapi.yaml
  - type: GettingStarted
    url: https://bookkeeper.apache.org/docs/getting-started/installation
  - type: NaftikoCapability
    url: capabilities/admin-auto-recovery.yaml
  - type: NaftikoCapability
    url: capabilities/admin-bookies.yaml
  - type: NaftikoCapability
    url: capabilities/admin-configuration.yaml
  - type: NaftikoCapability
    url: capabilities/admin-ledgers.yaml
  - type: NaftikoCapability
    url: capabilities/admin-monitoring.yaml
- aid: apache-bookkeeper:apache-bookkeeper-java-client
  name: Apache BookKeeper Java Client API
  description: The BookKeeper Java client API provides programmatic access for creating, writing, reading, and managing ledgers.
    It supports both the legacy LedgerHandle API and the newer Ledger API with explicit durability guarantees.
  humanURL: https://bookkeeper.apache.org/docs/api/ledger-api
  tags:
  - Java
  - Ledger
  - Storage
  properties:
  - type: Documentation
    url: https://bookkeeper.apache.org/docs/api/ledger-api
  - type: APIReference
    url: https://bookkeeper.apache.org/docs/api/javadoc/
common:
- type: GitHubOrganization
  url: https://github.com/apache
- type: GitHubRepository
  url: https://github.com/apache/bookkeeper
- type: Documentation
  url: https://bookkeeper.apache.org/
- type: GettingStarted
  url: https://bookkeeper.apache.org/docs/getting-started/installation
- type: Support
  url: https://bookkeeper.apache.org/community/mailing-lists
- type: TermsOfService
  url: https://www.apache.org/licenses/
- type: ChangeLog
  url: https://github.com/apache/bookkeeper/releases
- type: SpectralRules
  url: rules/apache-bookkeeper-spectral-rules.yml
- type: Vocabulary
  url: vocabulary/apache-bookkeeper-vocabulary.yaml
- type: Features
  data:
  - name: Ledger Storage
    description: Append-only log segments called ledgers provide the foundational storage primitive for reliable sequential
      data storage.
  - name: Ensemble Replication
    description: Data is written to a configurable ensemble of bookies with write quorum and ack quorum parameters for fault
      tolerance.
  - name: Auto-Recovery
    description: Built-in under-replication detection and automatic ledger re-replication when bookie nodes fail.
  - name: HTTP Admin API
    description: RESTful HTTP Admin API for managing ledgers, bookies, cluster configuration, and triggering recovery operations.
  - name: Metrics Export
    description: Prometheus-format metrics endpoint for monitoring bookie performance and storage utilization.
  - name: Auditor Election
    description: ZooKeeper-based leader election for the auditor role responsible for detecting under-replicated ledgers.
  - name: Garbage Collection
    description: Configurable garbage collection for reclaiming storage from deleted or expired ledger data.
  - name: Journal and Ledger Storage
    description: Separate journal and ledger storage paths optimized for sequential write throughput and random read performance.
- type: UseCases
  data:
  - name: Durable Log Storage
    description: Serve as the replicated, durable write-ahead log for Apache Pulsar topics and distributed streaming systems.
  - name: Distributed Transaction Logs
    description: Store distributed transaction log segments for systems requiring exactly-once semantics and durable commit
      records.
  - name: Metadata Store
    description: Persist metadata and configuration data for distributed systems requiring consistent, replicated storage.
  - name: Stream Processing Storage
    description: Provide low-latency, high-throughput sequential storage for real-time stream processing pipelines.
  - name: Cluster Administration
    description: Monitor and manage BookKeeper clusters using the HTTP Admin API for operational visibility and recovery.
- type: Integrations
  data:
  - name: Apache Pulsar
    description: BookKeeper serves as the durable log storage layer for Apache Pulsar messaging topics.
  - name: Apache ZooKeeper
    description: ZooKeeper is used for bookie coordination, auditor election, and cluster metadata management.
  - name: Apache Hadoop
    description: BookKeeper can be used with Hadoop ecosystem tools for reliable log storage alongside HDFS.
  - name: Prometheus
    description: BookKeeper exports Prometheus-format metrics for cluster monitoring and alerting.
  - name: Grafana
    description: Grafana dashboards consume BookKeeper Prometheus metrics for operational visibility.
maintainers:
- FN: Kin Lane
  email: [email protected]