Apache Druid

Apache Druid is a high-performance, real-time analytics database governed by the Apache Software Foundation, designed for fast slice-and-dice OLAP queries on event-time data. It features a distributed, column-oriented storage engine with automatic rollup, supports both streaming (Kafka, Kinesis) and batch (S3, HDFS, local) data ingestion, and provides a SQL query interface plus a native JSON query API via REST. Druid is optimized for sub-second queries at petabyte scale with high concurrency.

1 API, 8 Features
Tags: Analytics, Apache, Database, Kafka, OLAP, Open Source, Real-Time, SQL, Time Series

APIs

Apache Druid REST API

Druid exposes REST APIs for Druid SQL (POST /druid/v2/sql), native JSON queries (POST /druid/v2), batch and streaming data ingestion tasks, supervisor management for Kafka/Kines...

Features

Sub-Second OLAP Queries

Columnar storage with bitmap indexes, dictionary encoding, and pre-aggregation (rollup) enables sub-second queries on billions of events.
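
As a rough illustration of why dictionary encoding and bitmap indexes make filters cheap, the toy sketch below encodes two string columns and answers an AND filter with a single bitwise operation. It is not Druid's storage format: the column names, the sample rows, and the use of Python integers as bitmaps are all assumptions made for the example.

```python
from typing import List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each string with an integer id into a sorted dictionary."""
    dictionary = sorted(set(values))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[v] for v in values]


def build_bitmaps(encoded: List[int], cardinality: int) -> List[int]:
    """One bitmap per dictionary id; bit r is set when row r holds that id."""
    bitmaps = [0] * cardinality
    for row, value_id in enumerate(encoded):
        bitmaps[value_id] |= 1 << row
    return bitmaps


def matching_rows(bitmap: int) -> List[int]:
    """Expand a bitmap back into the list of row numbers it marks."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


# Hypothetical sample columns (six rows).
country = ["US", "DE", "US", "FR", "DE", "US"]
device = ["ios", "ios", "web", "web", "ios", "ios"]

country_dict, country_ids = dictionary_encode(country)
device_dict, device_ids = dictionary_encode(device)
country_bitmaps = build_bitmaps(country_ids, len(country_dict))
device_bitmaps = build_bitmaps(device_ids, len(device_dict))

# WHERE country = 'US' AND device = 'ios' becomes a bitwise AND of two bitmaps.
us = country_bitmaps[country_dict.index("US")]
ios = device_bitmaps[device_dict.index("ios")]
us_ios_rows = matching_rows(us & ios)
```

Only the rows that survive the AND need to be read from the metric columns, which is the general idea behind index-assisted filtering.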

Druid SQL API

REST endpoint for submitting standard SQL queries with ANSI SQL support, time-based filtering, and streaming response options.
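
A minimal sketch of a request to the SQL endpoint, using only the Python standard library. The host and port (a Router on localhost:8888), the `wikipedia` datasource, and the `channel` column are assumptions for illustration; the request is built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

# Assumption: a Druid Router reachable at localhost:8888.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"


def build_sql_request(sql: str, url: str = DRUID_SQL_URL) -> urllib.request.Request:
    """Wrap a SQL string in the JSON body that POST /druid/v2/sql expects."""
    payload = {
        "query": sql,
        "resultFormat": "object",  # one JSON object per result row
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical datasource and column names.
request = build_sql_request(
    "SELECT channel, COUNT(*) AS edits "
    "FROM wikipedia "
    "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY "
    "GROUP BY channel ORDER BY edits DESC LIMIT 5"
)

# To run it against a live cluster:
# with urllib.request.urlopen(request) as response:
#     rows = json.load(response)
```

The `__time` filter in the query is the time-based filtering mentioned above; it lets Druid restrict the scan to the relevant time range.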

Native JSON Query API

Druid-native query format (Timeseries, TopN, GroupBy, Scan, Search) for maximum control and performance.
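
For comparison with SQL, here is a hedged sketch of a native Timeseries query body as it might be posted to /druid/v2. The helper name `build_timeseries_query`, the `wikipedia` datasource, and the `added` metric are assumptions, not part of the listing.

```python
import json
from typing import Any, Dict


def build_timeseries_query(
    datasource: str, interval: str, metric: str, granularity: str = "hour"
) -> Dict[str, Any]:
    """Build a native Timeseries query that sums one metric per time bucket."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": granularity,
        "intervals": [interval],  # ISO 8601 interval, "start/end"
        "aggregations": [
            {"type": "longSum", "name": f"total_{metric}", "fieldName": metric},
        ],
    }


# Hypothetical datasource and metric names.
query = build_timeseries_query("wikipedia", "2024-01-01/2024-01-02", "added")
body = json.dumps(query)  # JSON body for POST /druid/v2
```

The other query types named above (TopN, GroupBy, Scan, Search) use the same envelope with a different `queryType` and type-specific fields.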

Streaming Ingestion

Real-time data ingestion from Apache Kafka and Amazon Kinesis with supervisor-managed offset tracking and exactly-once semantics.
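
Streaming ingestion is configured by submitting a supervisor spec. The sketch below builds a minimal Kafka spec as a Python dict; the datasource, topic, broker address, and column names are hypothetical, and a real spec would usually carry more tuning options.

```python
import json
from typing import Any, Dict


def build_kafka_supervisor_spec(
    datasource: str, topic: str, bootstrap_servers: str
) -> Dict[str, Any]:
    """Build a minimal Kafka supervisor spec (illustrative field values)."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumed column names in the incoming JSON events.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": ["user_id", "page"]},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
                "taskCount": 1,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


# Hypothetical names; the spec would be POSTed to /druid/indexer/v1/supervisor.
spec = build_kafka_supervisor_spec("clickstream", "clicks", "localhost:9092")
spec_json = json.dumps(spec, indent=2)
```

Once submitted, the supervisor launches and monitors the indexing tasks and tracks Kafka offsets on their behalf.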

Batch Ingestion

Parallel batch indexing tasks from local files, S3, GCS, HDFS, and other external storage systems.
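
A parallel batch load can be sketched the same way, as an `index_parallel` task spec reading JSON files from S3. The bucket, object path, datasource, and column names below are assumptions for illustration only.

```python
import json
from typing import Any, Dict, List


def build_parallel_index_task(
    datasource: str, uris: List[str], max_subtasks: int = 4
) -> Dict[str, Any]:
    """Build a minimal index_parallel task spec that reads JSON files from S3."""
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumed column names in the input files.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": ["page", "country"]},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "DAY",
                    "queryGranularity": "HOUR",
                    "rollup": True,
                },
                "metricsSpec": [
                    {"type": "count", "name": "count"},
                    {"type": "longSum", "name": "bytes", "fieldName": "bytes"},
                ],
            },
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {"type": "s3", "uris": uris},
                "inputFormat": {"type": "json"},
            },
            "tuningConfig": {
                "type": "index_parallel",
                "maxNumConcurrentSubTasks": max_subtasks,
            },
        },
    }


# Hypothetical bucket and file; the task would be POSTed to /druid/indexer/v1/task.
task = build_parallel_index_task(
    "pageviews", ["s3://example-bucket/events/2024-01-01.json"]
)
task_json = json.dumps(task)
```

Swapping the `inputSource` type is what changes the storage system being read; the rest of the spec stays largely the same.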

Automatic Rollup

Pre-aggregates metrics at ingestion time to reduce storage and query time, configurable per datasource.
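
The effect of rollup can be shown with a small pure-Python simulation: events are truncated to an hourly bucket and rows that share the same bucket and dimension values are merged into one row with a count and a summed metric. The event fields and the hourly granularity are assumptions chosen for the example, and this is only a model of the idea, not Druid's implementation.

```python
from collections import defaultdict
from datetime import datetime
from typing import Any, Dict, List, Tuple


def roll_up(
    events: List[Dict[str, Any]], dimensions: List[str], metric: str
) -> Dict[Tuple[str, ...], Dict[str, int]]:
    """Merge events sharing an hourly bucket and dimension values into one row."""
    totals: Dict[Tuple[str, ...], Dict[str, int]] = defaultdict(
        lambda: {"count": 0, metric: 0}
    )
    for event in events:
        ts = datetime.fromisoformat(event["timestamp"])
        bucket = ts.replace(minute=0, second=0, microsecond=0)  # truncate to hour
        key = (bucket.isoformat(),) + tuple(event[d] for d in dimensions)
        totals[key]["count"] += 1
        totals[key][metric] += event[metric]
    return dict(totals)


# Hypothetical raw events.
events = [
    {"timestamp": "2024-01-01T10:05:00", "page": "home", "bytes": 100},
    {"timestamp": "2024-01-01T10:40:00", "page": "home", "bytes": 250},
    {"timestamp": "2024-01-01T10:59:59", "page": "docs", "bytes": 40},
    {"timestamp": "2024-01-01T11:01:00", "page": "home", "bytes": 10},
]

rolled = roll_up(events, ["page"], "bytes")
# Four raw events collapse into three stored rows.
```

The trade-off is that individual events inside a bucket can no longer be distinguished after ingestion, which is why rollup is a per-datasource choice.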

Time-Based Partitioning

All data is partitioned by time interval (segments), enabling efficient time-range query pruning.
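
Time-range pruning can be pictured as an interval-overlap test: only segments whose interval intersects the query interval need to be scanned. The sketch below assumes day-sized segments and half-open `start/end` intervals; the segment list is hypothetical.

```python
from datetime import datetime
from typing import List, Tuple

Interval = Tuple[datetime, datetime]  # half-open: [start, end)


def parse_interval(text: str) -> Interval:
    """Parse an ISO 8601 'start/end' interval string."""
    start, end = text.split("/")
    return datetime.fromisoformat(start), datetime.fromisoformat(end)


def overlaps(a: Interval, b: Interval) -> bool:
    """Two half-open intervals overlap when each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def prune_segments(segments: List[str], query: str) -> List[str]:
    """Keep only the segment intervals that intersect the query interval."""
    query_interval = parse_interval(query)
    return [s for s in segments if overlaps(parse_interval(s), query_interval)]


# Hypothetical day-granularity segments.
segments = [
    "2024-01-01/2024-01-02",
    "2024-01-02/2024-01-03",
    "2024-01-03/2024-01-04",
    "2024-01-04/2024-01-05",
]

selected = prune_segments(segments, "2024-01-02T12:00:00/2024-01-03T06:00:00")
```

Here two of the four segments are skipped without being opened, and the saving grows with the amount of history stored.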

Multi-Tenancy

Query isolation and resource management via query lanes, scheduler priorities, and row-level access control.

Use Cases

Real-Time Event Analytics

Analyze click streams, IoT events, application logs, and user behavior data with sub-second query latency.

Business Intelligence Dashboards

Power interactive BI dashboards with high-concurrency low-latency queries backed by Druid's columnar engine.

Network and Security Monitoring

Ingest and analyze network flow data and security events in real time for threat detection and capacity planning.

Ad Tech Analytics

Process advertising impression, click, and conversion events at high volume with real-time aggregation.

Operational Analytics

Monitor application performance metrics and operational data with drilldown and filtering capabilities.

Integrations

Apache Kafka

KafkaSupervisor for real-time continuous ingestion from Kafka topics into Druid datasources.

Amazon Kinesis

KinesisSupervisor for real-time data ingestion from AWS Kinesis data streams.

Apache Hadoop / HDFS

Hadoop-based batch indexing task for bulk loading data from HDFS or MapReduce job outputs.

Amazon S3 / GCS

Batch ingestion from object storage (S3, GCS, Azure Blob) using index tasks.

Apache Hive

Druid-Hive integration for querying Druid datasources from HiveQL and performing joins.

Kubernetes

Official Kubernetes operator for deploying and managing Druid clusters on Kubernetes.

Imply (Commercial)

Imply provides a commercial managed Druid service with additional features and enterprise support.

Semantic Vocabularies

Apache Druid Context

5 classes · 32 properties

JSON-LD

Resources

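Portal: https://druid.apache.org/
Documentation: https://druid.apache.org/docs/latest/
GettingStarted: https://druid.apache.org/docs/latest/tutorials/
Blog: https://druid.apache.org/blog/
GitHubOrganization: https://github.com/apache
GitHubRepository: https://github.com/apache/druid
StackOverflow: https://stackoverflow.com/questions/tagged/druid
Vocabulary: https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/vocabulary/apache-druid-vocabulary.yaml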

Sources

aid: apache-druid
name: Apache Druid
description: >-
  Apache Druid is a high-performance, real-time analytics database governed by the Apache Software Foundation, designed
  for fast slice-and-dice OLAP queries on event-time data. It features a distributed, column-oriented storage engine
  with automatic rollup, supports both streaming (Kafka, Kinesis) and batch (S3, HDFS, local) data ingestion, and
  provides a SQL query interface plus a native JSON query API via REST. Druid is optimized for sub-second queries at
  petabyte scale with high concurrency.
type: Index
position: Consumer
access: 3rd-Party
image: https://kinlane-productions2.s3.amazonaws.com/apis-json/apis-json-logo.jpg
tags:
  - Analytics
  - Apache
  - Database
  - Kafka
  - OLAP
  - Open Source
  - Real-Time
  - SQL
  - Time Series
created: '2026-03-16'
modified: '2026-05-19'
url: https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/apis.yml
specificationVersion: '0.19'
apis:
  - aid: apache-druid:apache-druid-rest-api
    name: Apache Druid REST API
    description: >-
      Druid exposes REST APIs for Druid SQL (POST /druid/v2/sql), native JSON queries (POST /druid/v2), batch and
      streaming data ingestion tasks, supervisor management for Kafka/Kinesis ingestion, data segment management,
      coordinator and overlord operations, process status, and dynamic configuration. A JDBC driver is also available
      for SQL access via JDBC clients.
    humanURL: https://druid.apache.org/docs/latest/api-reference/
    tags:
      - Analytics
      - Data Ingestion
      - Datasources
      - JSON Query
      - Kafka
      - OLAP
      - REST
      - SQL
      - Segments
      - Supervisors
    properties:
      - type: Documentation
        url: https://druid.apache.org/docs/latest/api-reference/
      - url: openapi/apache-druid-openapi.yml
        type: OpenAPI
      - type: GettingStarted
        url: https://druid.apache.org/docs/latest/tutorials/tutorial-batch-hadoop
      - type: APIReference
        url: https://druid.apache.org/docs/latest/api-reference/
      - type: GitHubRepository
        url: https://github.com/apache/druid
      - type: Tools
        url: https://github.com/apache/druid-operator
        title: Kubernetes Operator
      - type: SDK
        url: https://mvnrepository.com/artifact/org.apache.druid/druid-sql
        title: JDBC Driver (Maven)
      - type: SDK
        url: https://pypi.org/project/pydruid/
        title: pydruid Python Client
      - type: JSONSchema
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-schema/apache-druid-ingestion-task-schema.json
        title: Ingestion Task
      - type: JSONSchema
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-schema/apache-druid-sql-query-request-schema.json
        title: Sql Query Request
      - type: JSONSchema
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-schema/apache-druid-sql-query-response-schema.json
        title: Sql Query Response
      - type: JSONSchema
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-schema/apache-druid-supervisor-schema.json
        title: Supervisor
      - type: JSONStructure
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-structure/apache-druid-ingestion-task-structure.json
      - type: JSONStructure
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-structure/apache-druid-sql-query-request-structure.json
      - type: JSONStructure
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-structure/apache-druid-sql-query-response-structure.json
      - type: JSONStructure
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-structure/apache-druid-supervisor-structure.json
      - type: JSONLD
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/json-ld/apache-druid-context.jsonld
      - type: Example
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/examples/apache-druid-ingestion-task-example.json
      - type: Example
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/examples/apache-druid-sql-query-request-example.json
      - type: Example
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/examples/apache-druid-sql-query-response-example.json
      - type: Example
        url: >-
          https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/examples/apache-druid-supervisor-example.json
maintainers:
  - FN: Kin Lane
    email: [email protected]
common:
  - type: Portal
    url: https://druid.apache.org/
  - type: Documentation
    url: https://druid.apache.org/docs/latest/
  - type: GettingStarted
    url: https://druid.apache.org/docs/latest/tutorials/
  - type: Blog
    url: https://druid.apache.org/blog/
  - type: GitHubOrganization
    url: https://github.com/apache
  - type: GitHubRepository
    url: https://github.com/apache/druid
  - type: StackOverflow
    url: https://stackoverflow.com/questions/tagged/druid
  - type: Features
    data:
      - name: Sub-Second OLAP Queries
        description: >-
          Columnar storage with bitmap indexes, dictionary encoding, and pre-aggregation (rollup) enables sub-second
          queries on billions of events.
      - name: Druid SQL API
        description: >-
          REST endpoint for submitting standard SQL queries with ANSI SQL support, time-based filtering, and streaming
          response options.
      - name: Native JSON Query API
        description: Druid-native query format (Timeseries, TopN, GroupBy, Scan, Search) for maximum control and performance.
      - name: Streaming Ingestion
        description: >-
          Real-time data ingestion from Apache Kafka and Amazon Kinesis with supervisor-managed offset tracking and
          exactly-once semantics.
      - name: Batch Ingestion
        description: Parallel batch indexing tasks from local files, S3, GCS, HDFS, and other external storage systems.
      - name: Automatic Rollup
        description: Pre-aggregates metrics at ingestion time to reduce storage and query time, configurable per datasource.
      - name: Time-Based Partitioning
        description: All data is partitioned by time interval (segments), enabling efficient time-range query pruning.
      - name: Multi-Tenancy
        description: Query isolation and resource management via query lanes, scheduler priorities, and row-level access control.
  - type: UseCases
    data:
      - name: Real-Time Event Analytics
        description: Analyze click streams, IoT events, application logs, and user behavior data with sub-second query latency.
      - name: Business Intelligence Dashboards
        description: Power interactive BI dashboards with high-concurrency low-latency queries backed by Druid's columnar engine.
      - name: Network and Security Monitoring
        description: >-
          Ingest and analyze network flow data and security events in real time for threat detection and capacity
          planning.
      - name: Ad Tech Analytics
        description: Process advertising impression, click, and conversion events at high volume with real-time aggregation.
      - name: Operational Analytics
        description: Monitor application performance metrics and operational data with drilldown and filtering capabilities.
  - type: Integrations
    data:
      - name: Apache Kafka
        description: KafkaSupervisor for real-time continuous ingestion from Kafka topics into Druid datasources.
      - name: Amazon Kinesis
        description: KinesisSupervisor for real-time data ingestion from AWS Kinesis data streams.
      - name: Apache Hadoop / HDFS
        description: Hadoop-based batch indexing task for bulk loading data from HDFS or MapReduce job outputs.
      - name: Amazon S3 / GCS
        description: Batch ingestion from object storage (S3, GCS, Azure Blob) using index tasks.
      - name: Apache Hive
        description: Druid-Hive integration for querying Druid datasources from HiveQL and performing joins.
      - name: Kubernetes
        description: Official Kubernetes operator for deploying and managing Druid clusters on Kubernetes.
      - name: Imply (Commercial)
        description: Imply provides a commercial managed Druid service with additional features and enterprise support.
  - type: Vocabulary
    url: >-
      https://raw.githubusercontent.com/api-evangelist/apache-druid/refs/heads/main/vocabulary/apache-druid-vocabulary.yaml