Apache NiFi
Apache NiFi is a dataflow management system that automates the movement of data between systems. Its web-based user interface lets users design, control, and monitor data flows in real time, with data provenance tracking and a library of hundreds of processors. NiFi 2 is the current major version and adds security and performance improvements.
APIs
Apache NiFi REST API
The NiFi REST API provides comprehensive JWT-authenticated endpoints for managing processors, connections, controller services, process groups, reporting tasks, provenance, flow...
Apache NiFi Registry
NiFi Registry provides a central location for storage and management of shared NiFi flow resources, enabling versioned flows across NiFi environments. It provides its own REST A...
Apache MiNiFi
MiNiFi, a subproject of NiFi, is a lightweight agent for edge data collection. MiNiFi C++ (nifi-minifi-cpp) provides a small-footprint agent for IoT edge data collection w...
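As a rough illustration of how a JWT-authenticated REST API like NiFi's is typically called, the sketch below builds (but does not send) two requests with Python's standard library: one that exchanges a username and password for a token, and one that presents the token as a bearer credential. The base URL, credentials, and helper function names are placeholders invented for this example. The `/access/token` and `/flow/about` paths are assumed from the NiFi REST API documentation and should be checked against the version in use.

```python
import urllib.parse
import urllib.request

# Placeholder base URL (assumption); a real deployment will differ.
BASE_URL = "https://nifi.example.com:8443/nifi-api"


def build_token_request(base_url, username, password):
    """Build a POST request that exchanges credentials for a JWT."""
    body = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        f"{base_url}/access/token",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_about_request(base_url, token):
    """Build a GET request for version info, authenticated with the JWT."""
    return urllib.request.Request(
        f"{base_url}/flow/about",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


if __name__ == "__main__":
    req = build_token_request(BASE_URL, "admin", "change-me")
    print(req.get_method(), req.full_url)
    # To actually send it (requires a reachable, trusted NiFi instance):
    # token = urllib.request.urlopen(req).read().decode("utf-8")
```

Keeping request construction separate from sending makes the calls easy to inspect and test without a running NiFi instance.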
Features
Browser-based drag-and-drop interface for designing, controlling, and monitoring data flows without coding.
Complete lineage tracking of every piece of data that flows through the system from ingestion to destination.
Extensive library of processors for data ingestion, transformation, routing, and delivery to diverse systems and cloud platforms.
Loss-tolerant and guaranteed delivery options with configurable prioritization and backpressure control.
Comprehensive JWT-authenticated REST API for programmatic management of all NiFi resources and operations.
Version control for data flows via NiFi Registry, enabling flow promotion across development, test, and production environments.
Fine-grained multi-tenant authorization with HTTPS, TLS, and SSH support for secure deployments.
Zero-master cluster architecture for high-availability and load-balanced dataflow execution.
Externalized configuration through parameter contexts, which can be shared across multiple processors and process groups.
Lightweight MiNiFi agents for edge data collection at IoT endpoints, managed centrally from NiFi.
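The prioritization and backpressure item in the feature list above can be pictured with a toy model. The sketch below is not NiFi's implementation: the `Connection` class, its method names, and the hard object-count limit are assumptions made for illustration (NiFi applies its thresholds by pausing the scheduling of the upstream component, which this simplification does not model).

```python
import heapq
import itertools


class Connection:
    """Toy queue between two processors with an object-count threshold."""

    def __init__(self, object_threshold):
        self.object_threshold = object_threshold
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order

    def backpressure_engaged(self):
        """True once the queue holds at least `object_threshold` items."""
        return len(self._heap) >= self.object_threshold

    def offer(self, item, priority=0):
        """Enqueue unless backpressure is engaged; return True on success."""
        if self.backpressure_engaged():
            return False  # upstream should stop producing and retry later
        heapq.heappush(self._heap, (priority, next(self._order), item))
        return True

    def poll(self):
        """Dequeue the item with the lowest priority number, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    conn = Connection(object_threshold=2)
    conn.offer("routine", priority=5)
    conn.offer("urgent", priority=1)
    print(conn.offer("late"))  # rejected while the queue is full
    print(conn.poll())         # lowest priority number is served first
```

A rejected `offer` signals the producer to wait, which is the essence of backpressure: a slow consumer throttles the upstream side instead of letting the queue grow without bound.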
Use Cases
Build pipelines ingesting data from files, databases, message queues, cloud storage, and APIs into data lakes and warehouses.
Collect and route security telemetry, logs, and threat intelligence feeds for SIEM and analytics platforms.
Deploy MiNiFi agents at IoT edge locations to collect, filter, and forward sensor data to central NiFi clusters.
Build event streaming pipelines that consume from Kafka, Kinesis, and other messaging systems for real-time data processing.
Build data preparation and vector database ingestion pipelines for generative AI and RAG applications.
Move and transform data between AWS, Azure, and GCP services with built-in cloud processor libraries.
Integrations
Native ConsumeKafka and PublishKafka processors for streaming data between NiFi and Kafka topics.
PutS3Object and FetchS3Object processors for writing data to and reading data from AWS S3 buckets.
PutAzureBlobStorage and FetchAzureBlobStorage processors for Azure cloud storage integration.
Native GCS processors for reading and writing Google Cloud Storage objects.
PutMongo and GetMongo processors for writing documents to and reading documents from MongoDB collections.
PutElasticsearchRecord and FetchElasticsearch processors for indexing and querying Elasticsearch.
Native Salesforce processors for querying data from and publishing data to Salesforce CRM.
ConsumeMQTT and PublishMQTT processors for IoT messaging protocol support.
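The integration processors listed above can also be added to a flow programmatically through the REST API. The sketch below only builds the JSON body for such a request. The payload shape (a `revision` plus a `component` with `type`, `name`, and `position`), the target endpoint `POST /nifi-api/process-groups/{id}/processors`, and the fully qualified class name used for PutS3Object are assumptions based on the NiFi REST API documentation, and the helper name is hypothetical. Verify all of them against the installed version before relying on them.

```python
import json


def build_create_processor_payload(processor_type, name, x=0.0, y=0.0):
    """Return a request body describing a new processor on the canvas."""
    return {
        # A new component starts at revision version 0.
        "revision": {"version": 0},
        "component": {
            "type": processor_type,          # fully qualified processor class
            "name": name,                    # display name on the canvas
            "position": {"x": x, "y": y},    # canvas coordinates
        },
    }


if __name__ == "__main__":
    payload = build_create_processor_payload(
        # Assumed class name for the PutS3Object processor.
        "org.apache.nifi.processors.aws.s3.PutS3Object",
        "Archive to S3",
        x=100.0,
        y=200.0,
    )
    print(json.dumps(payload, indent=2))
```

The resulting body would be sent with a bearer token in the `Authorization` header, and processor properties such as the bucket name would be configured in a follow-up update.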