reducto-ai

Reducto is an AI document-parsing platform that turns unstructured PDFs, images, spreadsheets, slides, and forms into LLM-ready layout, structured data, and form completions. The API exposes Parse, Extract, Split, Edit, Classify, and Pipeline endpoints (each with sync and async variants) plus an Upload API, Webhooks API, and Jobs API. It is used by Scale AI, Vanta, Harvey, Medallion, Toast, JLL, Vise, Newfront, and Legora to power document AI in finance, healthcare, insurance, legal, government, and logistics.

10 APIs, 9 Capabilities, 27 Features

APIs

Reducto Parse API

Parse documents (PDFs, images, spreadsheets, slides, text files) and capture layout, structure, OCR text, tables, figures, equations, lists, and LLM-optimized chunks. Supports a...

Reducto Extract API

Extract structured data from documents using a caller-supplied JSON Schema. Supports Deep Extract for harder documents, Array Extract for repeating sections, and Citations that ...

Reducto Split API

Automatically separate multi-document files and long forms into individual logical units using rules-based Split or Deep Split, then route each unit to downstream Parse, Extract...

Reducto Edit API

Fill detected blanks, tables, and checkboxes inside documents from a provided form schema, without requiring per-document templates. Beta endpoint priced at 4 credits per page.

Reducto Pipeline API

Compose Parse, Split, Extract, Edit, and Classify into a single multi-step workflow with chained outputs. Supports priority requests on Growth, and on-premise / VPC deployments ...

Reducto Classify API

Classify documents into a defined set of categories and run citation lookups against parsed content. Billed at 0.5 credits per page of context (default 5 pages = 2.5 credits per...

Reducto Jobs API

Retrieve, cancel, and list async jobs created by parse_async, extract_async, split_async, edit_async, and pipeline_async. Pairs with direct or Svix-backed webhooks for completio...

Reducto Upload API

Upload files directly to Reducto storage and receive a reducto://upload reference usable across Parse, Split, Extract, Edit, Pipeline, and Classify. Includes large-file (chunked...

Reducto Webhooks API

Configure webhook endpoints for asynchronous job completion. Supports direct webhooks and Svix-backed delivery, plus a hosted Webhook Portal for end-customer subscription manage...

Reducto Platform API

Platform health, version, and metrics endpoints for operating and monitoring Reducto, including Prometheus and streaq metrics exposed by on-premise deployments.
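The async variants and the Jobs API described above suggest a submit-then-poll pattern for callers that do not use webhooks. The following is a minimal sketch under stated assumptions: the status strings ("Pending", "Completed", "Failed") and the `fetch_status` callable are hypothetical stand-ins for a Jobs API retrieve call, not names taken from Reducto's documentation.

```python
import time
from typing import Callable

# Hypothetical terminal states; check the Jobs API reference for the real values.
TERMINAL_STATES = {"Completed", "Failed"}


def wait_for_job(
    job_id: str,
    fetch_status: Callable[[str], str],
    *,
    interval_s: float = 1.0,
    max_interval_s: float = 30.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status(job_id) with exponential backoff until a terminal state."""
    waited = 0.0
    delay = interval_s
    while True:
        status = fetch_status(job_id)
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still {status!r} after {waited:.0f}s")
        sleep(delay)
        waited += delay
        # Double the wait each round, capped at max_interval_s.
        delay = min(delay * 2, max_interval_s)
```

Passing `fetch_status` and `sleep` as parameters keeps the loop independent of any particular HTTP client or SDK and makes it easy to test. Where webhooks are configured, a completion callback would normally replace this loop.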

Capabilities

Run Capabilities with Naftiko — Deploy and orchestrate these API capabilities using Naftiko Fleet.

Reducto Classify API — Classify

2 operations covering classification and citation lookup. Sorts documents into a defined set of categories and surfaces citations against parsed...

Reducto Edit API — Edit

2 operations covering synchronous and asynchronous edit. Fills detected blanks, tables, and checkboxes inside documents from a provided form schema with...

Reducto Extract API — Extract

2 operations covering synchronous and asynchronous extract. Self-contained Naftiko capability for the Reducto Extract business surface, supporting...

Reducto Jobs API — Jobs

3 operations: retrieve, cancel, and list async jobs created by parse_async, extract_async, split_async, edit_async, and pipeline_async.

Reducto Parse API — Parse

2 operations covering synchronous and asynchronous parse. Self-contained Naftiko capability for the Reducto Parse business surface, exposing layout, s...

Reducto Pipeline API — Pipeline

2 operations covering synchronous and asynchronous pipeline execution. Composes Parse, Split, Extract, and Edit into a single multi-step workflo...

Reducto Split API — Split

2 operations covering synchronous and asynchronous split. Splits multi-document files and long forms into individual logical units using rules-based o...

Reducto Upload API — Upload

1 operation. Uploads files directly to Reducto storage and returns a reducto://upload reference usable across Parse, Split, Extract, Edit, Pipeline,...

Reducto Webhooks API — Webhooks

1 operation. Configures webhook endpoints for asynchronous job completion, including direct webhooks and Svix-backed delivery.

Features

Parse — agentic OCR with error correction, layout-aware extraction across 30+ file types (PDFs, images, spreadsheets, slides, Office docs, text)

Extract — schema-driven structured data extraction with Deep Extract, Array Extract, and Citations that pin fields to source page + bounding box

Split — automatic separation of multi-document files and long forms via rules-based Split or Deep Split

Edit — template-free form filling for blanks, tables, and checkboxes from a form schema (beta)

Classify — page-context document classification with optional citations

Pipeline — single-call composition of Parse + Split + Extract + Edit + Classify with chained outputs

Cite — citation lookup endpoint surfacing source-level references inside parsed content

Async endpoints (parse_async, extract_async, split_async, edit_async, pipeline_async) plus a Jobs API for retrieve, cancel, and list

Direct webhooks and Svix-backed webhooks for async completion, with a hosted Webhook Portal

Upload API with large-file chunked uploads producing reducto:// references reusable across endpoints

jobid:// references that let Extract / Split / Edit reuse a prior Parse without re-billing

Multilingual parsing across 100+ languages with automatic page rotation

Intelligent chunking (variable, section, page, block) optimized for LLM/embedding pipelines

Figure summarization, chart extraction, equation handling, list detection, and discardable-block tagging

Spreadsheet parsing with table splitting, cell colors, formulas, and clustering modes

Table output formats — HTML, JSON, Markdown, CSV, and AI-JSON

Studio — visual workbench for Parse, Split, Extract, Edit, and Pipeline deployment

Studio Deploy Pipeline — push designed workflows directly to production

Official SDKs in Python, Node.js, and Go plus a Reducto CLI

Reducto MCP Server for agent integration

LLMs Center (llms.reducto.ai) and llms.txt for AI-agent friendly documentation

Hybrid VPC and on-premise deployment options (AWS, Azure, GCS, Box) with database, OCR, LLM, fair-queueing, observability, and file-cleanup configuration

EU data residency endpoints on Growth+

Zero-data-retention option and Business Associate Agreement on Growth+

Token-style rate limits — 200 concurrent sync requests, 500 RPS submission; tiered per-second sync rate (1 RPS Standard, 10 RPS Growth, 100+ RPS Enterprise)

Credit-based pricing — Parse 1-4 credits/page, Extract 2 credits/page (Deep Extract 4 + 0.1/field, min 30), Split 2-4 credits/page, Edit 4 credits/page, Classify 0.5 credits/page-of-context

15,000 free credits on the Standard plan, then $0.015 per credit
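The credit figures in the list above can be turned into a rough cost estimate. The sketch below is illustrative only: it assumes the 15,000 free credits act as a single pool that is consumed before the $0.015 rate applies, it keeps the per-page ranges as (low, high) bounds, and it leaves out Deep Extract because the "4 + 0.1/field, min 30" rule is not spelled out precisely enough here to encode. All function names are hypothetical.

```python
from decimal import Decimal

FREE_CREDITS = Decimal("15000")    # Standard plan allowance (from the list above)
USD_PER_CREDIT = Decimal("0.015")  # price per credit after the free allowance

# Credits per page as (low, high) bounds, taken from the list above.
CREDITS_PER_PAGE = {
    "parse": (Decimal("1"), Decimal("4")),
    "extract": (Decimal("2"), Decimal("2")),
    "split": (Decimal("2"), Decimal("4")),
    "edit": (Decimal("4"), Decimal("4")),
}
CLASSIFY_PER_CONTEXT_PAGE = Decimal("0.5")


def credit_range(endpoint: str, pages: int) -> tuple:
    """Return (min_credits, max_credits) for processing `pages` pages."""
    low, high = CREDITS_PER_PAGE[endpoint]
    return low * pages, high * pages


def classify_credits(context_pages: int = 5) -> Decimal:
    """Classify is billed per page of context; 5 pages is the stated default."""
    return CLASSIFY_PER_CONTEXT_PAGE * context_pages


def billable_usd(total_credits: Decimal, free_credits: Decimal = FREE_CREDITS) -> Decimal:
    """Dollar cost of credits beyond the free allowance (assumed to be one pool)."""
    overage = max(total_credits - free_credits, Decimal("0"))
    return overage * USD_PER_CREDIT
```

For example, parsing 1,000 pages falls between 1,000 and 4,000 credits, and a total of 20,000 credits leaves 5,000 billable credits, or $75 at the Standard rate. `Decimal` is used so that fractional credits and prices do not pick up binary floating-point error.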

Use Cases

Financial document analysis

Parse 10-Ks, prospectuses, KYC/KYB packets, broker statements, and credit memos to extract tables, line items, and structured financials for downstream analytics or LLM agents.

Insurance claims and underwriting

Split multi-document claim packets, classify each unit (police report, medical record, photo, ACORD form), and extract structured fields with citations for adjuster review.

Healthcare records processing

Extract structured patient, encounter, lab, and medication data from scanned EOBs, charts, and faxes under a BAA with zero data retention.

Legal contract review

Surface redlined clauses, defined terms, and obligation language from contracts and case files with field-level citations back to the source page.

Government and public-sector forms

Fill, extract, and classify long-form government applications, permits, and disclosure filings using the Edit and Pipeline APIs.

Logistics, supply chain, and trade

Parse invoices, BOLs, customs forms, certificates of origin, and packing lists in bulk to feed ERP and TMS systems.

Identity verification

Extract and validate fields from passports, IDs, and proof-of-address documents using the Identity Verification cookbook.

Invoice and AP automation

Capture vendor, line-item, and tax data from invoices with citations to source bounding boxes for review and approval.

Multilingual document processing

Parse and extract from documents in 100+ languages with consistent schemas.

Multimodal RAG ingestion

Produce LLM-optimized chunks (with figure summarization, embed strings, and bounding boxes) ready for vector indexing.
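Several of the use cases above rely on Extract with a caller-supplied JSON Schema. As an illustration for the invoice and AP automation case, the sketch below defines a small schema covering vendor, line-item, and tax data, plus a helper that flags missing required fields in a returned record. The field names and the helper are hypothetical and are not taken from Reducto's documentation; how the schema is attached to a request is not shown.

```python
import json

# Hypothetical schema for illustration; field names are assumptions.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor_name": {"type": "string"},
        "invoice_number": {"type": "string"},
        "tax_amount": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                    "unit_price": {"type": "number"},
                },
                "required": ["description", "quantity", "unit_price"],
            },
        },
    },
    "required": ["vendor_name", "invoice_number", "line_items"],
}


def missing_required(record: dict, schema: dict = INVOICE_SCHEMA) -> list:
    """Return required top-level fields that are absent from an extracted record."""
    return [name for name in schema.get("required", []) if name not in record]


if __name__ == "__main__":
    # Serialize the schema as it might be sent alongside a document.
    print(json.dumps(INVOICE_SCHEMA, indent=2))
```

Keeping `tax_amount` optional while requiring `line_items` is a design choice for this example; a stricter review workflow might require both.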

Integrations

AWS S3 / Hybrid VPC on AWS

Presigned S3 URLs as parse inputs, plus hybrid VPC deployment on AWS for on-prem-grade isolation.

Azure / Hybrid VPC on Azure

Hybrid VPC deployment on Azure for regulated workloads.

Google Cloud / Hybrid VPC on GCS

Hybrid VPC deployment on GCS for regulated workloads.

Box / Hybrid VPC on Box

Pull documents from Box for parsing in a hybrid VPC topology.

Browserbase

Web-browsing cookbook that pairs Browserbase with Reducto for live web-document capture.

Svix

Webhook delivery and management via Svix-backed webhook portals.

Model Context Protocol (MCP)

Reducto MCP Server exposes Parse / Extract / Split / Edit / Classify as MCP tools to AI agents.

OpenAPI tooling

Public OpenAPI 3.1 spec at docs.reducto.ai/openapi.json plus a legacy spec for backwards compatibility.
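Because a public OpenAPI 3.1 spec is available, the endpoint inventory can be derived from the spec rather than maintained by hand. The sketch below is a generic helper that walks the standard `paths` object of any parsed OpenAPI document; it assumes the spec has already been downloaded and loaded with `json.load`, and the function name is hypothetical.

```python
# HTTP methods that OpenAPI allows as keys of a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list:
    """Return (METHOD, path, operationId) tuples for every operation in the spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            # Path items may also hold non-operation keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append(
                    (method.upper(), path, operation.get("operationId", ""))
                )
    return sorted(operations, key=lambda op: (op[1], op[0]))
```

Run against the downloaded spec, this yields one row per operation, which can be compared with the sync and async operation counts listed in the Capabilities section.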

Solutions

Standard

Pay-as-you-go credit plan, 15,000 free credits, then $0.015/credit; 1 RPS sync; up to 5 Studio seats.

Growth

Volume-discounted plan adding zero-data-retention, BAA, 10 RPS sync, up to 5 active priority requests, EU data residency, priority support, and unlimited Studio seats.

Enterprise

Adds VPC and on-premises deployment, custom MSA/SLA, dedicated support, RBAC, SSO/SAML, and 100+ RPS custom throughput.
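The per-plan sync rates above (1, 10, and 100+ RPS) are easier to stay within when the client throttles itself. The sketch below is a generic client-side token bucket, not a description of how Reducto enforces its limits; the class and the plan mapping are hypothetical, and 100 is used as a conservative floor for the Enterprise "100+" figure.

```python
import time
from typing import Callable, Optional

# Sync requests per second by plan, from the plan descriptions above.
PLAN_SYNC_RPS = {"standard": 1, "growth": 10, "enterprise": 100}


class TokenBucket:
    """Client-side throttle: about `rate` requests/second, bursts up to `capacity`."""

    def __init__(
        self,
        rate: float,
        capacity: Optional[float] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity if capacity is not None else rate
        self.tokens = self.capacity
        self.clock = clock
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller on the Growth plan could create `TokenBucket(PLAN_SYNC_RPS["growth"])` and, whenever `try_acquire()` returns False, sleep for roughly `1 / rate` seconds before retrying. The injectable `clock` exists so the behaviour can be tested without real delays.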
