Crawlee

Crawlee is an open-source web scraping and crawling library maintained by Apify, providing a unified set of crawler classes, request queues, datasets, and key-value stores for building reliable scrapers. It is available for both JavaScript/TypeScript (Node.js) and Python, offering HTTP, Cheerio, JSDOM, LinkeDOM, Puppeteer, Playwright, and Stagehand crawler implementations along with proxy and session management utilities for production-grade scraping.

Tags: Apache 2.0, Apify, Browser Automation, Crawlers, Harvesting, JavaScript, Node.js, Open Source, Playwright, Puppeteer, Python, Scraping, Web

aid: crawlee
name: Crawlee
kind: opensource
description: >-
  Crawlee is an open-source web scraping and crawling library maintained by
  Apify, providing a unified set of crawler classes, request queues,
  datasets, and key-value stores for building reliable scrapers. It is
  available for both JavaScript/TypeScript (Node.js) and Python, offering
  HTTP, Cheerio, JSDOM, LinkeDOM, Puppeteer, Playwright, and Stagehand
  crawler implementations along with proxy and session management
  utilities for production-grade scraping.
url: https://raw.githubusercontent.com/api-evangelist/crawlee/refs/heads/main/apis.yml
image: https://kinlane-productions2.s3.amazonaws.com/apis-json/apis-json-logo.jpg
tags:
  - Apache 2.0
  - Apify
  - Browser Automation
  - Crawlers
  - Harvesting
  - JavaScript
  - Node.js
  - Open Source
  - Playwright
  - Puppeteer
  - Python
  - Scraping
  - Web
created: '2025-02-08'
modified: '2026-04-28'
specificationVersion: '0.20'
type: Index
access: Public
position: Provider
apis:
  - aid: crawlee:crawlee-javascript-sdk
    name: Crawlee JavaScript SDK
    description: >-
      The Crawlee JavaScript SDK is a Node.js/TypeScript library for building
      reliable web scrapers and crawlers. It provides a family of crawler
      classes - BasicCrawler, HttpCrawler, CheerioCrawler, JSDOMCrawler,
      LinkeDOMCrawler, PuppeteerCrawler, PlaywrightCrawler, and
      AdaptivePlaywrightCrawler - along with shared infrastructure for
      AutoscaledPool resource management, proxy rotation, session pooling,
      RequestQueue task queuing, Dataset result storage, and KeyValueStore
      unstructured data persistence. Crawlee handles retries, error
      recovery, request fingerprinting, and statistics tracking out of the
      box, allowing developers to focus on extraction logic.
    humanURL: https://crawlee.dev/js
    properties:
      - type: Documentation
        url: https://crawlee.dev/js
      - type: Reference
        url: https://crawlee.dev/js/api
      - type: GettingStarted
        url: https://crawlee.dev/js/docs/quick-start
      - type: GitHubRepository
        url: https://github.com/apify/crawlee
      - type: NpmPackage
        url: https://www.npmjs.com/package/crawlee
    tags:
      - Browser Automation
      - Cheerio
      - JavaScript
      - Node.js
      - Playwright
      - Puppeteer
      - Scraping
      - TypeScript
  - aid: crawlee:crawlee-python-sdk
    name: Crawlee Python SDK
    description: >-
      The Crawlee Python SDK is a Python library for building reliable web
      scrapers and crawlers. It offers BasicCrawler, HttpCrawler,
      BeautifulSoupCrawler, ParselCrawler, PlaywrightCrawler, and
      AdaptivePlaywrightCrawler, all built on top of asyncio, along with
      shared infrastructure for proxy rotation, session pooling,
      RequestQueue, Dataset, and KeyValueStore. The Python SDK targets data
      engineers and Python developers who want the same crawler ergonomics
      as the JavaScript version but inside the Python ecosystem.
    humanURL: https://crawlee.dev/python
    properties:
      - type: Documentation
        url: https://crawlee.dev/python
      - type: Reference
        url: https://crawlee.dev/python/api
      - type: GettingStarted
        url: https://crawlee.dev/python/docs/quick-start
      - type: GitHubRepository
        url: https://github.com/apify/crawlee-python
      - type: PyPiPackage
        url: https://pypi.org/project/crawlee/
    tags:
      - Asyncio
      - BeautifulSoup
      - Browser Automation
      - Parsel
      - Playwright
      - Python
      - Scraping
common:
  - type: LinkedIn
    url: https://www.linkedin.com/company/apify
  - type: Website
    url: https://crawlee.dev/
  - type: Documentation
    url: https://crawlee.dev/
  - type: GitHubOrganization
    url: https://github.com/apify
  - type: GitHubRepository
    url: https://github.com/apify/crawlee
  - type: Blog
    url: https://crawlee.dev/blog
  - type: ChangeLog
    url: https://github.com/apify/crawlee/releases
  - type: Discord
    url: https://discord.gg/jyEM2PRvMU
  - type: Community
    url: https://crawlee.dev/discord
  - type: License
    url: https://github.com/apify/crawlee/blob/master/LICENSE.md
  - type: Apify
    url: https://apify.com/
maintainers:
  - FN: Kin Lane
    email: [email protected]
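
The JavaScript SDK description above lists request fingerprinting and RequestQueue task queuing among the things Crawlee handles. As a rough illustration of that idea only, the following standard-library Python sketch deduplicates URLs by a normalized fingerprint. The `fingerprint` function, the `DedupQueue` class, and the normalization rules are hypothetical; they are not Crawlee's API and are not claimed to match its actual algorithm.

```python
import hashlib
from collections import deque
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def fingerprint(url: str, method: str = "GET") -> str:
    """Return a stable key for a request (assumed normalization rules)."""
    parts = urlsplit(url)
    # Sort query parameters so that parameter order does not matter.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    # Lowercase scheme and host, default an empty path to "/", drop the fragment.
    normalized = urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", query, "")
    )
    return hashlib.sha256(f"{method.upper()} {normalized}".encode()).hexdigest()


class DedupQueue:
    """Toy FIFO queue that skips URLs whose fingerprint was already seen."""

    def __init__(self) -> None:
        self._pending = deque()
        self._seen = set()

    def add(self, url: str) -> bool:
        """Enqueue the URL; return False if an equivalent one was added before."""
        key = fingerprint(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._pending.append(url)
        return True

    def pop(self) -> Optional[str]:
        """Return the next pending URL, or None when the queue is empty."""
        return self._pending.popleft() if self._pending else None


if __name__ == "__main__":
    queue = DedupQueue()
    for candidate in (
        "https://Example.com/a?b=2&a=1",
        "https://example.com/a?a=1&b=2#top",
    ):
        print(candidate, "queued" if queue.add(candidate) else "skipped")
```

In this sketch, two URLs that differ only in host case, query-parameter order, or fragment map to the same key, so the second `add` call is skipped. Which differences a real crawler should ignore is a design choice, and the rules used here are only one assumption.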