Crawlee

Crawlee is an open-source web scraping and crawling library maintained by Apify, providing a unified set of crawler classes, request queues, datasets, and key-value stores for building reliable scrapers. It is available for both JavaScript/TypeScript (Node.js) and Python, offering HTTP, Cheerio, JSDOM, LinkeDOM, Puppeteer, Playwright, and Stagehand crawler implementations along with proxy and session management utilities for production-grade scraping.

2 APIs 0 Features
Apache 2.0, Apify, Browser Automation, Crawlers, Harvesting, JavaScript, Node.js, Open Source, Playwright, Puppeteer, Python, Scraping, Web

APIs

Crawlee JavaScript SDK

The Crawlee JavaScript SDK is a Node.js/TypeScript library for building reliable web scrapers and crawlers. It provides a family of crawler classes - BasicCrawler, HttpCrawler, ...

Crawlee Python SDK

The Crawlee Python SDK is a Python library for building reliable web scrapers and crawlers. It offers BasicCrawler, HttpCrawler, BeautifulSoupCrawler, ParselCrawler, PlaywrightC...

Resources

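Website: https://crawlee.dev/
Documentation: https://crawlee.dev/
GitHubOrganization: https://github.com/apify
GitHubRepository: https://github.com/apify/crawlee
Blog: https://crawlee.dev/blog
ChangeLog: https://github.com/apify/crawlee/releases
Discord: https://discord.gg/jyEM2PRvMU
Community: https://crawlee.dev/discord
License: https://github.com/apify/crawlee/blob/master/LICENSE.md
Apify: https://apify.com/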

Sources

apis.yml
aid: crawlee
name: Crawlee
x-type: opensource
description: >-
  Crawlee is an open-source web scraping and crawling library maintained by
  Apify, providing a unified set of crawler classes, request queues,
  datasets, and key-value stores for building reliable scrapers. It is
  available for both JavaScript/TypeScript (Node.js) and Python, offering
  HTTP, Cheerio, JSDOM, LinkeDOM, Puppeteer, Playwright, and Stagehand
  crawler implementations along with proxy and session management
  utilities for production-grade scraping.
url: https://raw.githubusercontent.com/api-evangelist/crawlee/refs/heads/main/apis.yml
image: https://kinlane-productions2.s3.amazonaws.com/apis-json/apis-json-logo.jpg
tags:
  - Apache 2.0
  - Apify
  - Browser Automation
  - Crawlers
  - Harvesting
  - JavaScript
  - Node.js
  - Open Source
  - Playwright
  - Puppeteer
  - Python
  - Scraping
  - Web
created: '2025-02-08'
modified: '2026-04-28'
specificationVersion: '0.20'
type: Index
access: Public
position: Provider
apis:
  - aid: crawlee:crawlee-javascript-sdk
    name: Crawlee JavaScript SDK
    description: >-
      The Crawlee JavaScript SDK is a Node.js/TypeScript library for building
      reliable web scrapers and crawlers. It provides a family of crawler
      classes - BasicCrawler, HttpCrawler, CheerioCrawler, JSDOMCrawler,
      LinkeDOMCrawler, PuppeteerCrawler, PlaywrightCrawler, and
      AdaptivePlaywrightCrawler - along with shared infrastructure for
      AutoscaledPool resource management, proxy rotation, session pooling,
      RequestQueue task queuing, Dataset result storage, and KeyValueStore
      unstructured data persistence. Crawlee handles retries, error
      recovery, request fingerprinting, and statistics tracking out of the
      box, allowing developers to focus on extraction logic.
    humanURL: https://crawlee.dev/js
    properties:
      - type: Documentation
        url: https://crawlee.dev/js
      - type: Reference
        url: https://crawlee.dev/js/api
      - type: GettingStarted
        url: https://crawlee.dev/js/docs/quick-start
      - type: GitHubRepository
        url: https://github.com/apify/crawlee
      - type: NpmPackage
        url: https://www.npmjs.com/package/crawlee
    tags:
      - Browser Automation
      - Cheerio
      - JavaScript
      - Node.js
      - Playwright
      - Puppeteer
      - Scraping
      - TypeScript
  - aid: crawlee:crawlee-python-sdk
    name: Crawlee Python SDK
    description: >-
      The Crawlee Python SDK is a Python library for building reliable web
      scrapers and crawlers. It offers BasicCrawler, HttpCrawler,
      BeautifulSoupCrawler, ParselCrawler, PlaywrightCrawler, and Adaptive
      crawlers built on top of asyncio, along with shared infrastructure
      for proxy rotation, session pooling, RequestQueue, Dataset, and
      KeyValueStore. The Python SDK targets data engineers and Python
      developers who want the same crawler ergonomics as the JavaScript
      version but inside the Python ecosystem.
    humanURL: https://crawlee.dev/python
    properties:
      - type: Documentation
        url: https://crawlee.dev/python
      - type: Reference
        url: https://crawlee.dev/python/api
      - type: GettingStarted
        url: https://crawlee.dev/python/docs/quick-start
      - type: GitHubRepository
        url: https://github.com/apify/crawlee-python
      - type: PyPiPackage
        url: https://pypi.org/project/crawlee/
    tags:
      - Asyncio
      - BeautifulSoup
      - Browser Automation
      - Parsel
      - Playwright
      - Python
      - Scraping
common:
  - type: Website
    url: https://crawlee.dev/
  - type: Documentation
    url: https://crawlee.dev/
  - type: GitHubOrganization
    url: https://github.com/apify
  - type: GitHubRepository
    url: https://github.com/apify/crawlee
  - type: Blog
    url: https://crawlee.dev/blog
  - type: ChangeLog
    url: https://github.com/apify/crawlee/releases
  - type: Discord
    url: https://discord.gg/jyEM2PRvMU
  - type: Community
    url: https://crawlee.dev/discord
  - type: License
    url: https://github.com/apify/crawlee/blob/master/LICENSE.md
  - type: Apify
    url: https://apify.com/
maintainers:
  - FN: Kin Lane
    email: [email protected]
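
The index keeps links in two places: each entry under `apis` has its own `properties` list, and the top-level `common` list holds links shared by the whole provider. As a minimal sketch of how a consumer might look up a link by its `type`, the Python below uses a hand-copied subset of the index as a plain dictionary (not parsed from the file). The helper name `find_url` and the rule of falling back to `common` when an API has no matching property are assumptions made for illustration; they are not part of the index format itself.

```python
from typing import Optional

# Hand-copied subset of the apis.yml index above; a real consumer would
# load the full file with a YAML parser instead.
INDEX = {
    "aid": "crawlee",
    "apis": [
        {
            "aid": "crawlee:crawlee-javascript-sdk",
            "properties": [
                {"type": "Documentation", "url": "https://crawlee.dev/js"},
                {"type": "GettingStarted", "url": "https://crawlee.dev/js/docs/quick-start"},
                {"type": "NpmPackage", "url": "https://www.npmjs.com/package/crawlee"},
            ],
        },
        {
            "aid": "crawlee:crawlee-python-sdk",
            "properties": [
                {"type": "Documentation", "url": "https://crawlee.dev/python"},
                {"type": "GettingStarted", "url": "https://crawlee.dev/python/docs/quick-start"},
                {"type": "PyPiPackage", "url": "https://pypi.org/project/crawlee/"},
            ],
        },
    ],
    "common": [
        {"type": "Website", "url": "https://crawlee.dev/"},
        {"type": "Blog", "url": "https://crawlee.dev/blog"},
        {"type": "License", "url": "https://github.com/apify/crawlee/blob/master/LICENSE.md"},
    ],
}


def find_url(index: dict, prop_type: str, api_aid: Optional[str] = None) -> Optional[str]:
    """Return the first URL of the given type (hypothetical helper).

    If api_aid is given, that API's own properties are searched first.
    The fallback to the shared `common` list is an assumed convention.
    """
    if api_aid is not None:
        for api in index.get("apis", []):
            if api.get("aid") != api_aid:
                continue
            for prop in api.get("properties", []):
                if prop.get("type") == prop_type:
                    return prop.get("url")
    # No API-specific match (or no API requested): try the shared links.
    for prop in index.get("common", []):
        if prop.get("type") == prop_type:
            return prop.get("url")
    return None


if __name__ == "__main__":
    print(find_url(INDEX, "GettingStarted", "crawlee:crawlee-python-sdk"))
    print(find_url(INDEX, "Blog", "crawlee:crawlee-javascript-sdk"))
```

Searching the API-specific list first means a per-SDK link such as `GettingStarted` wins over any shared link of the same type, while provider-wide links such as `Blog` remain reachable from either SDK entry.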