Command-line interface for Firecrawl. Scrape, crawl, and extract data from any website directly from your terminal.
Device detection on-premise services for the 51Degrees Pipeline API
Local MCP for web search, direct-URL fetch, compact normalized output, site crawling, and docs export.
AI-powered web crawler — self-hosted Firecrawl alternative. Crawl, extract, render JS, search. CLI + MCP + REST API + Dashboard.
MCP server for crawling and indexing web documentation - works with any website
Cloudflare Radar verified bots directory - 500+ web crawlers, search engine bots, and user agents as JSON for bot detection and traffic filtering
Node.js wrapper for Obscura — the Rust-powered headless browser for AI agents and web scraping. Drop-in replacement for headless Chrome with Puppeteer and Playwright. Ultra-lightweight (~30 MB), instant startup, built-in stealth mode.
Curated, sourced list of AI crawler / training bot user agents, plus a small CLI to test whether a URL is reachable to each bot.
Lightweight cloud SDK for Crawl4AI - mirrors the OSS API
Native-speed, browserless web crawler + MCP server for AI agents. A standalone Rust engine (turbo-dom + a V8 JS-render tier) — fetch + parse + extract + run page JS with no headless browser. This npm package is a thin launcher that spawns the native binar
Crawl a website and emit a valid sitemap.xml, for sites that ship without one or whose CMS pipeline failed to generate one.
A crawler and actionable audit utility for SEO and GEO website analysis.
Official TypeScript/JavaScript SDK for the Synoppy web-data API: scrape, screenshot, crawl, map, extract, classify, enrich, and images, all with one API key.
MCP server for the Synoppy web-data API — read, screenshot, crawl, map, search, extract, classify, enrich, and images as tools for any MCP client.
CLI tool for scoping documentation sites for migration to Mintlify with support for repeated --scope-prefix filters
Xiaohongshu creator CLI (fork: config directory unified to ~/.config/xhs-cli, consistent with the xianyu-auto skill convention)
An event-driven web scraping library with polling functionality built on top of Puppeteer.
deepcrawl CLI