@ariedotme/pi-firecrawl-web

Licence: MIT
Version: 0.1.2
Dependencies: 2
Size: 23 kB
Vulnerabilities: 0
Weekly downloads: 383

pi-firecrawl-web

A small Pi TypeScript extension that registers LLM-callable web tools powered only by Firecrawl:

  • web_search — search the web with Firecrawl Search.
  • fetch_content — scrape and clean one URL with Firecrawl Scrape.
  • extract_structured — extract structured JSON from one or more URLs with Firecrawl Extract.

It validates URL protocols, avoids logging secrets, keeps search defaults small (limit 5, capped at 20) to conserve credits in agent workflows, and returns human-readable text plus structured details.
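The protocol check mentioned above can be sketched as follows. This is a minimal illustration, not the extension's actual code: the helper name assertHttpUrl and the error messages are assumptions, and only the standard URL class is used.

```typescript
// Hypothetical helper: accept only http: and https: URLs.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

function assertHttpUrl(input: string): URL {
  let parsed: URL;
  try {
    // The URL constructor throws on strings that are not absolute URLs.
    parsed = new URL(input);
  } catch {
    throw new Error(`Invalid URL: ${input}`);
  }
  if (!ALLOWED_PROTOCOLS.has(parsed.protocol)) {
    throw new Error(`Unsupported protocol "${parsed.protocol}"; use http or https`);
  }
  return parsed;
}
```

Rejecting schemes such as file: or javascript: before a request is built keeps an agent from being steered toward local files or non-web resources.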

Install locally

npm install
pi -e ./index.ts

Install as a Pi package

From npm (the package is published under the @ariedotme scope):

pi install npm:@ariedotme/pi-firecrawl-web

Configure Firecrawl

Firecrawl Cloud

For Firecrawl Cloud, leave FIRECRAWL_BASE_URL unset and provide an API key:

export FIRECRAWL_API_KEY=***

Firecrawl self-hosted without authentication

Point the extension at your server and omit the key:

export FIRECRAWL_BASE_URL=http://localhost:3002
unset FIRECRAWL_API_KEY

No Authorization header is sent when FIRECRAWL_API_KEY is absent.

Firecrawl self-hosted with authentication

Set both variables:

export FIRECRAWL_BASE_URL=https://firecrawl.example.com
export FIRECRAWL_API_KEY=***

When the key is present, the SDK sends Authorization: Bearer <key>.
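The header behaviour described in the two self-hosted setups can be sketched like this. In the extension the Firecrawl SDK sends the header, so buildAuthHeaders is a hypothetical stand-in that only illustrates the rule. Treating a blank key as absent is an assumption.

```typescript
// Hypothetical stand-in for what the SDK does with the key.
function buildAuthHeaders(apiKey: string | undefined): Record<string, string> {
  const headers: Record<string, string> = {};
  // Assumption: an empty or whitespace-only key counts as "no key".
  const key = apiKey?.trim();
  if (key) {
    headers["Authorization"] = `Bearer ${key}`;
  }
  return headers;
}
```

Called with undefined, the function returns an empty object, which matches the unauthenticated self-hosted case where no Authorization header is sent.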

You may also configure the extension directly:

{
  "firecrawlBaseUrl": "https://firecrawl.example.com",
  "firecrawlApiKey": "fc-..."
}

The extension never hardcodes or prints the API key. API keys are optional for self-hosted Firecrawl deployments that do not require authentication.

Slash command

/firecrawl-status

Shows the base URL, whether authentication is configured, and which tools were registered, without revealing the key.
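A status report of this kind might be assembled as in the sketch below. The FirecrawlSettings shape, the describeStatus name, and the output wording are all assumptions. The point is that only the presence of the key is reported, never its value.

```typescript
// Hypothetical settings shape; field names are illustrative only.
interface FirecrawlSettings {
  baseUrl?: string;
  apiKey?: string;
}

function describeStatus(settings: FirecrawlSettings, tools: string[]): string {
  const baseUrl = settings.baseUrl ?? "(Firecrawl Cloud default)";
  // Report a yes/no fact about the key, never the key itself.
  const auth = settings.apiKey ? "configured" : "not configured";
  return [
    `Base URL: ${baseUrl}`,
    `Authentication: ${auth}`,
    `Tools: ${tools.join(", ")}`,
  ].join("\n");
}
```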

Usage examples

web_search({ query: "latest React compiler docs", limit: 5 })

web_search({ query: "OpenAI reasoning models", includeContent: true })

fetch_content({ url: "https://docs.firecrawl.dev", formats: ["markdown"] })

extract_structured({
  urls: ["https://example.com"],
  prompt: "Extract pricing plans",
  schema: {
    type: "object",
    properties: {
      plans: {
        type: "array",
        items: {
          type: "object",
          properties: {
            name: { type: "string" },
            price: { type: "string" },
            features: { type: "array", items: { type: "string" } }
          }
        }
      }
    }
  }
})

Tool notes

web_search

Parameters include query, limit (default 5, max 20), sources, categories, domain filters, country/location, and optional scraped content (markdown or summary). Results are normalized as query, web, news, and images in details.normalized.
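The default and maximum for limit could be enforced with a small clamp, sketched here. The resolveLimit name is hypothetical. The lower bound of 1 and the choice to clamp rather than reject out-of-range values are assumptions, since the README states only the default (5) and the maximum (20).

```typescript
const DEFAULT_LIMIT = 5; // documented default
const MAX_LIMIT = 20;    // documented maximum

// Hypothetical helper: fall back to the default, then clamp into [1, MAX_LIMIT].
function resolveLimit(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_LIMIT;
  }
  return Math.min(MAX_LIMIT, Math.max(1, Math.floor(requested)));
}
```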

fetch_content

Supports the markdown, summary, html, rawHtml, links, images, screenshot, json, question, and highlights formats. Large visible output is truncated, except json output, which is returned in compact form so the JSON text is never cut partway through.
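One way to get this behaviour is sketched below. The renderVisible name, the maxChars parameter, and the truncation notice are assumptions. The extension's real threshold and wording are not stated here.

```typescript
// Hypothetical renderer: truncate text formats, but never slice JSON.
function renderVisible(format: string, value: unknown, maxChars: number): string {
  if (format === "json") {
    // Compact serialization (no indentation) keeps the payload small
    // while leaving it parseable as a whole.
    return JSON.stringify(value);
  }
  const text = String(value);
  if (text.length <= maxChars) {
    return text;
  }
  const dropped = text.length - maxChars;
  return `${text.slice(0, maxChars)}\n[truncated ${dropped} characters]`;
}
```

Cutting a JSON string at an arbitrary character would leave the model with unparseable output, which is why the json branch skips the length check in this sketch.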

extract_structured

Requires either prompt or schema. Supports multiple HTTP/HTTPS URLs, optional web search, sources, and safe scrape options. Wildcard URLs trigger a warning because /* can consume many credits.
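These argument rules could be checked along the following lines. The ExtractArgs shape mirrors the usage example earlier in this README, while checkExtractArgs and its messages are hypothetical. Treating only a trailing /* as a wildcard is an assumption.

```typescript
interface ExtractArgs {
  urls: string[];
  prompt?: string;
  schema?: Record<string, unknown>;
}

// Hypothetical validator: throws on invalid input, returns non-fatal warnings.
function checkExtractArgs(args: ExtractArgs): string[] {
  if (!args.prompt && !args.schema) {
    throw new Error("extract_structured requires a prompt or a schema");
  }
  const warnings: string[] = [];
  for (const raw of args.urls) {
    const url = new URL(raw); // throws on malformed input
    if (url.protocol !== "http:" && url.protocol !== "https:") {
      throw new Error(`Only HTTP/HTTPS URLs are supported: ${raw}`);
    }
    // Assumption: a trailing /* marks a wildcard crawl.
    if (raw.endsWith("/*")) {
      warnings.push(`${raw} is a wildcard and may consume many credits`);
    }
  }
  return warnings;
}
```

Returning warnings instead of throwing lets the tool proceed with a wildcard request while still surfacing the credit cost to the caller.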

Development

npm install
npm run typecheck
npm test

Manual local load:

pi -e ./index.ts

If FIRECRAWL_API_KEY is unavailable, run the mock tests; they validate URL/domain guards, response formatting, and friendly missing-key behavior without calling Firecrawl.
