npm.io
0.1.25 • Published yesterdayCLI

browser4-cli

Licence
Apache-2.0
Version
0.1.25
Deps
0
Size
74.3 MB
Vulns
0
Weekly
1.1K
Install scriptsThis package runs scripts during installation (preinstall/install/postinstall)

Browser4

Make websites accessible for AI agents. Automate tasks online with ease.

Installation

Installs the native Rust binary:

npm install -g browser4-cli

After installation, use browser4-cli. The shorter browser4 command remains available as a compatibility alias.

Project Installation (local dependency)

For projects that want to pin the version in package.json:

npm install browser4-cli

Then use via package.json scripts or by invoking browser4-cli directly.

Standalone Installer Scripts (no npm needed)

Bootstrap the native binary directly with a single command:

Windows (PowerShell):

Invoke-WebRequest -Uri "https://browser4.oss-cn-beijing.aliyuncs.com/scripts/install-browser4-cli.ps1" -OutFile "$env:TEMP\install-browser4-cli.ps1"
powershell -ExecutionPolicy Bypass -File "$env:TEMP\install-browser4-cli.ps1"

Linux / macOS (bash):

curl -fsSL https://browser4.oss-cn-beijing.aliyuncs.com/scripts/install-browser4-cli.sh | bash

Or run the scripts locally from a cloned repo:

  • cli/scripts/install-browser4-cli.ps1 (Windows)
  • cli/scripts/install-browser4-cli.sh (Linux / macOS / Git Bash)

See Standalone CLI Installer Scripts for the full option reference, supported platforms, and examples.

From Source

Build prerequisites: Rust (stable, edition 2021), Node.js 24+, pnpm 10+, git.

git clone https://github.com/platonai/Browser4.git
cd Browser4/cli/browser4-cli
pnpm install
pnpm build:native   # Compiles the Rust binary (requires https://rustup.rs)
pnpm link --global  # Makes browser4-cli available globally

Cross-compilation from Linux (for release builds) additionally requires: cargo-zigbuild, Zig 0.13.0, gcc-aarch64-linux-gnu, and mingw-w64. See cli/docker/Dockerfile.build for a Dockerized build environment.

Requirements
  • Chrome — Latest Chrome installed on your system. The CLI can auto-install Chrome on most platforms when missing.
  • Java 17+ — Required to run the Browser4 backend (Browser4.jar). Eclipse Temurin recommended. JDK 21+ enables best jlink compression when the CLI auto-builds a runtime bundle from source.
  • Rust — Only needed when building the CLI from source (see From Source above). The stable toolchain (edition 2021) is sufficient.
Additional requirements for auto-building a runtime bundle from source

When the CLI detects a Browser4 repository checkout, it attempts to build a self-contained runtime bundle (bundled JRE + dependency JARs) from source instead of downloading a pre-built release. This requires:

Tool Version Linux macOS Windows
Maven 3.9+ via mvnw wrapper via mvnw wrapper via mvnw.cmd wrapper
JDK tools (jdeps, jlink) bundled with JDK 16+ included in JDK included in JDK included in JDK
PowerShell 7 (pwsh) 7.0+ required required built-in (powershell.exe)
tar any required required built-in

Set BROWSER4_CLI_FORCE_REMOTE_BUNDLE=1 to skip the local build and always download a pre-built bundle — useful in CI / corporate environments where Maven or jlink are unavailable.

Usage

browser4-cli <command> [args] [options]
browser4-cli -s=<session> <command> [args] [options]
Global options
Flag Description
-h, --help [command] Print help (optionally for a specific command)
-v, --version Print version
-s=<name> Named session label
--server=<url> Override Browser4 server URL
--proxy=<url> Manual HTTP proxy override for downloads
--json Emit machine-parseable JSON to stdout
-q, --quiet Suppress normal output, only show errors

--json switches every command's stdout from human-readable text to a single-line JSON envelope ({"status":"ok","command":"<name>","output":{...}}). Omit --json for the default human-readable output.

--proxy=<url> sets a manual HTTP proxy used only for downloading the Browser4 runtime bundle. It does not affect browser traffic.

-q / --quiet suppresses all normal stdout output. Errors and progress messages still go to stderr. Combine with --json for silent-on-success scripting: browser4-cli --json -q open.

Sessions are persisted independently per name. Omitting -s uses the default session (~/.browser4/cli-state.json). With -s=<name>, a separate state file is stored under ~/.browser4/sessions/<name>.json. open without -s reuses the default session if one exists; with -s=<name> it switches to or creates the named session.

Commands

The tables below mirror the commands surfaced by the global browser4-cli help overview.

Core
Command Description
open [url] Open or switch to a browser session. Supports --headed (force visible window), --headless (force headless), --profile=<path>, --profile-mode=<mode> (temporary / sequential / default), and --interact-level=<level> (FASTEST / FAST / DEFAULT).
close Close the active session
goto <url> Navigate to a URL, auto-opening or refreshing the session if needed
click <ref> [button] Click an element. Supports --modifiers (modifier keys to press).
dblclick <ref> [button] Double-click an element. Supports --modifiers (modifier keys to press).
type <text> [ref] Type text into the focused element or an optional target element. Supports --submit (press Enter after typing).
fill <ref> <text> Fill text into an editable element. Supports --submit (press Enter after filling).
hover <ref> Hover over an element
select <ref> <val> Select an option in a dropdown
upload <ref> <file> Upload a file
check <ref> Check a checkbox or radio button
uncheck <ref> Uncheck a checkbox or radio button
drag <startRef> <endRef> Drag and drop between two elements
snapshot Capture accessibility snapshot. Supports --filename=<path> to save to file, --boxes for bounding boxes, -i/--interactive for interactive-only, -u/--urls for link URLs, -c/--compact for compact output, -d/--depth <n> for depth limiting, -s/--selector <sel> for CSS scoping, and --raw for raw output without page info.
eval <expression> [ref] Evaluate JavaScript on the page or a target element. Use --file=<path> to read the expression from a file.
get <mode> <selector> [name] Extract data from a page element. Modes: text, html, box, styles, property, attr. name is required for property and attr.
scroll <direction> <pixels> Scroll the page. Direction: up, down, left, or right.
wait [target] Wait for a condition. Positional: selector or milliseconds. Options: --text=<text>, --url=<glob>, --load=<state> (networkidle/domcontentloaded), --fn=<JS expr>.
dialog-accept [prompt] Accept a dialog
dialog-dismiss Dismiss a dialog
resize <w> <h> Resize the browser window
delete-data Delete session data
Navigation
Command Description
go-back Go back to the previous page
go-forward Go forward to the next page
reload Reload the current page
Keyboard
Command Description
press <key> [ref] Press a key on the focused element or an optional target element
keydown <key> Press and hold a key
keyup <key> Release a key
Mouse
Command Description
mousemove <x> <y> Move mouse to coordinates
mousedown [button] Press mouse button
mouseup [button] Release mouse button
mousewheel <dx> <dy> Scroll the mouse wheel
Screenshots
Command Description
screenshot [ref] Take a screenshot (optionally of a specific element). Supports --filename=<path> and --full-page (full scrollable page).
pdf Save the current page as a PDF. Supports --filename=<path>.
Tabs
Command Description
tab-list List all tabs with their zero-based index
tab-new [url] Create a new tab
tab-close [index] Close a tab by its zero-based index
tab-select <index> Select a tab by its zero-based index

Run tab-list first to find each tab's zero-based index. Pass that index to tab-select or tab-close.

Browser storage
Command Description
state-save <path> Save cookies and localStorage to a JSON file
state-load <path> Restore cookies and localStorage from a saved state file
cookie-list List all cookies (optionally filtered by --domain / --path)
cookie-get <name> Get a cookie by name
cookie-set <name> <value> Set a cookie. Supports --path, --domain, --expires, --httpOnly, --secure, --sameSite (Strict / Lax / None).
cookie-delete <name> Delete a cookie by name
cookie-clear Clear all cookies for the current page
localstorage-list List all localStorage entries
localstorage-get <key> Get a localStorage value by key
localstorage-set <key> <value> Set a localStorage key-value pair
localstorage-delete <key> Delete a localStorage key
localstorage-clear Clear all localStorage entries
sessionstorage-list List all sessionStorage entries
sessionstorage-get <key> Get a sessionStorage value by key
sessionstorage-set <key> <value> Set a sessionStorage key-value pair
sessionstorage-delete <key> Delete a sessionStorage key
sessionstorage-clear Clear all sessionStorage entries
Browser sessions
Command Description
list List browser sessions. Supports --all (list sessions across all workspaces).
close-all Close all browser sessions without stopping Browser4.jar / the Browser4 backend
kill-all Forcefully stop Browser4.jar / the Browser4 backend and kill Browser4 browser processes

Use close-all for session cleanup when you want to keep the current Browser4 service running. Use kill-all only when you explicitly want to stop the backend and clean up tracked Browser4 processes.

Server management
Command Description
install Download the Browser4 runtime bundle. Supports --tag=<version> to pin a release and --force to reinstall even when already present.
upgrade Upgrade the Browser4 runtime bundle to the latest version or a specified --tag
uninstall Remove globally installed browser4-cli (npm) and its runtime data. Supports -y / --yes (skip confirmation) and --dry-run (preview).
stop Kill the Browser4 backend after closing all sessions
status Check whether the Browser4 backend is reachable and healthy

install and upgrade both manage the Browser4 runtime bundle — a self-contained distribution that includes all dependency jars, a minimal jlink-built JRE, and platform launcher scripts. Neither requires cargo or a Rust toolchain; the runtime is a Java application downloaded from GitHub Releases.

Use --tag=<version> to pin a specific release (e.g. --tag=v4.9.3). Use --force to reinstall even when the same version is already present.

When a local Browser4 checkout is detected with the browser4-bundle module present, install and upgrade auto-build the runtime bundle from source (via Maven) instead of downloading.

uninstall attempts to remove browser4-cli from npm global packages and deletes the Browser4 runtime data and cache directories. It prompts for confirmation unless -y / --yes is passed. Use --dry-run to preview what would be removed without making changes. Does not require a running server.

browser4-cli uninstall
browser4-cli uninstall -y
browser4-cli uninstall --dry-run
Advanced commands

These commands are intentionally omitted from the global browser4-cli help overview. Query browser4-cli help <command> for the exact syntax when you need them.

Command Description
batch [command...] Execute multiple commands in one invocation. Only DOM operations are supported (Core, Navigation, Keyboard, Mouse, Export, Tabs categories). Commands like open, close, list, agent run, etc. are not allowed in batch mode.
console [min-level] List console messages
extract <instruction> Extract structured data from the current page. Supports --schema (JSON Schema for typed output).
summarize [instruction] Summarize page content using AI
agent run <task> Run an autonomous agent task
agent status <id> Check the status of a running agent task
agent result <id> Get the result of a completed agent task
swarm create Create a swarm scrape session with parallel browser contexts
swarm submit [url] Submit URL(s) or raw X-SQL payloads as scrape jobs
swarm query <url> Run an X-SQL query against a loaded webpage
swarm status <id> Check the status of a scrape or query job
swarm result <id> Get the result of a completed job
crawl <url> Crawl a website from a seed URL, following links up to a configurable depth

Agent task workflow (agent <subcommand>)

The agent-* commands wrap the backend command agent's asynchronous task API. They are useful when you want Browser4 to plan and execute a natural-language task in the background instead of issuing one low-level browser action at a time.

Like other advanced commands, they are intentionally omitted from the global browser4-cli help overview. Query browser4-cli help agent run (or agent status / agent result) when you need the exact syntax.

Use the spaced agent <subcommand> form:

browser4-cli agent run "Open browser4.io and summarize the hero section"
browser4-cli agent status agent-task-1
browser4-cli agent result agent-task-1
Command lifecycle
Step Command What it does
1 agent run <task> Submits an asynchronous natural-language task through command_run and prints the returned task ID
2 agent status <id> Fetches the latest task status payload through command_status
3 agent result <id> Fetches the completed task result payload through command_result
Notes
  • agent run is asynchronous: it returns immediately after the backend accepts the task and prints a follow-up agent status command with the generated task ID.
  • agent status prints the backend status payload as-is. In practice this is a JSON object that commonly includes fields such as id, status, statusCode, processState, message, agentState, agentHistory, and commandResult.
  • agent result prints the backend result payload as-is. Depending on the task, it may be plain text or structured JSON.
  • These commands are task-ID based and do not require an active CLI browser session slot. The global -s=<name> option is therefore usually not relevant for agent-* follow-up calls.
  • agent subcommands are not supported inside batch mode.
  • agent run performs a short post-submit status probe so obvious missing-LLM configuration failures can be surfaced immediately instead of leaving you with a task ID that will never succeed.
Use cases
1. Submit an autonomous agent task
browser4-cli agent run "Open browser4.io and summarize the hero section"

Typical output:

Task submitted: agent-task-1
Use 'browser4-cli agent status agent-task-1' to check progress.
2. Poll task progress
browser4-cli agent status agent-task-1

Example status payload:

{"id":"agent-task-1","status":"RUNNING"}

On a real Browser4 backend the payload can be richer and may include lifecycle details such as processState, agent history snapshots, or an embedded partial commandResult.

3. Read the final result
browser4-cli agent result agent-task-1

If the backend returns a structured CommandResult, expect fields such as summary, pageSummary, fields, links, or xsqlResultSet.

Swarm scrape workflow (swarm <subcommand>)

The swarm subcommands support a swarm scrape workflow where one CLI session coordinates multiple browser contexts in the Browser4 backend.

Command overview
Command Purpose Backend endpoint
swarm create Create a swarm scrape session POST /api/swarm
swarm submit <url> Scrape URLs or submit raw X-SQL POST /api/swarm/submit
swarm query <url> Run X-SQL queries against loaded pages POST /api/swarm/query
swarm status <id> Poll job status GET /api/swarm/{id}/status
swarm result <id> Fetch completed job result GET /api/swarm/{id}/result
crawl <url> Start a website crawl POST /api/crawl
(poll) Track crawl progress GET /api/crawl/{id}/result
URL scraping with swarm submit
# create a session
browser4-cli swarm create \
  --profile-mode=TEMPORARY \
  --max-open-tabs=12 \
  --max-browser-contexts=3 \
  --display-mode=HEADLESS

# submit URLs as scrape jobs
browser4-cli swarm submit https://example.com/direct \
  --seed-file=./swarm-seeds.txt \
  --deadline=2026-03-30T00:00:00Z \
  --expires=1d \
  --refresh --store-content

# poll and fetch the result
browser4-cli swarm status scrape-task-4
browser4-cli swarm result scrape-task-4
X-SQL queries with swarm query

Run structured X-SQL queries against loaded webpages to extract data.

# Inline query:
browser4-cli swarm query "https://www.amazon.com/dp/B08PP5MSVB" --sql "
  SELECT
    dom_base_uri(dom) AS url,
    dom_first_text(dom, '#productTitle') AS title,
    dom_first_slim_html(dom, 'img:expr(width > 400)') AS img
  FROM load_and_select(@url, 'body');
"

# From a file:
browser4-cli swarm query "https://www.amazon.com/dp/B08PP5MSVB" --sql @query.sql

# With seed file and load options:
browser4-cli swarm query --sql @query.sql --seed-file=./urls.txt --refresh
Notes
  • swarm create accepts backend capability hints: --profile-mode, --max-open-tabs, --max-browser-contexts, --display-mode.
  • swarm submit and swarm query both accept a positional URL, --seed-file, or both. Seed files use one URL per line; # comments and blank lines are ignored.
  • Both commands support load-option flags: --deadline, --expires, --refresh, --parse, --store-content.
  • swarm query --sql is required; swarm submit --sql also works as a convenience. Use @url in the X-SQL template; it is replaced with the target URL server-side.
  • Prefix the --sql value with @ to read from a file (e.g. --sql @query.sql).
  • All commands return a task ID; use swarm status / swarm result to track progress.

Crawl (crawl)

Recursive website crawling from a seed URL.

Command overview
Command Purpose Backend endpoint
crawl <url> Crawl a website from a seed URL POST /api/crawl
(poll) Track crawl progress GET /api/crawl/{id}/result
Basic crawling
# depth=1: extract all links from homepage, load each linked page
browser4-cli crawl "https://example.com" --out-link-selector "a[href]"

# depth=2: follow links two levels deep, filter by pattern
browser4-cli crawl "https://shop.example.com" \
  --depth 2 \
  --out-link-selector "a.product-link" \
  --out-link-pattern "/product/" \
  --top-links 10

# with LoadOptions passthrough
browser4-cli crawl "https://example.com" -ol "a[href]" -a "-refresh -nMaxRetry 5"
Key flags
Flag Default Description
-d, --depth 1 Maximum crawl depth
-ol, --out-link-selector CSS selector to extract links from each page
-olp, --out-link-pattern .+ Regex pattern to filter extracted links
-tl, --top-links 20 Maximum links to extract per page
-a, --args Additional LoadOptions passthrough
--refresh Force a fresh fetch, ignoring cache
--parse Parse each page immediately after fetching
--expires Cache expiration duration
--store-content Persist page content to storage
--page-load-timeout Maximum time to wait for page load
--readonly Non-destructive mode
Notes
  • All LoadOptions flags available via --args passthrough (e.g. -a "-nMaxRetry 5 -lazyFlush").
  • Depth=1 reuses PulsarSession.submitForOutPages; depth>1 uses BFS continuous crawl with visited-URL dedup.
  • CLI timeout: 600s default, configurable via BROWSER4_CLI_CRAWL_TIMEOUT_SECS.
  • Duplicate URLs are skipped (normalized: lowercase, no trailing slash, no query string).

DOM Snapshot commands (domsnapshot <subcommand>)

The domsnapshot commands capture and query a static DOM snapshot of the current page. Unlike the real-time accessibility snapshot, a DOM snapshot is a full HTML capture that can be queried repeatedly with CSS selectors and X-SQL without re-fetching the page.

Use the spaced domsnapshot <subcommand> form:

browser4-cli domsnapshot
browser4-cli domsnapshot get text "#productTitle"
browser4-cli domsnapshot query --sql "SELECT dom_first_text(dom, '#title') AS title FROM dom(dom)"
browser4-cli domsnapshot export --file=page.html
Command overview
Command Description
domsnapshot Capture a static DOM snapshot of the current page
domsnapshot get <field> [selector] [name] Extract elements from the static DOM snapshot
domsnapshot query [url] --sql=<query> Run X-SQL against the DOM snapshot via the scrape API
domsnapshot export [--file=<path>] Save full snapshot HTML content to a local file
domsnapshot grep [OPTIONS] <pattern> Search snapshot HTML with regex and grep-style output
domsnapshot grep flags

Line numbers are shown by default (unlike GNU grep). Use --no-line-number to suppress them.

Flag Description
-i Case-insensitive matching
-A N, -B N, -C N Context lines after / before / around each match
-v Invert match (select non-matching lines)
-c Print only count of matching lines
-l Print only whether matches exist (prints "domsnapshot" if matches found; use with | grep -q domsnapshot for CI pass/fail)
-F Treat pattern as a literal string
-w Match whole words only
--no-line-number Suppress line numbers in output
--selector <CSS> Scope search to a CSS selector
# Basic search
browser4-cli domsnapshot grep -i error

# With context and literal match
browser4-cli domsnapshot grep -F -C 2 "404 Not Found"

# Count matches
browser4-cli domsnapshot grep -c "div"

# Search within a specific element
browser4-cli domsnapshot grep --selector main "Submit"
domsnapshot get fields
Field Description
text Extract text content from the matched element
html Extract inner HTML from the matched element
attr Extract an attribute value (requires name, e.g. attr "#link" "href")

selector defaults to :root when omitted. name (attribute name) is required for the attr field.

domsnapshot query with X-SQL
# Inline X-SQL:
browser4-cli domsnapshot query "https://example.com" --sql "
  SELECT
    dom_base_uri(dom) AS url,
    dom_first_text(dom, 'h1') AS title
  FROM dom(dom);
"

# From a file:
browser4-cli domsnapshot query --sql @query.sql
  • --sql is required. Prefix with @ to read from a file (e.g. --sql @query.sql).
  • url is optional and defaults to the current session's page URL.
  • These commands do not require an active swarm session.
Notes
  • domsnapshot commands are not supported inside batch mode.
  • Use domsnapshot get for quick element extraction; use domsnapshot query with X-SQL for structured multi-element extraction.
  • The DOM snapshot is cached in the backend until the next domsnapshot capture or session navigation.

Element References

The snapshot command returns an accessibility tree where every interactive node is labeled with a short identifier such as e15. Pass this identifier directly to commands like click, type, or press. You can also use plain CSS selectors (e.g. .my-button, #search-input).

State Persistence

browser4-cli persists CLI state between invocations under ~/.browser4 by default. Override the root directory with the BROWSER4_CLI_STATE_DIR environment variable.

  • Default session state: ~/.browser4/cli-state.json
  • Named session state (-s=<name>): ~/.browser4/sessions/<name>.json

Each state file stores the current Browser4 server URL plus session-scoped fields such as:

  • sessionId — active Browser4 session ID
  • baseUrl — Browser4 backend URL used by the CLI
  • activeSelector — last selector tracked for keyboard restore flows
  • lastMousePosition — last pointer coordinates tracked for mouse restore flows
Session state transitions
Command Behavior
open Creates a new session, or reuses an existing active one. Stale sessions are automatically refreshed.
open -s=<name> Same as open but scoped to a named session slot.
goto <url> Reuses the current session if active; otherwise opens a fresh one before navigating.
close Closes the current session (no-op if none active).
close-all / kill-all / stop Clears all persisted session state.

The list command shows each session's status: Active (backend confirms), Stale (backend has stopped it), or Unknown (backend unreachable).

Runtime Temp Files

browser4-cli keeps ephemeral runtime artifacts under the system temp directory:

  • Windows: %TEMP%\browser4\browser4-cli
  • Linux/macOS: ${TMPDIR:-/tmp}/browser4/browser4-cli

This temp subtree contains items such as:

  • startup logs for auto-started Browser4 servers
  • staged Maven wrapper launchers
  • Rust test scratch directories used by browser4-cli tests

Persistent CLI state remains under ~/.browser4 by default. The Browser4 runtime bundle (JRE, JARs, launchers) is stored separately in a platform-conventional data directory so that clearing CLI session state does not require re-downloading the ~200 MB runtime:

  • Linux: ~/.local/share/browser4/runtime/<version>/
  • macOS: ~/Library/Application Support/browser4/runtime/<version>/
  • Windows: %APPDATA%/browser4/runtime/<version>/

The current.tag file in the runtime/ directory records the active version. Override the runtime data root with the BROWSER4_RUNTIME_DIR environment variable. Downloaded archives are cached under the platform cache directory (~/.cache/browser4/downloads/ on Linux, ~/Library/Caches/browser4/downloads/ on macOS, %LOCALAPPDATA%/browser4/downloads/ on Windows).

Snapshots

After each command that modifies browser state, the CLI automatically:

  1. Retrieves the current page URL and title
  2. Captures an accessibility snapshot
  3. Saves the snapshot to .browser4-cli/snapshot/page-<timestamp>.yml
  4. Prints the snapshot path in Markdown link format

Examples

# Open a new browser window (defaults to headed)
browser4-cli open

# Open in headed or headless mode
browser4-cli open --headed https://browser4.io
browser4-cli open --headless https://browser4.io

# Open with a persistent profile
browser4-cli open --profile=~/.browser4/profiles/work

# Open with a temporary (single-use) profile
browser4-cli open --profile-mode=temporary https://browser4.io

# Open with interact-level control (FASTEST / FAST / DEFAULT)
browser4-cli open --interact-level=FAST https://browser4.io

# Navigate to a page — auto-opens a session if none is active
browser4-cli goto https://browser4.io

# Inspect the page — note the eN labels on interactive nodes
browser4-cli snapshot

# Include element bounding boxes in the snapshot
browser4-cli snapshot --boxes

# Show only interactive elements with compact output, depth-limited
browser4-cli snapshot -i -c -d 5

# Scope snapshot to a specific CSS selector
browser4-cli snapshot -s "#main-content"

# Raw output — no page info, just the snapshot content (useful for piping)
browser4-cli snapshot --raw | grep "button"

# Capture a static DOM snapshot and extract data
browser4-cli domsnapshot
browser4-cli domsnapshot get text "#productTitle"
browser4-cli domsnapshot get attr "#productTitle" "href"
browser4-cli domsnapshot export --file=page.html

# Interact using refs from the snapshot
browser4-cli click e15
browser4-cli type "Hello World" e15
browser4-cli press Enter e15
browser4-cli eval "document.title"
browser4-cli eval "element => element.textContent" e15
browser4-cli keydown Shift
browser4-cli mousemove 150 300
browser4-cli mousewheel 0 100
browser4-cli keyup Shift

# Take a screenshot and save it to disk
browser4-cli screenshot

# Take a full-page screenshot
browser4-cli screenshot --full-page

# Save the current page as a PDF
browser4-cli pdf --filename=page.pdf

# Inspect tab indices before switching tabs
browser4-cli tab-list
browser4-cli tab-select 1
browser4-cli tab-close 1

# Use a custom server URL
browser4-cli open --server http://localhost:9090

# Advanced: execute multiple commands in one process (batch mode)
# Batch mode only supports DOM operations. You must run `open` separately first.
browser4-cli open
browser4-cli batch "goto https://browser4.io" "snapshot"

# Advanced: stop on the first batch failure
browser4-cli batch --bail "goto https://browser4.io" "click e1" "screenshot"

# Advanced: batch mode for form filling (recommended use case)
browser4-cli batch "fill e1 'John Doe'" "fill e2 'john@example.com'" "click e3"

# Advanced: pipe batch commands as JSON via stdin
echo '[
  ["goto", "https://example.com/form-filling"],
  ["click", "#reset-btn"],
  ["fill", "#first-name", "Bob"],
  ["fill", "#last-name", "Smith"],
  ["fill", "#email", "bob@example.com"],
  ["select", "#country", "uk"],
  ["check", "#agree-terms"],
  ["click", "#submit-btn"]
]' | browser4-cli batch --json

# Close the session when done
browser4-cli close

# Close all sessions but keep the current Browser4 backend running
browser4-cli close-all

# Explicitly stop the Browser4 backend and clean up tracked Browser4 processes
browser4-cli kill-all

Architecture

The Rust CLI is structured as follows:

Module Purpose
main.rs Entry point, command dispatch, session management
args.rs CLI argument parsing (global flags, positional args, options)
commands.rs Command definitions mapping to MCP tool names and parameters
http.rs HTTP client for calling /mcp/call-tool
state.rs Persistent state management for the default state file and named session files under ~/.browser4/
daemon.rs Local server auto-start (prefer Maven from repo root, fall back to jar) and health checking
managed_processes.rs Registry for browser4 server processes
snapshot.rs Snapshot and screenshot file helpers
help.rs Help text generation

Testing

## Run all tests (unit + end-to-end):
cargo test

## Run only the end-to-end tests and print their output:
cargo test --test e2e -- --nocapture

## Run a specific end-to-end test scenario:
cargo test --test e2e -- --nocapture --scenario=test_e2e_batch_form_submission

Publishing the CLI package

For maintainers, the CLI package now uses an npm version guard before publish.

The GitHub release workflow publishes the npm package via npm trusted publishing (GitHub Actions OIDC) instead of NODE_AUTH_TOKEN. This avoids CI failures caused by npm one-time-password challenges (EOTP).

  • Local release entrypoint: npm run release
  • Direct guarded publish entrypoint: npm run publish:if-needed
  • GitHub release workflow: re-checks npm immediately before the publish step

If the local version in cli/package.json already matches the version currently published on npm, the publish step is skipped automatically.

Examples:

# Check whether npm publish is needed
node scripts/check-npm-publish-needed.js --json

# Publish only when the local version differs from npm
npm run publish:if-needed

# Standard maintainer release command (also guarded)
npm run release

For local testing, you can override the detected remote version:

BROWSER4_CLI_NPM_REMOTE_VERSION=0.1.7 node scripts/check-npm-publish-needed.js --json
BROWSER4_CLI_NPM_REMOTE_VERSION=0.1.7 node scripts/publish-if-needed.js --dry-run

License

Apache-2.0

Keywords