0.1.25 • Published yesterdayCLI

browser4-cli

Licence

Apache-2.0

Version

0.1.25

Deps

Size

74.3 MB

Vulns

Weekly

1.1K

Install scriptsThis package runs scripts during installation (preinstall/install/postinstall)

Summary Dependency Versions

Browser4

Make websites accessible for AI agents. Automate tasks online with ease.

Installation

Global Installation (recommended)

Installs the native Rust binary:

npm install -g browser4-cli

After installation, use browser4-cli. The shorter browser4 command remains available as a compatibility alias.

Project Installation (local dependency)

For projects that want to pin the version in package.json:

npm install browser4-cli

Then use via package.json scripts or by invoking browser4-cli directly.

Standalone Installer Scripts (no npm needed)

Bootstrap the native binary directly with a single command:

Windows (PowerShell):

Invoke-WebRequest -Uri "https://browser4.oss-cn-beijing.aliyuncs.com/scripts/install-browser4-cli.ps1" -OutFile "$env:TEMP\install-browser4-cli.ps1"
powershell -ExecutionPolicy Bypass -File "$env:TEMP\install-browser4-cli.ps1"

Linux / macOS (bash):

curl -fsSL https://browser4.oss-cn-beijing.aliyuncs.com/scripts/install-browser4-cli.sh | bash

Or run the scripts locally from a cloned repo:

cli/scripts/install-browser4-cli.ps1 (Windows)
cli/scripts/install-browser4-cli.sh (Linux / macOS / Git Bash)

See Standalone CLI Installer Scripts for the full option reference, supported platforms, and examples.

From Source

Build prerequisites: Rust (stable, edition 2021), Node.js 24+, pnpm 10+, git.

git clone https://github.com/platonai/Browser4.git
cd Browser4/cli/browser4-cli
pnpm install
pnpm build:native   # Compiles the Rust binary (requires https://rustup.rs)
pnpm link --global  # Makes browser4-cli available globally

Cross-compilation from Linux (for release builds) additionally requires: cargo-zigbuild, Zig 0.13.0, gcc-aarch64-linux-gnu, and mingw-w64. See cli/docker/Dockerfile.build for a Dockerized build environment.

Requirements

Chrome — Latest Chrome installed on your system. The CLI can auto-install Chrome on most platforms when missing.
Java 17+ — Required to run the Browser4 backend (Browser4.jar). Eclipse Temurin recommended. JDK 21+ enables best jlink compression when the CLI auto-builds a runtime bundle from source.
Rust — Only needed when building the CLI from source (see From Source above). The stable toolchain (edition 2021) is sufficient.

Additional requirements for auto-building a runtime bundle from source

When the CLI detects a Browser4 repository checkout, it attempts to build a self-contained runtime bundle (bundled JRE + dependency JARs) from source instead of downloading a pre-built release. This requires:

Tool	Version	Linux	macOS	Windows
Maven	3.9+	via `mvnw` wrapper	via `mvnw` wrapper	via `mvnw.cmd` wrapper
JDK tools (`jdeps`, `jlink`)	bundled with JDK 16+	included in JDK	included in JDK	included in JDK
PowerShell 7 (`pwsh`)	7.0+	required	required	built-in (`powershell.exe`)
tar	any	required	required	built-in

Set BROWSER4_CLI_FORCE_REMOTE_BUNDLE=1 to skip the local build and always download a pre-built bundle — useful in CI / corporate environments where Maven or jlink are unavailable.

Usage

browser4-cli <command> [args] [options]
browser4-cli -s=<session> <command> [args] [options]

Global options

Flag	Description
`-h`, `--help [command]`	Print help (optionally for a specific command)
`-v`, `--version`	Print version
`-s=<name>`	Named session label
`--server=<url>`	Override Browser4 server URL
`--proxy=<url>`	Manual HTTP proxy override for downloads
`--json`	Emit machine-parseable JSON to stdout
`-q`, `--quiet`	Suppress normal output, only show errors

--json switches every command's stdout from human-readable text to a single-line JSON envelope ({"status":"ok","command":"<name>","output":{...}}). Omit --json for the default human-readable output.

--proxy=<url> sets a manual HTTP proxy used only for downloading the Browser4 runtime bundle. It does not affect browser traffic.

-q / --quiet suppresses all normal stdout output. Errors and progress messages still go to stderr. Combine with --json for silent-on-success scripting: browser4-cli --json -q open.

Sessions are persisted independently per name. Omitting -s uses the default session (~/.browser4/cli-state.json). With -s=<name>, a separate state file is stored under ~/.browser4/sessions/<name>.json. open without -s reuses the default session if one exists; with -s=<name> it switches to or creates the named session.

Commands

The tables below mirror the commands surfaced by the global browser4-cli help overview.

Core

Command	Description
`open [url]`	Open or switch to a browser session. Supports `--headed` (force visible window), `--headless` (force headless), `--profile=<path>`, `--profile-mode=<mode>` (temporary / sequential / default), and `--interact-level=<level>` (FASTEST / FAST / DEFAULT).
`close`	Close the active session
`goto <url>`	Navigate to a URL, auto-opening or refreshing the session if needed
`click <ref> [button]`	Click an element. Supports `--modifiers` (modifier keys to press).
`dblclick <ref> [button]`	Double-click an element. Supports `--modifiers` (modifier keys to press).
`type <text> [ref]`	Type text into the focused element or an optional target element. Supports `--submit` (press Enter after typing).
`fill <ref> <text>`	Fill text into an editable element. Supports `--submit` (press Enter after filling).
`hover <ref>`	Hover over an element
`select <ref> <val>`	Select an option in a dropdown
`upload <ref> <file>`	Upload a file
`check <ref>`	Check a checkbox or radio button
`uncheck <ref>`	Uncheck a checkbox or radio button
`drag <startRef> <endRef>`	Drag and drop between two elements
`snapshot`	Capture accessibility snapshot. Supports `--filename=<path>` to save to file, `--boxes` for bounding boxes, `-i`/`--interactive` for interactive-only, `-u`/`--urls` for link URLs, `-c`/`--compact` for compact output, `-d`/`--depth <n>` for depth limiting, `-s`/`--selector <sel>` for CSS scoping, and `--raw` for raw output without page info.
`eval <expression> [ref]`	Evaluate JavaScript on the page or a target element. Use `--file=<path>` to read the expression from a file.
`get <mode> <selector> [name]`	Extract data from a page element. Modes: `text`, `html`, `box`, `styles`, `property`, `attr`. `name` is required for `property` and `attr`.
`scroll <direction> <pixels>`	Scroll the page. Direction: `up`, `down`, `left`, or `right`.
`wait [target]`	Wait for a condition. Positional: selector or milliseconds. Options: `--text=<text>`, `--url=<glob>`, `--load=<state>` (networkidle/domcontentloaded), `--fn=<JS expr>`.
`dialog-accept [prompt]`	Accept a dialog
`dialog-dismiss`	Dismiss a dialog
`resize <w> <h>`	Resize the browser window
`delete-data`	Delete session data

Command	Description
`go-back`	Go back to the previous page
`go-forward`	Go forward to the next page
`reload`	Reload the current page

Keyboard

Command	Description
`press <key> [ref]`	Press a key on the focused element or an optional target element
`keydown <key>`	Press and hold a key
`keyup <key>`	Release a key

Mouse

Command	Description
`mousemove <x> <y>`	Move mouse to coordinates
`mousedown [button]`	Press mouse button
`mouseup [button]`	Release mouse button
`mousewheel <dx> <dy>`	Scroll the mouse wheel

Screenshots

Command	Description
`screenshot [ref]`	Take a screenshot (optionally of a specific element). Supports `--filename=<path>` and `--full-page` (full scrollable page).
`pdf`	Save the current page as a PDF. Supports `--filename=<path>`.

Tabs

Command	Description
`tab-list`	List all tabs with their zero-based index
`tab-new [url]`	Create a new tab
`tab-close [index]`	Close a tab by its zero-based index
`tab-select <index>`	Select a tab by its zero-based index

Run tab-list first to find each tab's zero-based index. Pass that index to tab-select or tab-close.

Browser storage

Command	Description
`state-save <path>`	Save cookies and localStorage to a JSON file
`state-load <path>`	Restore cookies and localStorage from a saved state file
`cookie-list`	List all cookies (optionally filtered by `--domain` / `--path`)
`cookie-get <name>`	Get a cookie by name
`cookie-set <name> <value>`	Set a cookie. Supports `--path`, `--domain`, `--expires`, `--httpOnly`, `--secure`, `--sameSite` (Strict / Lax / None).
`cookie-delete <name>`	Delete a cookie by name
`cookie-clear`	Clear all cookies for the current page
`localstorage-list`	List all localStorage entries
`localstorage-get <key>`	Get a localStorage value by key
`localstorage-set <key> <value>`	Set a localStorage key-value pair
`localstorage-delete <key>`	Delete a localStorage key
`localstorage-clear`	Clear all localStorage entries
`sessionstorage-list`	List all sessionStorage entries
`sessionstorage-get <key>`	Get a sessionStorage value by key
`sessionstorage-set <key> <value>`	Set a sessionStorage key-value pair
`sessionstorage-delete <key>`	Delete a sessionStorage key
`sessionstorage-clear`	Clear all sessionStorage entries

Browser sessions

Command	Description
`list`	List browser sessions. Supports `--all` (list sessions across all workspaces).
`close-all`	Close all browser sessions without stopping `Browser4.jar` / the Browser4 backend
`kill-all`	Forcefully stop `Browser4.jar` / the Browser4 backend and kill Browser4 browser processes

Use close-all for session cleanup when you want to keep the current Browser4 service running. Use kill-all only when you explicitly want to stop the backend and clean up tracked Browser4 processes.

Server management

Command	Description
`install`	Download the Browser4 runtime bundle. Supports `--tag=<version>` to pin a release and `--force` to reinstall even when already present.
`upgrade`	Upgrade the Browser4 runtime bundle to the latest version or a specified `--tag`
`uninstall`	Remove globally installed browser4-cli (npm) and its runtime data. Supports `-y` / `--yes` (skip confirmation) and `--dry-run` (preview).
`stop`	Kill the Browser4 backend after closing all sessions
`status`	Check whether the Browser4 backend is reachable and healthy

install and upgrade both manage the Browser4 runtime bundle — a self-contained distribution that includes all dependency jars, a minimal jlink-built JRE, and platform launcher scripts. Neither requires cargo or a Rust toolchain; the runtime is a Java application downloaded from GitHub Releases.

Use --tag=<version> to pin a specific release (e.g. --tag=v4.9.3). Use --force to reinstall even when the same version is already present.

When a local Browser4 checkout is detected with the browser4-bundle module present, install and upgrade auto-build the runtime bundle from source (via Maven) instead of downloading.

uninstall attempts to remove browser4-cli from npm global packages and deletes the Browser4 runtime data and cache directories. It prompts for confirmation unless -y / --yes is passed. Use --dry-run to preview what would be removed without making changes. Does not require a running server.

browser4-cli uninstall
browser4-cli uninstall -y
browser4-cli uninstall --dry-run

Advanced commands

These commands are intentionally omitted from the global browser4-cli help overview. Query browser4-cli help <command> for the exact syntax when you need them.

Command	Description
`batch [command...]`	Execute multiple commands in one invocation. Only DOM operations are supported (Core, Navigation, Keyboard, Mouse, Export, Tabs categories). Commands like `open`, `close`, `list`, `agent run`, etc. are not allowed in batch mode.
`console [min-level]`	List console messages
`extract <instruction>`	Extract structured data from the current page. Supports `--schema` (JSON Schema for typed output).
`summarize [instruction]`	Summarize page content using AI
`agent run <task>`	Run an autonomous agent task
`agent status <id>`	Check the status of a running agent task
`agent result <id>`	Get the result of a completed agent task
`swarm create`	Create a swarm scrape session with parallel browser contexts
`swarm submit [url]`	Submit URL(s) or raw X-SQL payloads as scrape jobs
`swarm query <url>`	Run an X-SQL query against a loaded webpage
`swarm status <id>`	Check the status of a scrape or query job
`swarm result <id>`	Get the result of a completed job
`crawl <url>`	Crawl a website from a seed URL, following links up to a configurable depth

Agent task workflow (`agent <subcommand>`)

The agent-* commands wrap the backend command agent's asynchronous task API. They are useful when you want Browser4 to plan and execute a natural-language task in the background instead of issuing one low-level browser action at a time.

Like other advanced commands, they are intentionally omitted from the global browser4-cli help overview. Query browser4-cli help agent run (or agent status / agent result) when you need the exact syntax.

Use the spaced agent <subcommand> form:

browser4-cli agent run "Open browser4.io and summarize the hero section"
browser4-cli agent status agent-task-1
browser4-cli agent result agent-task-1

Command lifecycle

Step	Command	What it does
1	`agent run <task>`	Submits an asynchronous natural-language task through `command_run` and prints the returned task ID
2	`agent status <id>`	Fetches the latest task status payload through `command_status`
3	`agent result <id>`	Fetches the completed task result payload through `command_result`

Notes

agent run is asynchronous: it returns immediately after the backend accepts the task and prints a follow-up agent status command with the generated task ID.
agent status prints the backend status payload as-is. In practice this is a JSON object that commonly includes fields such as id, status, statusCode, processState, message, agentState, agentHistory, and commandResult.
agent result prints the backend result payload as-is. Depending on the task, it may be plain text or structured JSON.
These commands are task-ID based and do not require an active CLI browser session slot. The global -s=<name> option is therefore usually not relevant for agent-* follow-up calls.
agent subcommands are not supported inside batch mode.
agent run performs a short post-submit status probe so obvious missing-LLM configuration failures can be surfaced immediately instead of leaving you with a task ID that will never succeed.

Use cases

1. Submit an autonomous agent task

browser4-cli agent run "Open browser4.io and summarize the hero section"

Typical output:

Task submitted: agent-task-1
Use 'browser4-cli agent status agent-task-1' to check progress.

2. Poll task progress

browser4-cli agent status agent-task-1

Example status payload:

{"id":"agent-task-1","status":"RUNNING"}

On a real Browser4 backend the payload can be richer and may include lifecycle details such as processState, agent history snapshots, or an embedded partial commandResult.

3. Read the final result

browser4-cli agent result agent-task-1

If the backend returns a structured CommandResult, expect fields such as summary, pageSummary, fields, links, or xsqlResultSet.

Swarm scrape workflow (`swarm <subcommand>`)

The swarm subcommands support a swarm scrape workflow where one CLI session coordinates multiple browser contexts in the Browser4 backend.

Command overview

Command	Purpose	Backend endpoint
`swarm create`	Create a swarm scrape session	`POST /api/swarm`
`swarm submit <url>`	Scrape URLs or submit raw X-SQL	`POST /api/swarm/submit`
`swarm query <url>`	Run X-SQL queries against loaded pages	`POST /api/swarm/query`
`swarm status <id>`	Poll job status	`GET /api/swarm/{id}/status`
`swarm result <id>`	Fetch completed job result	`GET /api/swarm/{id}/result`
`crawl <url>`	Start a website crawl	`POST /api/crawl`
(poll)	Track crawl progress	`GET /api/crawl/{id}/result`

URL scraping with `swarm submit`

# create a session
browser4-cli swarm create \
  --profile-mode=TEMPORARY \
  --max-open-tabs=12 \
  --max-browser-contexts=3 \
  --display-mode=HEADLESS

# submit URLs as scrape jobs
browser4-cli swarm submit https://example.com/direct \
  --seed-file=./swarm-seeds.txt \
  --deadline=2026-03-30T00:00:00Z \
  --expires=1d \
  --refresh --store-content

# poll and fetch the result
browser4-cli swarm status scrape-task-4
browser4-cli swarm result scrape-task-4

X-SQL queries with `swarm query`

Run structured X-SQL queries against loaded webpages to extract data.

# Inline query:
browser4-cli swarm query "https://www.amazon.com/dp/B08PP5MSVB" --sql "
  SELECT
    dom_base_uri(dom) AS url,
    dom_first_text(dom, '#productTitle') AS title,
    dom_first_slim_html(dom, 'img:expr(width > 400)') AS img
  FROM load_and_select(@url, 'body');
"

# From a file:
browser4-cli swarm query "https://www.amazon.com/dp/B08PP5MSVB" --sql @query.sql

# With seed file and load options:
browser4-cli swarm query --sql @query.sql --seed-file=./urls.txt --refresh

Notes

swarm create accepts backend capability hints: --profile-mode, --max-open-tabs, --max-browser-contexts, --display-mode.
swarm submit and swarm query both accept a positional URL, --seed-file, or both. Seed files use one URL per line; # comments and blank lines are ignored.
Both commands support load-option flags: --deadline, --expires, --refresh, --parse, --store-content.
swarm query --sql is required; swarm submit --sql also works as a convenience. Use @url in the X-SQL template; it is replaced with the target URL server-side.
Prefix the --sql value with @ to read from a file (e.g. --sql @query.sql).
All commands return a task ID; use swarm status / swarm result to track progress.

Crawl (`crawl`)

Recursive website crawling from a seed URL.

Command overview

Command	Purpose	Backend endpoint
`crawl <url>`	Crawl a website from a seed URL	`POST /api/crawl`
(poll)	Track crawl progress	`GET /api/crawl/{id}/result`

Basic crawling

# depth=1: extract all links from homepage, load each linked page
browser4-cli crawl "https://example.com" --out-link-selector "a[href]"

# depth=2: follow links two levels deep, filter by pattern
browser4-cli crawl "https://shop.example.com" \
  --depth 2 \
  --out-link-selector "a.product-link" \
  --out-link-pattern "/product/" \
  --top-links 10

# with LoadOptions passthrough
browser4-cli crawl "https://example.com" -ol "a[href]" -a "-refresh -nMaxRetry 5"

Key flags

Flag	Default	Description
`-d`, `--depth`	`1`	Maximum crawl depth
`-ol`, `--out-link-selector`	—	CSS selector to extract links from each page
`-olp`, `--out-link-pattern`	`.+`	Regex pattern to filter extracted links
`-tl`, `--top-links`	`20`	Maximum links to extract per page
`-a`, `--args`	—	Additional LoadOptions passthrough
`--refresh`	—	Force a fresh fetch, ignoring cache
`--parse`	—	Parse each page immediately after fetching
`--expires`	—	Cache expiration duration
`--store-content`	—	Persist page content to storage
`--page-load-timeout`	—	Maximum time to wait for page load
`--readonly`	—	Non-destructive mode

Notes

All LoadOptions flags available via --args passthrough (e.g. -a "-nMaxRetry 5 -lazyFlush").
Depth=1 reuses PulsarSession.submitForOutPages; depth>1 uses BFS continuous crawl with visited-URL dedup.
CLI timeout: 600s default, configurable via BROWSER4_CLI_CRAWL_TIMEOUT_SECS.
Duplicate URLs are skipped (normalized: lowercase, no trailing slash, no query string).

DOM Snapshot commands (`domsnapshot <subcommand>`)

The domsnapshot commands capture and query a static DOM snapshot of the current page. Unlike the real-time accessibility snapshot, a DOM snapshot is a full HTML capture that can be queried repeatedly with CSS selectors and X-SQL without re-fetching the page.

Use the spaced domsnapshot <subcommand> form:

browser4-cli domsnapshot
browser4-cli domsnapshot get text "#productTitle"
browser4-cli domsnapshot query --sql "SELECT dom_first_text(dom, '#title') AS title FROM dom(dom)"
browser4-cli domsnapshot export --file=page.html

Command overview

Command	Description
`domsnapshot`	Capture a static DOM snapshot of the current page
`domsnapshot get <field> [selector] [name]`	Extract elements from the static DOM snapshot
`domsnapshot query [url] --sql=<query>`	Run X-SQL against the DOM snapshot via the scrape API
`domsnapshot export [--file=<path>]`	Save full snapshot HTML content to a local file
`domsnapshot grep [OPTIONS] <pattern>`	Search snapshot HTML with regex and grep-style output

`domsnapshot grep` flags

Line numbers are shown by default (unlike GNU grep). Use --no-line-number to suppress them.

Flag	Description
`-i`	Case-insensitive matching
`-A N`, `-B N`, `-C N`	Context lines after / before / around each match
`-v`	Invert match (select non-matching lines)
`-c`	Print only count of matching lines
`-l`	Print only whether matches exist (prints "domsnapshot" if matches found; use with `\| grep -q domsnapshot` for CI pass/fail)
`-F`	Treat pattern as a literal string
`-w`	Match whole words only
`--no-line-number`	Suppress line numbers in output
`--selector <CSS>`	Scope search to a CSS selector

# Basic search
browser4-cli domsnapshot grep -i error

# With context and literal match
browser4-cli domsnapshot grep -F -C 2 "404 Not Found"

# Count matches
browser4-cli domsnapshot grep -c "div"

# Search within a specific element
browser4-cli domsnapshot grep --selector main "Submit"

`domsnapshot get` fields

Field	Description
`text`	Extract text content from the matched element
`html`	Extract inner HTML from the matched element
`attr`	Extract an attribute value (requires `name`, e.g. `attr "#link" "href"`)

selector defaults to :root when omitted. name (attribute name) is required for the attr field.

`domsnapshot query` with X-SQL

# Inline X-SQL:
browser4-cli domsnapshot query "https://example.com" --sql "
  SELECT
    dom_base_uri(dom) AS url,
    dom_first_text(dom, 'h1') AS title
  FROM dom(dom);
"

# From a file:
browser4-cli domsnapshot query --sql @query.sql

--sql is required. Prefix with @ to read from a file (e.g. --sql @query.sql).
url is optional and defaults to the current session's page URL.
These commands do not require an active swarm session.

Notes

domsnapshot commands are not supported inside batch mode.
Use domsnapshot get for quick element extraction; use domsnapshot query with X-SQL for structured multi-element extraction.
The DOM snapshot is cached in the backend until the next domsnapshot capture or session navigation.

Element References

The snapshot command returns an accessibility tree where every interactive node is labeled with a short identifier such as e15. Pass this identifier directly to commands like click, type, or press. You can also use plain CSS selectors (e.g. .my-button, #search-input).

State Persistence

browser4-cli persists CLI state between invocations under ~/.browser4 by default. Override the root directory with the BROWSER4_CLI_STATE_DIR environment variable.

Default session state: ~/.browser4/cli-state.json
Named session state (-s=<name>): ~/.browser4/sessions/<name>.json

Each state file stores the current Browser4 server URL plus session-scoped fields such as:

sessionId — active Browser4 session ID
baseUrl — Browser4 backend URL used by the CLI
activeSelector — last selector tracked for keyboard restore flows
lastMousePosition — last pointer coordinates tracked for mouse restore flows

Session state transitions

Command	Behavior
`open`	Creates a new session, or reuses an existing active one. Stale sessions are automatically refreshed.
`open -s=<name>`	Same as `open` but scoped to a named session slot.
`goto <url>`	Reuses the current session if active; otherwise opens a fresh one before navigating.
`close`	Closes the current session (no-op if none active).
`close-all` / `kill-all` / `stop`	Clears all persisted session state.

The list command shows each session's status: Active (backend confirms), Stale (backend has stopped it), or Unknown (backend unreachable).

Runtime Temp Files

browser4-cli keeps ephemeral runtime artifacts under the system temp directory:

Windows: %TEMP%\browser4\browser4-cli
Linux/macOS: ${TMPDIR:-/tmp}/browser4/browser4-cli

This temp subtree contains items such as:

startup logs for auto-started Browser4 servers
staged Maven wrapper launchers
Rust test scratch directories used by browser4-cli tests

Persistent CLI state remains under ~/.browser4 by default. The Browser4 runtime bundle (JRE, JARs, launchers) is stored separately in a platform-conventional data directory so that clearing CLI session state does not require re-downloading the ~200 MB runtime:

Linux: ~/.local/share/browser4/runtime/<version>/
macOS: ~/Library/Application Support/browser4/runtime/<version>/
Windows: %APPDATA%/browser4/runtime/<version>/

The current.tag file in the runtime/ directory records the active version. Override the runtime data root with the BROWSER4_RUNTIME_DIR environment variable. Downloaded archives are cached under the platform cache directory (~/.cache/browser4/downloads/ on Linux, ~/Library/Caches/browser4/downloads/ on macOS, %LOCALAPPDATA%/browser4/downloads/ on Windows).

Snapshots

After each command that modifies browser state, the CLI automatically:

Retrieves the current page URL and title
Captures an accessibility snapshot
Saves the snapshot to .browser4-cli/snapshot/page-<timestamp>.yml
Prints the snapshot path in Markdown link format

Examples

# Open a new browser window (defaults to headed)
browser4-cli open

# Open in headed or headless mode
browser4-cli open --headed https://browser4.io
browser4-cli open --headless https://browser4.io

# Open with a persistent profile
browser4-cli open --profile=~/.browser4/profiles/work

# Open with a temporary (single-use) profile
browser4-cli open --profile-mode=temporary https://browser4.io

# Open with interact-level control (FASTEST / FAST / DEFAULT)
browser4-cli open --interact-level=FAST https://browser4.io

# Navigate to a page — auto-opens a session if none is active
browser4-cli goto https://browser4.io

# Inspect the page — note the eN labels on interactive nodes
browser4-cli snapshot

# Include element bounding boxes in the snapshot
browser4-cli snapshot --boxes

# Show only interactive elements with compact output, depth-limited
browser4-cli snapshot -i -c -d 5

# Scope snapshot to a specific CSS selector
browser4-cli snapshot -s "#main-content"

# Raw output — no page info, just the snapshot content (useful for piping)
browser4-cli snapshot --raw | grep "button"

# Capture a static DOM snapshot and extract data
browser4-cli domsnapshot
browser4-cli domsnapshot get text "#productTitle"
browser4-cli domsnapshot get attr "#productTitle" "href"
browser4-cli domsnapshot export --file=page.html

# Interact using refs from the snapshot
browser4-cli click e15
browser4-cli type "Hello World" e15
browser4-cli press Enter e15
browser4-cli eval "document.title"
browser4-cli eval "element => element.textContent" e15
browser4-cli keydown Shift
browser4-cli mousemove 150 300
browser4-cli mousewheel 0 100
browser4-cli keyup Shift

# Take a screenshot and save it to disk
browser4-cli screenshot

# Take a full-page screenshot
browser4-cli screenshot --full-page

# Save the current page as a PDF
browser4-cli pdf --filename=page.pdf

# Inspect tab indices before switching tabs
browser4-cli tab-list
browser4-cli tab-select 1
browser4-cli tab-close 1

# Use a custom server URL
browser4-cli open --server http://localhost:9090

# Advanced: execute multiple commands in one process (batch mode)
# Batch mode only supports DOM operations. You must run `open` separately first.
browser4-cli open
browser4-cli batch "goto https://browser4.io" "snapshot"

# Advanced: stop on the first batch failure
browser4-cli batch --bail "goto https://browser4.io" "click e1" "screenshot"

# Advanced: batch mode for form filling (recommended use case)
browser4-cli batch "fill e1 'John Doe'" "fill e2 'john@example.com'" "click e3"

# Advanced: pipe batch commands as JSON via stdin
echo '[
  ["goto", "https://example.com/form-filling"],
  ["click", "#reset-btn"],
  ["fill", "#first-name", "Bob"],
  ["fill", "#last-name", "Smith"],
  ["fill", "#email", "bob@example.com"],
  ["select", "#country", "uk"],
  ["check", "#agree-terms"],
  ["click", "#submit-btn"]
]' | browser4-cli batch --json

# Close the session when done
browser4-cli close

# Close all sessions but keep the current Browser4 backend running
browser4-cli close-all

# Explicitly stop the Browser4 backend and clean up tracked Browser4 processes
browser4-cli kill-all

Architecture

The Rust CLI is structured as follows:

Module	Purpose
`main.rs`	Entry point, command dispatch, session management
`args.rs`	CLI argument parsing (global flags, positional args, options)
`commands.rs`	Command definitions mapping to MCP tool names and parameters
`http.rs`	HTTP client for calling `/mcp/call-tool`
`state.rs`	Persistent state management for the default state file and named session files under `~/.browser4/`
`daemon.rs`	Local server auto-start (prefer Maven from repo root, fall back to jar) and health checking
`managed_processes.rs`	Registry for browser4 server processes
`snapshot.rs`	Snapshot and screenshot file helpers
`help.rs`	Help text generation

Testing

## Run all tests (unit + end-to-end):
cargo test

## Run only the end-to-end tests and print their output:
cargo test --test e2e -- --nocapture

## Run a specific end-to-end test scenario:
cargo test --test e2e -- --nocapture --scenario=test_e2e_batch_form_submission

Publishing the CLI package

For maintainers, the CLI package now uses an npm version guard before publish.

The GitHub release workflow publishes the npm package via npm trusted publishing (GitHub Actions OIDC) instead of NODE_AUTH_TOKEN. This avoids CI failures caused by npm one-time-password challenges (EOTP).

Local release entrypoint: npm run release
Direct guarded publish entrypoint: npm run publish:if-needed
GitHub release workflow: re-checks npm immediately before the publish step

If the local version in cli/package.json already matches the version currently published on npm, the publish step is skipped automatically.

Examples:

# Check whether npm publish is needed
node scripts/check-npm-publish-needed.js --json

# Publish only when the local version differs from npm
npm run publish:if-needed

# Standard maintainer release command (also guarded)
npm run release

For local testing, you can override the detected remote version:

BROWSER4_CLI_NPM_REMOTE_VERSION=0.1.7 node scripts/check-npm-publish-needed.js --json
BROWSER4_CLI_NPM_REMOTE_VERSION=0.1.7 node scripts/publish-if-needed.js --dry-run

License

Apache-2.0

Keywords

browser browser4 browser4-cli automation headless chrome cdp cli agent