Browser4
Make websites accessible for AI agents. Automate tasks online with ease.
Installation
Global Installation (recommended)
Installs the native Rust binary:
npm install -g browser4-cli
After installation, use browser4-cli. The shorter browser4 command remains
available as a compatibility alias.
Project Installation (local dependency)
For projects that want to pin the version in package.json:
npm install browser4-cli
Then use it via package.json scripts or by invoking browser4-cli directly.
Standalone Installer Scripts (no npm needed)
Bootstrap the native binary directly with a single command:
Windows (PowerShell):
Invoke-WebRequest -Uri "https://browser4.oss-cn-beijing.aliyuncs.com/scripts/install-browser4-cli.ps1" -OutFile "$env:TEMP\install-browser4-cli.ps1"
powershell -ExecutionPolicy Bypass -File "$env:TEMP\install-browser4-cli.ps1"
Linux / macOS (bash):
curl -fsSL https://browser4.oss-cn-beijing.aliyuncs.com/scripts/install-browser4-cli.sh | bash
Or run the scripts locally from a cloned repo:
- cli/scripts/install-browser4-cli.ps1 (Windows)
- cli/scripts/install-browser4-cli.sh (Linux / macOS / Git Bash)
See Standalone CLI Installer Scripts for the full option reference, supported platforms, and examples.
From Source
Build prerequisites: Rust (stable, edition 2021), Node.js 24+, pnpm 10+, git.
git clone https://github.com/platonai/Browser4.git
cd Browser4/cli/browser4-cli
pnpm install
pnpm build:native # Compiles the Rust binary (requires https://rustup.rs)
pnpm link --global # Makes browser4-cli available globally
Cross-compilation from Linux (for release builds) additionally requires:
cargo-zigbuild, Zig 0.13.0, gcc-aarch64-linux-gnu, and mingw-w64.
See cli/docker/Dockerfile.build for a Dockerized build environment.
Requirements
- Chrome: Latest Chrome installed on your system. The CLI can auto-install Chrome on most platforms when missing.
- Java 17+: Required to run the Browser4 backend (Browser4.jar). Eclipse Temurin recommended. JDK 21+ enables best jlink compression when the CLI auto-builds a runtime bundle from source.
- Rust: Only needed when building the CLI from source (see From Source above). The stable toolchain (edition 2021) is sufficient.
Additional requirements for auto-building a runtime bundle from source
When the CLI detects a Browser4 repository checkout, it attempts to build a self-contained runtime bundle (bundled JRE + dependency JARs) from source instead of downloading a pre-built release. This requires:
| Tool | Version | Linux | macOS | Windows |
|---|---|---|---|---|
| Maven | 3.9+ | via mvnw wrapper | via mvnw wrapper | via mvnw.cmd wrapper |
| JDK tools (jdeps, jlink) | bundled with JDK 16+ | included in JDK | included in JDK | included in JDK |
| PowerShell 7 (pwsh) | 7.0+ | required | required | built-in (powershell.exe) |
| tar | any | required | required | built-in |
Set BROWSER4_CLI_FORCE_REMOTE_BUNDLE=1 to skip the local build and always
download a pre-built bundle — useful in CI / corporate environments where
Maven or jlink are unavailable.
Usage
browser4-cli <command> [args] [options]
browser4-cli -s=<session> <command> [args] [options]
Global options
| Flag | Description |
|---|---|
| -h, --help [command] | Print help (optionally for a specific command) |
| -v, --version | Print version |
| -s=<name> | Named session label |
| --server=<url> | Override Browser4 server URL |
| --proxy=<url> | Manual HTTP proxy override for downloads |
| --json | Emit machine-parseable JSON to stdout |
| -q, --quiet | Suppress normal output, only show errors |
--json switches every command's stdout from human-readable text to a
single-line JSON envelope ({"status":"ok","command":"<name>","output":{...}}).
Omit --json for the default human-readable output.
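For scripting, the envelope's status field is the natural thing to branch on. The helper below is a minimal sketch and not part of browser4-cli: envelope_ok is a hypothetical name, and the check assumes the compact single-line layout shown above, with status as the first key. Status values other than ok are not documented here, so the helper only separates ok from everything else.

```shell
# Succeed (exit 0) when a --json envelope reports "status":"ok".
# Assumption: compact single-line JSON with "status" as the first key,
# matching the example envelope; this is a string check, not a JSON parser.
envelope_ok() {
  printf '%s\n' "${1:-}" | grep -q '^{"status":"ok",'
}
```

A typical call would capture the output of a command run with --json and pass it as the first argument. For anything beyond a smoke check, a real JSON parser such as jq is the safer choice.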
--proxy=<url> sets a manual HTTP proxy used only for downloading the
Browser4 runtime bundle. It does not affect browser traffic.
-q / --quiet suppresses all normal stdout output. Errors and
progress messages still go to stderr. Combine with --json for
silent-on-success scripting: browser4-cli --json -q open.
Sessions are persisted independently per name. Omitting -s uses the
default session (~/.browser4/cli-state.json). With -s=<name>, a
separate state file is stored under ~/.browser4/sessions/<name>.json.
open without -s reuses the default session if one exists; with
-s=<name> it switches to or creates the named session.
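The path rules above can be sketched as a small shell helper, which may be handy when inspecting or backing up state files. state_file is a hypothetical name used only for illustration; the sketch also honors the BROWSER4_CLI_STATE_DIR override described under State Persistence, and it only computes a path without reading or validating the file.

```shell
# Print the state file path for a session name (no argument = default session).
# Paths follow the layout documented above; BROWSER4_CLI_STATE_DIR, when set,
# replaces ~/.browser4 as the root directory.
state_file() {
  root="${BROWSER4_CLI_STATE_DIR:-$HOME/.browser4}"
  if [ -z "${1:-}" ]; then
    printf '%s\n' "$root/cli-state.json"
  else
    printf '%s\n' "$root/sessions/$1.json"
  fi
}
```

For example, state_file work prints the path that -s=work would use.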
Commands
The tables below mirror the commands surfaced by the global browser4-cli help overview.
Core
| Command | Description |
|---|---|
| open [url] | Open or switch to a browser session. Supports --headed (force visible window), --headless (force headless), --profile=<path>, --profile-mode=<mode> (temporary / sequential / default), and --interact-level=<level> (FASTEST / FAST / DEFAULT). |
| close | Close the active session |
| goto <url> | Navigate to a URL, auto-opening or refreshing the session if needed |
| click <ref> [button] | Click an element. Supports --modifiers (modifier keys to press). |
| dblclick <ref> [button] | Double-click an element. Supports --modifiers (modifier keys to press). |
| type <text> [ref] | Type text into the focused element or an optional target element. Supports --submit (press Enter after typing). |
| fill <ref> <text> | Fill text into an editable element. Supports --submit (press Enter after filling). |
| hover <ref> | Hover over an element |
| select <ref> <val> | Select an option in a dropdown |
| upload <ref> <file> | Upload a file |
| check <ref> | Check a checkbox or radio button |
| uncheck <ref> | Uncheck a checkbox or radio button |
| drag <startRef> <endRef> | Drag and drop between two elements |
| snapshot | Capture accessibility snapshot. Supports --filename=<path> to save to file, --boxes for bounding boxes, -i/--interactive for interactive-only, -u/--urls for link URLs, -c/--compact for compact output, -d/--depth <n> for depth limiting, -s/--selector <sel> for CSS scoping, and --raw for raw output without page info. |
| eval <expression> [ref] | Evaluate JavaScript on the page or a target element. Use --file=<path> to read the expression from a file. |
| get <mode> <selector> [name] | Extract data from a page element. Modes: text, html, box, styles, property, attr. name is required for property and attr. |
| scroll <direction> <pixels> | Scroll the page. Direction: up, down, left, or right. |
| wait [target] | Wait for a condition. Positional: selector or milliseconds. Options: --text=<text>, --url=<glob>, --load=<state> (networkidle/domcontentloaded), --fn=<JS expr>. |
| dialog-accept [prompt] | Accept a dialog |
| dialog-dismiss | Dismiss a dialog |
| resize <w> <h> | Resize the browser window |
| delete-data | Delete session data |
Navigation
| Command | Description |
|---|---|
| go-back | Go back to the previous page |
| go-forward | Go forward to the next page |
| reload | Reload the current page |
Keyboard
| Command | Description |
|---|---|
| press <key> [ref] | Press a key on the focused element or an optional target element |
| keydown <key> | Press and hold a key |
| keyup <key> | Release a key |
Mouse
| Command | Description |
|---|---|
| mousemove <x> <y> | Move mouse to coordinates |
| mousedown [button] | Press mouse button |
| mouseup [button] | Release mouse button |
| mousewheel <dx> <dy> | Scroll the mouse wheel |
Screenshots
| Command | Description |
|---|---|
| screenshot [ref] | Take a screenshot (optionally of a specific element). Supports --filename=<path> and --full-page (full scrollable page). |
| pdf | Save the current page as a PDF. Supports --filename=<path>. |
Tabs
| Command | Description |
|---|---|
| tab-list | List all tabs with their zero-based index |
| tab-new [url] | Create a new tab |
| tab-close [index] | Close a tab by its zero-based index |
| tab-select <index> | Select a tab by its zero-based index |
Run tab-list first to find each tab's zero-based index. Pass that index to tab-select or tab-close.
Browser storage
| Command | Description |
|---|---|
| state-save <path> | Save cookies and localStorage to a JSON file |
| state-load <path> | Restore cookies and localStorage from a saved state file |
| cookie-list | List all cookies (optionally filtered by --domain / --path) |
| cookie-get <name> | Get a cookie by name |
| cookie-set <name> <value> | Set a cookie. Supports --path, --domain, --expires, --httpOnly, --secure, --sameSite (Strict / Lax / None). |
| cookie-delete <name> | Delete a cookie by name |
| cookie-clear | Clear all cookies for the current page |
| localstorage-list | List all localStorage entries |
| localstorage-get <key> | Get a localStorage value by key |
| localstorage-set <key> <value> | Set a localStorage key-value pair |
| localstorage-delete <key> | Delete a localStorage key |
| localstorage-clear | Clear all localStorage entries |
| sessionstorage-list | List all sessionStorage entries |
| sessionstorage-get <key> | Get a sessionStorage value by key |
| sessionstorage-set <key> <value> | Set a sessionStorage key-value pair |
| sessionstorage-delete <key> | Delete a sessionStorage key |
| sessionstorage-clear | Clear all sessionStorage entries |
Browser sessions
| Command | Description |
|---|---|
| list | List browser sessions. Supports --all (list sessions across all workspaces). |
| close-all | Close all browser sessions without stopping Browser4.jar / the Browser4 backend |
| kill-all | Forcefully stop Browser4.jar / the Browser4 backend and kill Browser4 browser processes |
Use close-all for session cleanup when you want to keep the current Browser4 service running. Use kill-all only when you explicitly want to stop the backend and clean up tracked Browser4 processes.
Server management
| Command | Description |
|---|---|
| install | Download the Browser4 runtime bundle. Supports --tag=<version> to pin a release and --force to reinstall even when already present. |
| upgrade | Upgrade the Browser4 runtime bundle to the latest version or a specified --tag |
| uninstall | Remove globally installed browser4-cli (npm) and its runtime data. Supports -y / --yes (skip confirmation) and --dry-run (preview). |
| stop | Kill the Browser4 backend after closing all sessions |
| status | Check whether the Browser4 backend is reachable and healthy |
install and upgrade both manage the Browser4 runtime bundle — a self-contained
distribution that includes all dependency jars, a minimal jlink-built JRE, and
platform launcher scripts. Neither requires cargo or a Rust toolchain; the runtime
is a Java application downloaded from GitHub Releases.
Use --tag=<version> to pin a specific release (e.g. --tag=v4.9.3). Use --force
to reinstall even when the same version is already present.
When a local Browser4 checkout is detected with the browser4-bundle module present,
install and upgrade auto-build the runtime bundle from source (via Maven) instead
of downloading.
uninstall attempts to remove browser4-cli from npm global packages
and deletes the Browser4 runtime data and cache directories. It prompts
for confirmation unless -y / --yes is passed. Use --dry-run to preview what
would be removed without making changes. Does not require a running server.
browser4-cli uninstall
browser4-cli uninstall -y
browser4-cli uninstall --dry-run
Advanced commands
These commands are intentionally omitted from the global browser4-cli help overview.
Query browser4-cli help <command> for the exact syntax when you need them.
| Command | Description |
|---|---|
| batch [command...] | Execute multiple commands in one invocation. Only DOM operations are supported (Core, Navigation, Keyboard, Mouse, Export, Tabs categories). Commands like open, close, list, agent run, etc. are not allowed in batch mode. |
| console [min-level] | List console messages |
| extract <instruction> | Extract structured data from the current page. Supports --schema (JSON Schema for typed output). |
| summarize [instruction] | Summarize page content using AI |
| agent run <task> | Run an autonomous agent task |
| agent status <id> | Check the status of a running agent task |
| agent result <id> | Get the result of a completed agent task |
| swarm create | Create a swarm scrape session with parallel browser contexts |
| swarm submit [url] | Submit URL(s) or raw X-SQL payloads as scrape jobs |
| swarm query <url> | Run an X-SQL query against a loaded webpage |
| swarm status <id> | Check the status of a scrape or query job |
| swarm result <id> | Get the result of a completed job |
| crawl <url> | Crawl a website from a seed URL, following links up to a configurable depth |
Agent task workflow (agent <subcommand>)
The agent subcommands wrap the backend command agent's asynchronous task API.
They are useful when you want Browser4 to plan and execute a natural-language
task in the background instead of issuing one low-level browser action at a
time.
Like other advanced commands, they are intentionally omitted from the global
browser4-cli help overview. Query browser4-cli help agent run (or
agent status / agent result) when you need the exact syntax.
Use the spaced agent <subcommand> form:
browser4-cli agent run "Open browser4.io and summarize the hero section"
browser4-cli agent status agent-task-1
browser4-cli agent result agent-task-1
Command lifecycle
| Step | Command | What it does |
|---|---|---|
| 1 | agent run <task> | Submits an asynchronous natural-language task through command_run and prints the returned task ID |
| 2 | agent status <id> | Fetches the latest task status payload through command_status |
| 3 | agent result <id> | Fetches the completed task result payload through command_result |
Notes
- agent run is asynchronous: it returns immediately after the backend accepts the task and prints a follow-up agent status command with the generated task ID.
- agent status prints the backend status payload as-is. In practice this is a JSON object that commonly includes fields such as id, status, statusCode, processState, message, agentState, agentHistory, and commandResult.
- agent result prints the backend result payload as-is. Depending on the task, it may be plain text or structured JSON.
- These commands are task-ID based and do not require an active CLI browser session slot. The global -s=<name> option is therefore usually not relevant for agent follow-up calls.
- agent subcommands are not supported inside batch mode.
- agent run performs a short post-submit status probe so obvious missing-LLM configuration failures can be surfaced immediately instead of leaving you with a task ID that will never succeed.
Use cases
1. Submit an autonomous agent task
browser4-cli agent run "Open browser4.io and summarize the hero section"
Typical output:
Task submitted: agent-task-1
Use 'browser4-cli agent status agent-task-1' to check progress.
2. Poll task progress
browser4-cli agent status agent-task-1
Example status payload:
{"id":"agent-task-1","status":"RUNNING"}
On a real Browser4 backend the payload can be richer and may include lifecycle
details such as processState, agent history snapshots, or an embedded partial
commandResult.
3. Read the final result
browser4-cli agent result agent-task-1
If the backend returns a structured CommandResult, expect fields such as
summary, pageSummary, fields, links, or xsqlResultSet.
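The three use cases above chain naturally into a poll-then-fetch loop. The function below is a hedged sketch, not a built-in command: wait_for_agent and the B4 variable are hypothetical names, and the loop assumes RUNNING is the only in-progress status, since that is the only value shown in this README. A real backend may report other intermediate states, so adjust the pattern to what your payloads contain.

```shell
# Poll `agent status <id>` while the payload reports RUNNING, then print the result.
# Arguments: task ID, poll interval in seconds (default 2), max polls (default 150).
# B4 names the CLI executable; it defaults to browser4-cli and exists so the
# loop can be exercised against a stub.
wait_for_agent() {
  id="$1"
  interval="${2:-2}"
  tries="${3:-150}"
  while [ "$tries" -gt 0 ] &&
        "${B4:-browser4-cli}" agent status "$id" |
          grep -q '"status"[[:space:]]*:[[:space:]]*"RUNNING"'; do
    tries=$((tries - 1))
    sleep "$interval"
  done
  # Runs once polling stops, whether the task finished or the poll budget ran out.
  "${B4:-browser4-cli}" agent result "$id"
}
```

Calling wait_for_agent agent-task-1 would block until the status changes and then print whatever agent result returns. The poll limit keeps a stuck task from looping forever.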
Swarm scrape workflow (swarm <subcommand>)
The swarm subcommands support a swarm scrape workflow where one CLI session
coordinates multiple browser contexts in the Browser4 backend.
Command overview
| Command | Purpose | Backend endpoint |
|---|---|---|
| swarm create | Create a swarm scrape session | POST /api/swarm |
| swarm submit <url> | Scrape URLs or submit raw X-SQL | POST /api/swarm/submit |
| swarm query <url> | Run X-SQL queries against loaded pages | POST /api/swarm/query |
| swarm status <id> | Poll job status | GET /api/swarm/{id}/status |
| swarm result <id> | Fetch completed job result | GET /api/swarm/{id}/result |
| crawl <url> | Start a website crawl | POST /api/crawl |
| (poll) | Track crawl progress | GET /api/crawl/{id}/result |
URL scraping with swarm submit
# create a session
browser4-cli swarm create \
--profile-mode=TEMPORARY \
--max-open-tabs=12 \
--max-browser-contexts=3 \
--display-mode=HEADLESS
# submit URLs as scrape jobs
browser4-cli swarm submit https://example.com/direct \
--seed-file=./swarm-seeds.txt \
--deadline=2026-03-30T00:00:00Z \
--expires=1d \
--refresh --store-content
# poll and fetch the result
browser4-cli swarm status scrape-task-4
browser4-cli swarm result scrape-task-4
X-SQL queries with swarm query
Run structured X-SQL queries against loaded webpages to extract data.
# Inline query:
browser4-cli swarm query "https://www.amazon.com/dp/B08PP5MSVB" --sql "
SELECT
dom_base_uri(dom) AS url,
dom_first_text(dom, '#productTitle') AS title,
dom_first_slim_html(dom, 'img:expr(width > 400)') AS img
FROM load_and_select(@url, 'body');
"
# From a file:
browser4-cli swarm query "https://www.amazon.com/dp/B08PP5MSVB" --sql @query.sql
# With seed file and load options:
browser4-cli swarm query --sql @query.sql --seed-file=./urls.txt --refresh
Notes
- swarm create accepts backend capability hints: --profile-mode, --max-open-tabs, --max-browser-contexts, --display-mode.
- swarm submit and swarm query both accept a positional URL, --seed-file, or both. Seed files use one URL per line; # comments and blank lines are ignored.
- Both commands support load-option flags: --deadline, --expires, --refresh, --parse, --store-content.
- swarm query --sql is required; swarm submit --sql also works as a convenience. Use @url in the X-SQL template; it is replaced with the target URL server-side.
- Prefix the --sql value with @ to read from a file (e.g. --sql @query.sql).
- All commands return a task ID; use swarm status / swarm result to track progress.
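To preview which URLs a seed file would contribute before submitting it, the seed-file rule from the notes above can be approximated locally. This is a sketch under assumptions, not the CLI's own parser: seed_urls is a hypothetical helper, and it treats only whole-line comments (a # as the first non-blank character) as comments, so a # inside a URL fragment is kept.

```shell
# Print the URLs a seed file contributes: skip blank lines and '#' comment lines.
# Assumption: comments occupy a whole line; inline '#' is left as part of the URL.
seed_urls() {
  grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1"
}
```

Running seed_urls ./swarm-seeds.txt | wc -l gives a quick count of the jobs a submit would create, assuming the backend applies the same rule.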
Crawl (crawl)
Recursive website crawling from a seed URL.
Command overview
| Command | Purpose | Backend endpoint |
|---|---|---|
| crawl <url> | Crawl a website from a seed URL | POST /api/crawl |
| (poll) | Track crawl progress | GET /api/crawl/{id}/result |
Basic crawling
# depth=1: extract all links from homepage, load each linked page
browser4-cli crawl "https://example.com" --out-link-selector "a[href]"
# depth=2: follow links two levels deep, filter by pattern
browser4-cli crawl "https://shop.example.com" \
--depth 2 \
--out-link-selector "a.product-link" \
--out-link-pattern "/product/" \
--top-links 10
# with LoadOptions passthrough
browser4-cli crawl "https://example.com" -ol "a[href]" -a "-refresh -nMaxRetry 5"
Key flags
| Flag | Default | Description |
|---|---|---|
| -d, --depth | 1 | Maximum crawl depth |
| -ol, --out-link-selector | — | CSS selector to extract links from each page |
| -olp, --out-link-pattern | .+ | Regex pattern to filter extracted links |
| -tl, --top-links | 20 | Maximum links to extract per page |
| -a, --args | — | Additional LoadOptions passthrough |
| --refresh | — | Force a fresh fetch, ignoring cache |
| --parse | — | Parse each page immediately after fetching |
| --expires | — | Cache expiration duration |
| --store-content | — | Persist page content to storage |
| --page-load-timeout | — | Maximum time to wait for page load |
| --readonly | — | Non-destructive mode |
Notes
- All LoadOptions flags are available via --args passthrough (e.g. -a "-nMaxRetry 5 -lazyFlush").
- Depth=1 reuses PulsarSession.submitForOutPages; depth>1 uses BFS continuous crawl with visited-URL dedup.
- CLI timeout: 600s default, configurable via BROWSER4_CLI_CRAWL_TIMEOUT_SECS.
- Duplicate URLs are skipped (normalized: lowercase, no trailing slash, no query string).
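The normalization rule in the last note can be illustrated with a short shell function, which helps predict which URLs a crawl will treat as duplicates. This is an approximation of the documented rule (lowercase, no trailing slash, no query string), not the backend's actual code; normalize_url is a hypothetical name, and fragment handling is left untouched because the README does not mention it.

```shell
# Normalize a URL the way the note describes: lowercase it, drop the query
# string, then drop any trailing slashes. Fragments are not touched.
normalize_url() {
  printf '%s\n' "$1" |
    tr '[:upper:]' '[:lower:]' |
    sed -e 's/?.*$//' -e 's:/*$::'
}
```

Under this sketch, HTTPS://Example.com/Path/?q=1 and https://example.com/path normalize to the same string, so the second would be skipped as a duplicate.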
DOM Snapshot commands (domsnapshot <subcommand>)
The domsnapshot commands capture and query a static DOM snapshot of the
current page. Unlike the real-time accessibility snapshot, a DOM snapshot is
a full HTML capture that can be queried repeatedly with CSS selectors and X-SQL
without re-fetching the page.
Use the spaced domsnapshot <subcommand> form:
browser4-cli domsnapshot
browser4-cli domsnapshot get text "#productTitle"
browser4-cli domsnapshot query --sql "SELECT dom_first_text(dom, '#title') AS title FROM dom(dom)"
browser4-cli domsnapshot export --file=page.html
Command overview
| Command | Description |
|---|---|
| domsnapshot | Capture a static DOM snapshot of the current page |
| domsnapshot get <field> [selector] [name] | Extract elements from the static DOM snapshot |
| domsnapshot query [url] --sql=<query> | Run X-SQL against the DOM snapshot via the scrape API |
| domsnapshot export [--file=<path>] | Save full snapshot HTML content to a local file |
| domsnapshot grep [OPTIONS] <pattern> | Search snapshot HTML with regex and grep-style output |
domsnapshot grep flags
Line numbers are shown by default (unlike GNU grep). Use --no-line-number to suppress them.
| Flag | Description |
|---|---|
| -i | Case-insensitive matching |
| -A N, -B N, -C N | Context lines after / before / around each match |
| -v | Invert match (select non-matching lines) |
| -c | Print only count of matching lines |
| -l | Print only whether matches exist (prints "domsnapshot" if matches found; use with \| grep -q domsnapshot for CI pass/fail) |
| -F | Treat pattern as a literal string |
| -w | Match whole words only |
| --no-line-number | Suppress line numbers in output |
| --selector <CSS> | Scope search to a CSS selector |
# Basic search
browser4-cli domsnapshot grep -i error
# With context and literal match
browser4-cli domsnapshot grep -F -C 2 "404 Not Found"
# Count matches
browser4-cli domsnapshot grep -c "div"
# Search within a specific element
browser4-cli domsnapshot grep --selector main "Submit"
domsnapshot get fields
| Field | Description |
|---|---|
| text | Extract text content from the matched element |
| html | Extract inner HTML from the matched element |
| attr | Extract an attribute value (requires name, e.g. attr "#link" "href") |
selector defaults to :root when omitted. name (attribute name) is
required for the attr field.
domsnapshot query with X-SQL
# Inline X-SQL:
browser4-cli domsnapshot query "https://example.com" --sql "
SELECT
dom_base_uri(dom) AS url,
dom_first_text(dom, 'h1') AS title
FROM dom(dom);
"
# From a file:
browser4-cli domsnapshot query --sql @query.sql
- --sql is required. Prefix with @ to read from a file (e.g. --sql @query.sql).
- url is optional and defaults to the current session's page URL.
- These commands do not require an active swarm session.
Notes
- domsnapshot commands are not supported inside batch mode.
- Use domsnapshot get for quick element extraction; use domsnapshot query with X-SQL for structured multi-element extraction.
- The DOM snapshot is cached in the backend until the next domsnapshot capture or session navigation.
Element References
The snapshot command returns an accessibility tree where every interactive
node is labeled with a short identifier such as e15. Pass this identifier
directly to commands like click, type, or press. You can also use plain
CSS selectors (e.g. .my-button, #search-input).
State Persistence
browser4-cli persists CLI state between invocations under ~/.browser4 by
default. Override the root directory with the BROWSER4_CLI_STATE_DIR
environment variable.
- Default session state: ~/.browser4/cli-state.json
- Named session state (-s=<name>): ~/.browser4/sessions/<name>.json
Each state file stores the current Browser4 server URL plus session-scoped fields such as:
- sessionId: active Browser4 session ID
- baseUrl: Browser4 backend URL used by the CLI
- activeSelector: last selector tracked for keyboard restore flows
- lastMousePosition: last pointer coordinates tracked for mouse restore flows
Session state transitions
| Command | Behavior |
|---|---|
| open | Creates a new session, or reuses an existing active one. Stale sessions are automatically refreshed. |
| open -s=<name> | Same as open but scoped to a named session slot. |
| goto <url> | Reuses the current session if active; otherwise opens a fresh one before navigating. |
| close | Closes the current session (no-op if none active). |
| close-all / kill-all / stop | Clears all persisted session state. |
The list command shows each session's status: Active (backend confirms),
Stale (backend has stopped it), or Unknown (backend unreachable).
Runtime Temp Files
browser4-cli keeps ephemeral runtime artifacts under the system temp directory:
- Windows: %TEMP%\browser4\browser4-cli
- Linux/macOS: ${TMPDIR:-/tmp}/browser4/browser4-cli
This temp subtree contains items such as:
- startup logs for auto-started Browser4 servers
- staged Maven wrapper launchers
- Rust test scratch directories used by browser4-cli tests
Persistent CLI state remains under ~/.browser4 by default. The Browser4 runtime
bundle (JRE, JARs, launchers) is stored separately in a platform-conventional
data directory so that clearing CLI session state does not require re-downloading
the ~200 MB runtime:
- Linux: ~/.local/share/browser4/runtime/<version>/
- macOS: ~/Library/Application Support/browser4/runtime/<version>/
- Windows: %APPDATA%/browser4/runtime/<version>/
The current.tag file in the runtime/ directory records the active version.
Override the runtime data root with the BROWSER4_RUNTIME_DIR environment variable.
Downloaded archives are cached under the platform cache directory
(~/.cache/browser4/downloads/ on Linux, ~/Library/Caches/browser4/downloads/
on macOS, %LOCALAPPDATA%/browser4/downloads/ on Windows).
Snapshots
After each command that modifies browser state, the CLI automatically:
- Retrieves the current page URL and title
- Captures an accessibility snapshot
- Saves the snapshot to .browser4-cli/snapshot/page-<timestamp>.yml
- Prints the snapshot path in Markdown link format
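Because every state-changing command writes a new snapshot file, scripts often need the most recent one. The helper below is a small sketch: latest_snapshot is a hypothetical name, it assumes the .browser4-cli/snapshot directory is relative to the current working directory, and it orders files by modification time rather than by parsing the timestamp in the file name, whose exact format is not specified here.

```shell
# Print the newest page-*.yml snapshot in a directory
# (default: .browser4-cli/snapshot under the current working directory).
# Ordering uses file modification time; prints nothing when no snapshot exists.
latest_snapshot() {
  dir="${1:-.browser4-cli/snapshot}"
  ls -t "$dir"/page-*.yml 2>/dev/null | head -n 1
}
```

For example, grep button "$(latest_snapshot)" searches the latest snapshot without re-running the snapshot command.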
Examples
# Open a new browser window (defaults to headed)
browser4-cli open
# Open in headed or headless mode
browser4-cli open --headed https://browser4.io
browser4-cli open --headless https://browser4.io
# Open with a persistent profile
browser4-cli open --profile=~/.browser4/profiles/work
# Open with a temporary (single-use) profile
browser4-cli open --profile-mode=temporary https://browser4.io
# Open with interact-level control (FASTEST / FAST / DEFAULT)
browser4-cli open --interact-level=FAST https://browser4.io
# Navigate to a page — auto-opens a session if none is active
browser4-cli goto https://browser4.io
# Inspect the page — note the eN labels on interactive nodes
browser4-cli snapshot
# Include element bounding boxes in the snapshot
browser4-cli snapshot --boxes
# Show only interactive elements with compact output, depth-limited
browser4-cli snapshot -i -c -d 5
# Scope snapshot to a specific CSS selector
browser4-cli snapshot -s "#main-content"
# Raw output — no page info, just the snapshot content (useful for piping)
browser4-cli snapshot --raw | grep "button"
# Capture a static DOM snapshot and extract data
browser4-cli domsnapshot
browser4-cli domsnapshot get text "#productTitle"
browser4-cli domsnapshot get attr "#productTitle" "href"
browser4-cli domsnapshot export --file=page.html
# Interact using refs from the snapshot
browser4-cli click e15
browser4-cli type "Hello World" e15
browser4-cli press Enter e15
browser4-cli eval "document.title"
browser4-cli eval "element => element.textContent" e15
browser4-cli keydown Shift
browser4-cli mousemove 150 300
browser4-cli mousewheel 0 100
browser4-cli keyup Shift
# Take a screenshot and save it to disk
browser4-cli screenshot
# Take a full-page screenshot
browser4-cli screenshot --full-page
# Save the current page as a PDF
browser4-cli pdf --filename=page.pdf
# Inspect tab indices before switching tabs
browser4-cli tab-list
browser4-cli tab-select 1
browser4-cli tab-close 1
# Use a custom server URL
browser4-cli open --server http://localhost:9090
# Advanced: execute multiple commands in one process (batch mode)
# Batch mode only supports DOM operations. You must run `open` separately first.
browser4-cli open
browser4-cli batch "goto https://browser4.io" "snapshot"
# Advanced: stop on the first batch failure
browser4-cli batch --bail "goto https://browser4.io" "click e1" "screenshot"
# Advanced: batch mode for form filling (recommended use case)
browser4-cli batch "fill e1 'John Doe'" "fill e2 'john@example.com'" "click e3"
# Advanced: pipe batch commands as JSON via stdin
echo '[
["goto", "https://example.com/form-filling"],
["click", "#reset-btn"],
["fill", "#first-name", "Bob"],
["fill", "#last-name", "Smith"],
["fill", "#email", "bob@example.com"],
["select", "#country", "uk"],
["check", "#agree-terms"],
["click", "#submit-btn"]
]' | browser4-cli batch --json
# Close the session when done
browser4-cli close
# Close all sessions but keep the current Browser4 backend running
browser4-cli close-all
# Explicitly stop the Browser4 backend and clean up tracked Browser4 processes
browser4-cli kill-all
Architecture
The Rust CLI is structured as follows:
| Module | Purpose |
|---|---|
| main.rs | Entry point, command dispatch, session management |
| args.rs | CLI argument parsing (global flags, positional args, options) |
| commands.rs | Command definitions mapping to MCP tool names and parameters |
| http.rs | HTTP client for calling /mcp/call-tool |
| state.rs | Persistent state management for the default state file and named session files under ~/.browser4/ |
| daemon.rs | Local server auto-start (prefer Maven from repo root, fall back to jar) and health checking |
| managed_processes.rs | Registry for browser4 server processes |
| snapshot.rs | Snapshot and screenshot file helpers |
| help.rs | Help text generation |
Testing
## Run all tests (unit + end-to-end):
cargo test
## Run only the end-to-end tests and print their output:
cargo test --test e2e -- --nocapture
## Run a specific end-to-end test scenario:
cargo test --test e2e -- --nocapture --scenario=test_e2e_batch_form_submission
Publishing the CLI package
For maintainers, the CLI package now uses an npm version guard before publish.
The GitHub release workflow publishes the npm package via npm trusted publishing
(GitHub Actions OIDC) instead of NODE_AUTH_TOKEN. This avoids CI failures caused
by npm one-time-password challenges (EOTP).
- Local release entrypoint: npm run release
- Direct guarded publish entrypoint: npm run publish:if-needed
- GitHub release workflow: re-checks npm immediately before the publish step
If the local version in cli/package.json already matches the version currently
published on npm, the publish step is skipped automatically.
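The decision rule behind the guard is a plain version comparison. The function below only sketches that rule for illustration; publish_needed is a hypothetical name, and the real logic lives in the repository's Node scripts, which may do more (for example, fetching the remote version or handling registry errors).

```shell
# Exit 0 when a publish is needed (local and published versions differ),
# exit 1 when the publish step can be skipped (versions match).
# Arguments: local version, version currently published on npm.
publish_needed() {
  [ "$1" != "$2" ]
}
```

With this rule, publish_needed 0.1.8 0.1.7 succeeds, while publish_needed 0.1.7 0.1.7 fails and the publish would be skipped.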
Examples:
# Check whether npm publish is needed
node scripts/check-npm-publish-needed.js --json
# Publish only when the local version differs from npm
npm run publish:if-needed
# Standard maintainer release command (also guarded)
npm run release
For local testing, you can override the detected remote version:
BROWSER4_CLI_NPM_REMOTE_VERSION=0.1.7 node scripts/check-npm-publish-needed.js --json
BROWSER4_CLI_NPM_REMOTE_VERSION=0.1.7 node scripts/publish-if-needed.js --dry-run
License
Apache-2.0