pixverse-ai-cli

Licence: MIT · Version: 1.2.5 · Deps: 8 · Size: 712 kB · Vulns: 0 · Weekly downloads: 479 · Stars: 54

PixVerse CLI

The official command-line interface (CLI) for PixVerse — create AI-powered videos, images, and audio directly from your terminal.

What is PixVerse?

PixVerse is an AI-powered creative platform that generates high-quality videos, images, and audio from text prompts or reference images. It supports a wide range of creative workflows including text-to-video, image-to-video, text-to-image, video transitions, text-to-speech (voice synthesis), music generation, templates/effects, and more.

What is PixVerse CLI?

PixVerse CLI is essentially a UI-free version of the PixVerse website. All features and capabilities are aligned with the web experience — if you can do it on pixverse.ai, you can do it from the command line with the same models, parameters, and quality.

It is designed for:

  • AI agents: structured JSON output, deterministic exit codes, and pipeable commands suit autonomous workflows (e.g. Claude Code, Cursor, Codex, LangChain, custom agents).
  • Developers & power users: scriptable video/image/audio generation without leaving the terminal.
  • Automation: integrate AI content generation into CI/CD pipelines, batch processing scripts, or content production workflows.

Subscription Required

PixVerse CLI uses the same credit system as the website — generating videos, images, and audio consumes credits from your PixVerse account balance with the same pricing. To prevent abuse, PixVerse CLI is currently available to subscribed users only. For details on subscription plans and member benefits, see the PixVerse Subscribe page.

Installation

npm install -g pixverse

Or run without installing:

npx pixverse

Requirements: Node.js >= 20

Authentication

PixVerse CLI uses OAuth device flow — no need to manually copy tokens:

pixverse auth login

This opens a browser where you confirm the authorization. You can also copy the URL and authorize from any browser on any device — useful for SSH or headless environments. The CLI receives a token automatically and stores it locally.

  • Token is valid for 30 days
  • CLI sessions are independent from your web/app sessions
  • Run pixverse auth status to check your login state and credits
  • Run pixverse auth logout to remove the stored token

You need a PixVerse account to use the CLI. Sign up at pixverse.ai if you don't have one.
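
In scripts it can help to fail fast when no token is stored. A minimal sketch, assuming `pixverse auth status` exits non-zero when you are not logged in (the README does not say which code it returns); `require_login` is a hypothetical helper name, not part of the CLI.

```shell
# Hypothetical helper: stop early if the CLI has no usable login.
# Assumption: `pixverse auth status` returns a non-zero exit code when
# the stored token is missing or expired.
require_login() {
  if ! pixverse auth status >/dev/null 2>&1; then
    echo "Not logged in. Run: pixverse auth login" >&2
    return 3   # mirrors the documented "Authentication error" code
  fi
}
```

Call `require_login || exit $?` at the top of a script before any command that consumes credits.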

Supported Models

Video Models (--model <value>)
| Model | --model value | Quality | Duration | Aspect Ratio |
| --- | --- | --- | --- | --- |
| PixVerse V6 (default) | v6 | 360p 540p 720p 1080p | 1–15s | 16:9 4:3 1:1 3:4 9:16 3:2 2:3 21:9 |
| PixVerse C1 | pixverse-c1 | 360p 540p 720p 1080p | 1–15s | 16:9 4:3 1:1 3:4 9:16 3:2 2:3 |
| Seedance 2.0 Standard | seedance-2.0-standard | 480p 720p 1080p 2160p | 4–15s | 16:9 4:3 1:1 3:4 9:16 21:9 |
| Seedance 2.0 Fast | seedance-2.0-fast | 480p 720p | 4–15s | 16:9 4:3 1:1 3:4 9:16 21:9 |
| Seedance 2.0 Mini | seedance-2.0-mini | 480p 720p | 4–15s | 16:9 4:3 1:1 3:4 9:16 21:9 |
| Happy Horse 1.0 | happyhorse-1.0 | 720p 1080p | 3–15s | 16:9 9:16 1:1 4:3 3:4 |
| Kling O3 Pro | kling-o3-pro | 720p | 3–15s | 16:9 9:16 1:1 |
| Kling O3 Standard | kling-o3-standard | 720p | 3–15s | 16:9 9:16 1:1 |
| Kling 3.0 Pro | kling-3.0-pro | 720p | 3–15s | 16:9 9:16 1:1 |
| Kling 3.0 Standard | kling-3.0-standard | 720p | 3–15s | 16:9 9:16 1:1 |
| Grok Imagine 1.5 | grok-imagine-1.5 | 480p 720p | 1–15s | from image |
| Grok Imagine | grok-imagine | 480p 720p | 1–15s | 16:9 4:3 1:1 9:16 3:4 3:2 2:3 |
| Veo 3.1 Lite | veo-3.1-lite | 720p 1080p | 4, 6, 8s | 16:9 9:16 |
| Veo 3.1 Standard | veo-3.1-standard | 720p 1080p 2160p | 4, 6, 8s | 16:9 9:16 |
| Veo 3.1 Fast | veo-3.1-fast | 720p 1080p 2160p | 4, 6, 8s | 16:9 9:16 |
| Sora 2 Pro | sora-2-pro | 720p 1080p | 4, 8, 12s | 16:9 9:16 |
| Sora 2 | sora-2 | 720p | 4, 8, 12s | 16:9 9:16 |
| PixVerse v5.6 | v5.6 | 360p 480p 540p 720p 1080p | 1–10s | 16:9 4:3 1:1 3:4 9:16 3:2 2:3 |
| PixVerse v5.5 | v5.5 | 360p 480p 540p 720p 1080p | 1–10s | 16:9 4:3 1:1 3:4 9:16 3:2 2:3 |
| PixVerse v5 | v5 | 360p 480p 540p 720p 1080p | 1–10s | 16:9 4:3 1:1 3:4 9:16 3:2 2:3 |

Grok Imagine 1.5 is image-to-video only — it requires --image and derives its aspect ratio from the input image (the --aspect-ratio flag is ignored).

Not all models support all creation modes. See the per-mode support matrix below.

Per-mode Model Support
| Creation mode | Supported --model values |
| --- | --- |
| create video (text-to-video / image-to-video) | v6 pixverse-c1 seedance-2.0-standard seedance-2.0-fast seedance-2.0-mini happyhorse-1.0 kling-o3-pro kling-o3-standard kling-3.0-pro kling-3.0-standard grok-imagine-1.5 grok-imagine veo-3.1-lite veo-3.1-standard veo-3.1-fast sora-2-pro sora-2 v5.6 |
| create extend | v6 grok-imagine |
| create reference (multi-subject reference) | v6 pixverse-c1 seedance-2.0-standard seedance-2.0-fast seedance-2.0-mini kling-o3-pro kling-o3-standard grok-imagine v5.6 |
| create transition (2 frames) | v6 pixverse-c1 seedance-2.0-standard seedance-2.0-fast seedance-2.0-mini kling-o3-pro kling-o3-standard kling-3.0-pro kling-3.0-standard veo-3.1-lite veo-3.1-standard veo-3.1-fast v5.6 |
| create transition (3+ frames) | v5 |
| create modify | v5.5 |
| create motion-control | v5.6 |
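
The per-mode support lists can be turned into a pre-flight check so a script rejects an unsupported combination before spending credits. This is a sketch only: `mode_supports_model` and the mode keys (including `transition-multi` for 3+ frames) are hypothetical names, and the lists are copied from this README, so they may lag behind the live service.

```shell
# Hypothetical pre-flight check; model lists copied from the per-mode matrix.
# Returns 0 if supported, 1 if not, 2 for an unknown mode key.
mode_supports_model() {
  local mode="$1" model="$2" supported
  local seedance="seedance-2.0-standard seedance-2.0-fast seedance-2.0-mini"
  local kling_o3="kling-o3-pro kling-o3-standard"
  local kling_3="kling-3.0-pro kling-3.0-standard"
  local veo="veo-3.1-lite veo-3.1-standard veo-3.1-fast"
  case "$mode" in
    video)            supported="v6 pixverse-c1 $seedance happyhorse-1.0 $kling_o3 $kling_3 grok-imagine-1.5 grok-imagine $veo sora-2-pro sora-2 v5.6" ;;
    extend)           supported="v6 grok-imagine" ;;
    reference)        supported="v6 pixverse-c1 $seedance $kling_o3 grok-imagine v5.6" ;;
    transition)       supported="v6 pixverse-c1 $seedance $kling_o3 $kling_3 $veo v5.6" ;;
    transition-multi) supported="v5" ;;
    modify)           supported="v5.5" ;;
    motion-control)   supported="v5.6" ;;
    *) return 2 ;;
  esac
  # Pad with spaces so "v5" does not match inside "v5.6".
  case " $supported " in
    *" $model "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `mode_supports_model extend v5.6 || echo "pick v6 or grok-imagine"` prints the hint, because v5.6 is not listed for create extend.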

Audio creation uses separate model families: create voice for text-to-speech and create music for prompt-to-music.

Image Models (--model <value>)
| Model | --model value | Quality | Aspect Ratio |
| --- | --- | --- | --- |
| GPT Image 2 (default) | gpt-image-2.0 | 1080p 1440p 2160p | 1:1 16:9 9:16 4:3 3:4 3:2 2:3 2:1 1:2 21:9 |
| Nano Banana 2 | gemini-3.1-flash | 512p 1080p 1440p 2160p | auto 1:1 16:9 9:16 + more |
| Qwen-image | qwen-image | 720p 1080p | 1:1 16:9 9:16 4:3 3:4 5:4 4:5 3:2 2:3 21:9 |
| Nano Banana Pro | gemini-3.0 | 1080p 1440p 2160p | auto 1:1 16:9 9:16 + more |
| Nano Banana | gemini-2.5-flash | 1080p | auto 1:1 16:9 9:16 + more |
| Seedream 5.0 Lite | seedream-5.0-lite | 1440p 1800p 2160p | auto 1:1 16:9 9:16 + more |
| Seedream 4.5 | seedream-4.5 | 1440p 2160p | auto 1:1 16:9 9:16 + more |
| Seedream 4.0 | seedream-4.0 | 1080p 1440p 2160p | auto 1:1 16:9 9:16 + more |
| Kling Image O3 | kling-image-o3 | 1080p 1440p 2160p | 16:9 9:16 1:1 + more |
| Kling Image V3 | kling-image-v3 | 1080p 1440p | 16:9 9:16 1:1 + more |

Voice / TTS Models (create voice --model <value>)

| Model | --model value | Provider | Max characters |
| --- | --- | --- | --- |
| MiniMax Speech 2.8 HD (default) | speech-2.8-hd | MiniMax | 10,000 |
| MiniMax Speech 2.8 Turbo | speech-2.8-turbo | MiniMax | 10,000 |
| Eleven Multilingual v2 | eleven-multilingual-v2 | ElevenLabs | 10,000 |
| Eleven v3 | eleven-v3 | ElevenLabs | 5,000 |
| Eleven Turbo v2.5 | eleven-turbo-v2.5 | ElevenLabs | 40,000 |

Browse available preset voices with pixverse voice presets --model <id> and the full live model catalog with pixverse voice models.
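
Because the character limit differs by model, a script may want to check text length before calling create voice. A rough sketch using the limits listed for the voice models; the function names are hypothetical, and how the service itself counts characters (for example multi-byte text) is not stated here, so treat the result as an estimate.

```shell
# Hypothetical helpers; limits copied from the voice model list in this README.
max_chars_for_voice_model() {
  case "$1" in
    speech-2.8-hd|speech-2.8-turbo|eleven-multilingual-v2) echo 10000 ;;
    eleven-v3)         echo 5000 ;;
    eleven-turbo-v2.5) echo 40000 ;;
    *) return 1 ;;   # unknown model id
  esac
}

# Returns 0 if the text fits, 1 if it is too long, 2 for an unknown model.
text_fits_voice_model() {
  local model="$1" text="$2" limit
  limit=$(max_chars_for_voice_model "$model") || return 2
  # ${#text} counts characters in the current locale (bytes in the C locale).
  [ "${#text}" -le "$limit" ]
}
```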

Music Models (create music --model <value>)
| Model | --model value | Provider | Duration | Notes |
| --- | --- | --- | --- | --- |
| MiniMax Music 2.6 (default) | music-2.6 | MiniMax | 10-240s | Lyrics, auto lyrics, instrumental |
| ElevenLabs Music | music-v1 | ElevenLabs | 10-240s | Lyrics, auto lyrics, instrumental |
| Google Lyria 3 Pro | lyria-3-pro-preview | Google | 10-240s | Image references, no separate --lyrics |

Browse the live music model catalog with pixverse music models.


Usage

Interactive Mode

Run any creation command without arguments to enter the interactive wizard:

pixverse create video
pixverse create image

The wizard guides you through prompt, model, quality, aspect ratio, and other options step by step.

Local image inputs larger than 1920x1920 or 5MB are automatically resized/compressed before upload. Remote image URLs are validated by the backend as-is.

Text to Video

pixverse create video --prompt "A cat walking on Mars" --model v6 --quality 720p --aspect-ratio 16:9

Text inputs: literal, a file, or stdin

Text-input flags — --prompt (all create commands), --text (create voice), and --lyrics (create music) — accept three forms, just like --image / --video:

  • a literal string: --prompt "A neon city skyline"
  • a local file path: --prompt ./scene.txt (the file's contents are used)
  • - to read from stdin: ... | pixverse create video --prompt -

pixverse create video --prompt ./scene.txt
cat scene.txt | pixverse create image --prompt - --json
echo "Hello from the command line" | pixverse create voice --text -
pixverse create music --prompt "Bright synth-pop" --lyrics ./lyrics.txt

A value is treated as a file only when a matching file actually exists on disk; otherwise it's used as literal text (the same rule as --image / --video).
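
The resolution rule can be pictured as a three-way branch. The function below is an illustrative sketch of that rule in shell, not the CLI's actual implementation; `resolve_text_input` is a hypothetical name.

```shell
# Sketch of the documented rule: "-" means stdin, an existing file means
# its contents, anything else is taken as literal text.
resolve_text_input() {
  local value="$1"
  if [ "$value" = "-" ]; then
    cat                        # read all of stdin
  elif [ -f "$value" ]; then
    cat -- "$value"            # existing file: use its contents
  else
    printf '%s\n' "$value"     # no such file: literal text
  fi
}
```

One consequence worth noting: a mistyped path such as `./scen.txt` does not raise an error under this rule; the path string itself becomes the prompt.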

Image to Video

pixverse create video --prompt "Slow zoom in" --image ./photo.png

Text to Image

pixverse create image --prompt "Cyberpunk cityscape at night" --aspect-ratio 16:9

Image to Image

pixverse create image --prompt "Turn this into a watercolor painting" --image ./photo.png

Other Creation Modes

# Create a transition between keyframes (requires 2+ images)
pixverse create transition --images ./frame1.png ./frame2.png ./frame3.png

# Generate speech audio from text (text-to-speech)
pixverse create voice --text "Hello world" --voice-id <preset_voice_id> --output ./out.mp3
# Browse available models / preset voices:
pixverse voice models
pixverse voice presets --model speech-2.8-hd

# Generate music audio from a prompt
pixverse create music --prompt "A cinematic pop song with bright synths" --auto-lyrics
pixverse create music --prompt "Uplifting piano theme" --instrumental --duration-seconds 60
# Lyrics-capable models require lyrics unless --auto-lyrics or --instrumental is used:
# (--lyrics takes a literal string, a local file path, or - for stdin)
pixverse create music --prompt "Bright synth-pop, uplifting mood" --lyrics ./lyrics.txt
# Google Lyria supports image references and expects lyric-like instructions in --prompt:
pixverse create music -m lyria-3-pro-preview --prompt "Instrumental orchestral cue inspired by these images" --image ./moodboard.png
# Browse available music models:
pixverse music models

# Extend video duration
pixverse create extend --video <video_id>

# Modify an existing video
pixverse create modify --video <video_id> --prompt "Change the background to a beach"

# Upscale video resolution
pixverse create upscale --video <video_id> --quality 1080p

# Generate video with character reference (1–7 images)
pixverse create reference --images ./char1.png ./char2.png --prompt "Two friends walking in a park"

# Seedance 2.0 reference — mix images and videos (max 3 videos, total ≤ 15s)
pixverse create reference -m seedance-2.0-standard --images ./char.png --videos ./motion.mp4 --prompt "@image1 follows the motion in @video1"

# Seedance 2.0 reference — add audio references (max 3, each 2–15s, total ≤ 15s; needs a visual reference)
pixverse create reference -m seedance-2.0-standard --images ./char.png --audios ./voice.mp3 --prompt "@image1 speaks the line in @audio1"

# Motion control — character image + motion reference video
pixverse create motion-control --image ./character.png --video ./dance.mp4

# Create from a template/effect
pixverse create template --template-id 12345 --image ./photo.png

Voice speed uses provider-specific validation:

| Provider | Default | Valid range | Invalid range error | Provider request field |
| --- | --- | --- | --- | --- |
| ElevenLabs | 1.0 | 0.7..1.2 | --speed must be between 0.7 and 1.2 | voice_settings.speed |
| MiniMax | 1.0 | 0.5..2.0 | --speed must be between 0.5 and 2 | voice_setting.speed |
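
A wrapper script can apply the same ranges before calling create voice. The sketch below mirrors the ranges and error messages listed for each provider; `validate_speed` is a hypothetical helper, and returning 6 (the documented "Validation error" code) is an assumption about how the CLI reports this case.

```shell
# Hypothetical validator mirroring the per-provider speed ranges.
# Usage: validate_speed <ElevenLabs|MiniMax> [speed]   (speed defaults to 1.0)
validate_speed() {
  local provider="$1" speed="${2:-1.0}" min max
  case "$provider" in
    ElevenLabs) min=0.7; max=1.2 ;;
    MiniMax)    min=0.5; max=2 ;;
    *) echo "unknown provider: $provider" >&2; return 2 ;;
  esac
  # awk does the floating-point comparison; "+0" forces numeric context.
  if awk -v s="$speed" -v lo="$min" -v hi="$max" \
       'BEGIN { exit !(s+0 >= lo+0 && s+0 <= hi+0) }'; then
    return 0
  fi
  echo "--speed must be between $min and $max" >&2
  return 6   # assumption: maps to the documented "Validation error" code
}
```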

Common Creation Flags

These flags are available across most create subcommands:

| Flag | Description |
| --- | --- |
| --count <n> | Generate multiple variations (1–4, default 1) |
| --seed <number> | Set random seed for reproducible results |
| --off-peak | Use off-peak pricing (lower credit cost) |
| --audio / --no-audio | Enable or disable audio generation |
| --multi-shot / --no-multi-shot | Enable or disable multi-shot mode (video only) |
| --no-wait | Return immediately without waiting for completion |
| --timeout <sec> | Polling timeout in seconds (default 300) |

Task Management

# Check task status
pixverse task status <id>

# Poll a voice/music audio task (audio is not auto-detected — pass --type audio)
pixverse task status <id> --type audio

# Batch status query (parallel; per-ID failures captured in the response map)
pixverse task status --ids 123,456,789 --type video --json

# Wait for a task to complete
pixverse task wait <id>
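
The batch form expects a single comma-separated value for --ids. When task IDs are collected one at a time in a script, a small helper can build that value; `join_ids` is a hypothetical name and the IDs shown are the placeholders from the example above.

```shell
# Join all arguments with commas, e.g. 123 456 789 -> 123,456,789
join_ids() {
  local IFS=,
  printf '%s\n' "$*"   # "$*" joins arguments with the first character of IFS
}
```

Usage: `pixverse task status --ids "$(join_ids 123 456 789)" --type video --json`.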

Asset Management

# List your generated assets (default: created videos)
pixverse asset list
pixverse asset list --type image
pixverse asset list --type audio              # voice and music audio history
pixverse asset list --type audio --source upload
pixverse asset list --source upload
pixverse asset list --source create --off-peak

# Upload a local file or URL to asset library
pixverse asset upload ./photo.png
pixverse asset upload ./voice-over.mp3
pixverse asset upload https://example.com/image.jpg

# Get asset details (type auto-detected: video → image → audio)
pixverse asset info <id>
# Pass --type to skip auto-detection
pixverse asset info <id> --type audio
pixverse asset info <id> --type audio --source upload

# Download a created video, image, or audio (uploads are not downloadable)
pixverse asset download <id>
pixverse asset download <id> --type audio --dest ./out/

# Delete a created asset — pass its id (auto-detected)
pixverse asset delete <id>
pixverse asset delete <id> --type audio

# Delete an uploaded asset — pass the id from `asset list --source upload`
pixverse asset delete <id> --source upload --type image

Saved Folders

# List all saved folders
pixverse saved list

# List items in a folder (default folder if omitted)
pixverse saved items
pixverse saved items <folder_id> --type image --source upload

# Create a new folder
pixverse saved new "My Collection"

# Rename a folder
pixverse saved rename <folder_id> "New Name"

# Add assets to a folder
pixverse saved add <asset_id...> --folder <folder_id> --type video

# Remove assets from a folder
pixverse saved remove <asset_id...> --folder <folder_id> --type video

# Delete a folder
pixverse saved delete <folder_id>

Templates

# List template categories
pixverse template categories

# List templates (with optional category filter and pagination)
pixverse template list
pixverse template list --category 5 --page 2 --limit 10

# Search templates by keyword
pixverse template search "dance"

# Get template details
pixverse template info <template_id>

Workspaces

# List all workspaces
pixverse workspace list

# Show current workspace
pixverse workspace status

# Switch workspace (interactive or by ID)
pixverse workspace switch
pixverse workspace switch <workspace_id>

# Open workspace management in browser
pixverse workspace manage

Account & Subscription

# View account info and credits
pixverse account info
pixverse account usage

# View current concurrent generation slots (image / video)
pixverse account slots
pixverse account slots --json

# Open subscription page in browser
pixverse subscribe

Keeping the CLI up to date

# Update to the latest published version
pixverse update

When run interactively, the CLI checks the npm registry at most once per day and prints a one-line "update available" notice to stderr (never to stdout, so --json output stays clean). The check is skipped in --json/-p mode, in CI, and when stdout/stderr is piped.
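
The conditions in that paragraph combine into a single yes/no decision. The sketch below restates them as a shell function for illustration; it is not the CLI's own code, `should_check_for_update` is a hypothetical name, and detecting CI through a `CI` environment variable is an assumption.

```shell
# Args: json_mode (0/1), stdout_is_tty (0/1), stderr_is_tty (0/1),
#       last_check (epoch seconds), now (epoch seconds)
# Returns 0 when an update check would run under the documented rules.
should_check_for_update() {
  local json_mode="$1" out_tty="$2" err_tty="$3" last_check="$4" now="$5"
  [ "$json_mode" -eq 0 ] || return 1     # --json / -p mode: skip
  [ -z "${CI:-}" ] || return 1           # CI (assumed CI env var): skip
  [ "$out_tty" -eq 1 ] || return 1       # stdout piped: skip
  [ "$err_tty" -eq 1 ] || return 1       # stderr piped: skip
  [ $(( now - last_check )) -ge 86400 ]  # at most once per day
}
```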

Configuration

# Set output directory
pixverse config set output-dir ~/Downloads

# View current configuration
pixverse config list

# Show config file path
pixverse config path

# Set per-mode creation defaults (model, quality, duration, etc.)
pixverse config defaults set video model v6
pixverse config defaults set video quality 1080p
pixverse config defaults show

JSON Output for Scripts & Agents

All commands support --json (or -p) for structured JSON output, making the CLI easy to integrate into automated workflows:

pixverse create video --prompt "A sunset over the ocean" --json
pixverse task wait <id> --json
pixverse account info --json

Pipeline Example

# Create a video → wait for completion → download
VID=$(pixverse create video --prompt "A cat on the moon" --json | jq -r '.video_id')
pixverse task wait "$VID" --json
pixverse asset download "$VID" --dest ./output/

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | General error |
| 2 | Timeout |
| 3 | Authentication error |
| 4 | Credit / subscription limit |
| 5 | Generation failed |
| 6 | Validation error |
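
Scripts and agents can branch on these codes instead of parsing error text. A minimal sketch, with hypothetical helper names: `describe_exit_code` maps a code to its documented meaning, and `run_with_retry` re-runs a command only when it exits with 2 (timeout). Retrying is only sensible for commands that do not start new work, such as `pixverse task wait`; re-running a create command would submit a new generation.

```shell
# Map a documented exit code to its meaning.
describe_exit_code() {
  case "$1" in
    0) echo "success" ;;
    1) echo "general error" ;;
    2) echo "timeout" ;;
    3) echo "authentication error" ;;
    4) echo "credit or subscription limit" ;;
    5) echo "generation failed" ;;
    6) echo "validation error" ;;
    *) echo "unknown exit code" ;;
  esac
}

# Usage: run_with_retry <max_attempts> <command> [args...]
# Retries only on exit code 2; any other code is returned immediately.
run_with_retry() {
  local max_attempts="$1"; shift
  local attempt=1 rc
  while :; do
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 2 ] || [ "$attempt" -ge "$max_attempts" ]; then
      [ "$rc" -eq 0 ] || echo "failed: $(describe_exit_code "$rc")" >&2
      return "$rc"
    fi
    attempt=$(( attempt + 1 ))
  done
}
```

For example, `run_with_retry 3 pixverse task wait "$VID" --json` keeps waiting through up to three timeouts but stops at once on an authentication or credit error.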

All Commands

| Command | Description |
| --- | --- |
| auth login | Login via browser (OAuth device flow) |
| auth status | Check authentication status |
| auth logout | Remove stored token |
| create video | Text-to-video or image-to-video |
| create image | Text-to-image or image-to-image |
| create transition | Create transitions between keyframes |
| create voice | Generate speech audio from text (text-to-speech) |
| create music | Generate music audio from a prompt |
| create extend | Extend video duration |
| create modify | Modify an existing video |
| create upscale | Upscale video resolution |
| create reference | Generate video with character references |
| create motion-control | Motion control with character image + reference video |
| create template | Create from a template/effect |
| template categories | List template categories |
| template list | List templates (with category filter) |
| template search | Search templates by keyword |
| template info | Get template details |
| voice models | List voice/TTS providers, models, and supported languages |
| voice presets | List preset voices (filterable by model / language / provider) |
| music models | List music providers, models, and capabilities |
| task status | Check task status (single <id> or --ids id1,id2,... for batch) |
| task wait | Wait for task completion |
| asset list | List assets (--source create\|upload, --type video\|image\|audio, --off-peak) |
| asset upload | Upload a local file or HTTPS URL to asset library |
| asset info | Get asset details |
| asset download | Download a generated asset |
| asset delete | Delete an asset |
| saved list | List saved folders |
| saved items | List items in a saved folder |
| saved new | Create a new saved folder |
| saved rename | Rename a saved folder |
| saved add | Add assets to a saved folder |
| saved remove | Remove assets from a saved folder |
| saved delete | Delete a saved folder |
| workspace list | List all workspaces |
| workspace status | Show current workspace |
| workspace switch | Switch workspace (interactive or by ID) |
| workspace manage | Open workspace management in browser |
| account info | View account info and workspace credits |
| account usage | View credit usage |
| account slots | View current concurrent generation slots (image / video) |
| subscribe | Open subscription page |
| update | Update the CLI to the latest version (npm i -g pixverse@latest) |
| config set | Set a config value |
| config get | Get a config value |
| config list | List all config values |
| config reset | Reset config to defaults |
| config path | Show config file path |
| config defaults | Manage per-mode creation defaults |

Global Flags

| Flag | Description |
| --- | --- |
| --json | Output as JSON |
| -p | Print mode (alias for --json) |
| --workspace-id <id> | Override active workspace for this command (0 = personal) |
| -V, --version | Show CLI version |
| -h, --help | Show help for any command |

For AI Agents — Advanced Usage

For AI agents (Claude Code, Cursor, Codex, etc.), we strongly recommend installing PixVerse Skills — a comprehensive skill library that teaches agents how to use PixVerse CLI correctly with full model constraints, multi-step pipelines, and error handling.

For lightweight discovery, the public repo also includes a compact machine-readable command manifest at capabilities.json; the npm package includes the same file at dist/capabilities.json.

Install via Skills CLI:

npx skills add https://github.com/pixverseai/skills --skill pixverse-ai-image-and-video-generator

Or browse on ClawHub:

https://clawhub.ai/pixverse-official/pixverse-ai-image-and-video-generator

Skills include:

  • Per-model parameter constraints (which models support which modes, quality levels, durations, aspect ratios)
  • End-to-end workflow pipelines (text-to-video, storyboard-to-video, video production, motion control, etc.)
  • Prompt optimization techniques for better generation quality
  • Batch creation patterns and error handling strategies

License

MIT
