0.1.2 • Published 2h ago

@ghoulm370/pi-zai-vision

Licence: MIT | Version: 0.1.2 | Deps: 0 | Size: 87 kB | Vulns: 0 | Weekly downloads: 0

pi-zai-vision

Pi extension: Z.AI GLM-4.6V vision tools — native pi integration of the official Z.AI vision MCP server capabilities.

Tools

General Vision

| Tool | Description |
| --- | --- |
| image_analysis | General-purpose image analysis: describe, identify, extract, answer |

Specialized Image Tools

| Tool | Description |
| --- | --- |
| extract_text_from_screenshot | OCR text extraction with formatting preservation |
| diagnose_error_screenshot | Error diagnosis with root cause analysis and fix guidance |
| understand_technical_diagram | Architecture/UML/flowchart/ER diagram interpretation |
| analyze_data_visualization | Chart/graph/dashboard analysis: trends, anomalies, recommendations |
| ui_diff_check | Compare two UI screenshots for visual regression testing |
| ui_to_artifact | UI screenshot → code (HTML/CSS), spec, prompt, or description |

Video

| Tool | Description |
| --- | --- |
| video_analysis | Video content analysis: scenes, actions, events (MP4/MOV/M4V/AVI/WebM/WMV) |
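
As a rough illustration of the format list for video_analysis, the sketch below checks a file path against those extensions before any upload is attempted. The function name `isSupportedVideo` and the extension-based check are assumptions made for this example; the package may validate files differently (for instance by MIME type).

```typescript
// Illustrative sketch only: extension-based validation is an assumption,
// not necessarily how pi-zai-vision checks video files.
const VIDEO_EXTENSIONS = new Set(["mp4", "mov", "m4v", "avi", "webm", "wmv"]);

function isSupportedVideo(path: string): boolean {
  const dot = path.lastIndexOf(".");
  // No extension, or a trailing dot, cannot match any known format.
  if (dot < 0 || dot === path.length - 1) return false;
  return VIDEO_EXTENSIONS.has(path.slice(dot + 1).toLowerCase());
}
```

The comparison is case-insensitive, so `demo.MOV` and `demo.mov` are treated the same way.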

Prerequisites

  • A Zhipu / Z.AI API key with GLM-4.6V access
  • Pi installed (npm install -g @ghoulm370/pi-web)

Quick Start

1. Install

# From local path (development)
pi install /path/to/pi-zai-vision

# From npm (once published)
pi install npm:@ghoulm370/pi-zai-vision

2. Configure

Option 1: config file (recommended), ~/.pi/agent/pi-zai-vision.json

{
  "apiKey": "your-api-key",
  "mode": "ZAI"
}

Option 2: environment variables (these take precedence over the config file)

export Z_AI_API_KEY="your-api-key"
export Z_AI_MODE="ZAI"       # ZHIPU | ZHIPU_CODING | ZAI | ZAI_CODING

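Precedence: environment variables > config file > defaults

| Setting | Environment variable | Config file key | Default |
| --- | --- | --- | --- |
| API key | Z_AI_API_KEY | apiKey | (required) |
| Platform | Z_AI_MODE | mode | ZAI |
| Model | Z_AI_MODEL | model | glm-4.6v |
| Thinking | Z_AI_THINKING | thinking | true |
| Timeout (seconds) | Z_AI_TIMEOUT | timeoutSecs | 300 |
| Max image size (MB) | Z_AI_MAX_IMAGE_SIZE_MB | maxImageSizeMB | 5 |
| Max video size (MB) | Z_AI_MAX_VIDEO_SIZE_MB | maxVideoSizeMB | 8 |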
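
The precedence rule (environment variables, then config file, then defaults) can be sketched roughly as follows. This is a minimal illustration, not the package's actual loader: `resolveConfig`, the `VisionConfig` type, and the parsing helpers are hypothetical names, and the handling of malformed values (falling through to the next source) is an assumption.

```typescript
// Illustrative sketch only: names and parsing rules are assumptions,
// not the package's real implementation.
type Mode = "ZHIPU" | "ZHIPU_CODING" | "ZAI" | "ZAI_CODING";

interface VisionConfig {
  apiKey: string;
  mode: Mode;
  model: string;
  thinking: boolean;
  timeoutSecs: number;
  maxImageSizeMB: number;
  maxVideoSizeMB: number;
}

type Env = Record<string, string | undefined>;

// Defaults taken from the settings table; apiKey has no default.
const DEFAULTS: Omit<VisionConfig, "apiKey"> = {
  mode: "ZAI",
  model: "glm-4.6v",
  thinking: true,
  timeoutSecs: 300,
  maxImageSizeMB: 5,
  maxVideoSizeMB: 8,
};

const MODES: readonly Mode[] = ["ZHIPU", "ZHIPU_CODING", "ZAI", "ZAI_CODING"];

function parseMode(raw: string | undefined): Mode | undefined {
  return MODES.find((m) => m === raw);
}

function parseNumber(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return Number.isFinite(n) ? n : undefined;
}

function parseBool(raw: string | undefined): boolean | undefined {
  if (raw === undefined) return undefined;
  const v = raw.trim().toLowerCase();
  if (v === "true" || v === "1") return true;
  if (v === "false" || v === "0") return false;
  return undefined;
}

// Each field is resolved independently: env value, else file value, else default.
function resolveConfig(env: Env, file: Partial<VisionConfig>): VisionConfig {
  const apiKey = env["Z_AI_API_KEY"] || file.apiKey;
  if (!apiKey) throw new Error("An API key is required (Z_AI_API_KEY or apiKey)");
  return {
    apiKey,
    mode: parseMode(env["Z_AI_MODE"]) ?? file.mode ?? DEFAULTS.mode,
    model: env["Z_AI_MODEL"] || file.model || DEFAULTS.model,
    thinking: parseBool(env["Z_AI_THINKING"]) ?? file.thinking ?? DEFAULTS.thinking,
    timeoutSecs: parseNumber(env["Z_AI_TIMEOUT"]) ?? file.timeoutSecs ?? DEFAULTS.timeoutSecs,
    maxImageSizeMB:
      parseNumber(env["Z_AI_MAX_IMAGE_SIZE_MB"]) ?? file.maxImageSizeMB ?? DEFAULTS.maxImageSizeMB,
    maxVideoSizeMB:
      parseNumber(env["Z_AI_MAX_VIDEO_SIZE_MB"]) ?? file.maxVideoSizeMB ?? DEFAULTS.maxVideoSizeMB,
  };
}
```

In a real loader the first argument would typically be `process.env` and the second the parsed contents of ~/.pi/agent/pi-zai-vision.json; both are passed in here so the sketch stays free of file-system access.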

3. Use

pi

Then in your pi session:

> Describe this screenshot: image_analysis(imagePath="error.png", prompt="What's wrong?")

> Extract the code from this screenshot: extract_text_from_screenshot(imagePath="code.png", programmingLanguage="typescript")

> Diagnose this error: diagnose_error_screenshot(imagePath="stacktrace.png", context="Running cargo build")

Platform Endpoints

Coding Plan subscribers: use ZHIPU_CODING on Zhipu and ZAI_CODING on Z.AI. For the OpenAI Chat Completions endpoints that Coding Plan supports, see the official documentation.

| Mode | Endpoint | Applies to |
| --- | --- | --- |
| ZHIPU | https://open.bigmodel.cn/api/paas/v4/ | Zhipu (standard / pay-as-you-go) |
| ZHIPU_CODING | https://open.bigmodel.cn/api/coding/paas/v4/ | Zhipu Coding Plan |
| ZAI | https://api.z.ai/api/paas/v4/ | Z.AI (standard / pay-as-you-go) |
| ZAI_CODING | https://api.z.ai/api/coding/paas/v4/ | Z.AI Coding Plan |
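
The mode-to-endpoint mapping amounts to a lookup table. The sketch below shows one way to express it; the `chatCompletionsUrl` helper and the `chat/completions` path suffix are assumptions based on the OpenAI-compatible convention mentioned above, so check the official documentation before relying on them.

```typescript
// Illustrative sketch only: the base URLs come from the endpoint table;
// the "chat/completions" suffix is an assumed OpenAI-compatible path.
type Mode = "ZHIPU" | "ZHIPU_CODING" | "ZAI" | "ZAI_CODING";

const ENDPOINTS: Record<Mode, string> = {
  ZHIPU: "https://open.bigmodel.cn/api/paas/v4/",
  ZHIPU_CODING: "https://open.bigmodel.cn/api/coding/paas/v4/",
  ZAI: "https://api.z.ai/api/paas/v4/",
  ZAI_CODING: "https://api.z.ai/api/coding/paas/v4/",
};

function chatCompletionsUrl(mode: Mode): string {
  // Every base URL ends with "/", so plain concatenation is enough.
  return ENDPOINTS[mode] + "chat/completions";
}
```

Typing the table as `Record<Mode, string>` makes the compiler flag a missing entry if a new mode is ever added to the union.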

Architecture

This package ports the official @z_ai/mcp-server vision tools into pi-native custom tools using pi.registerTool(). System prompts are sourced directly from the official MCP server (verified via glm-vision-rs).

Unlike an MCP server (which requires a separate npx process and MCP protocol), pi-native tools:

  • Appear in pi's system prompt and tool list automatically
  • Support pi's native cancellation signals
  • Work in all pi modes: TUI, RPC, print
  • Show streaming progress via onUpdate
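
To make the points above concrete, here is a self-contained sketch of the general shape of a natively registered tool: a name, a description, and an execute function that receives a cancellation signal and a progress callback. All types here (`ToolHost`, `ToolDefinition`, `CancelSignal`, `OnUpdate`) are hypothetical stand-ins written for this example; pi's real `registerTool()` signature and types may differ, so treat this as a model of the idea rather than the extension API.

```typescript
// Hypothetical stand-ins: pi's real extension types and signatures may differ.
interface CancelSignal {
  readonly aborted: boolean;
}

type OnUpdate = (message: string) => void;

interface ToolDefinition {
  name: string;
  description: string;
  execute(
    params: Record<string, string>,
    signal: CancelSignal,
    onUpdate: OnUpdate,
  ): Promise<string>;
}

interface ToolHost {
  registerTool(tool: ToolDefinition): void;
}

// Minimal in-memory host so the sketch runs without pi installed.
class InMemoryHost implements ToolHost {
  readonly tools = new Map<string, ToolDefinition>();

  registerTool(tool: ToolDefinition): void {
    if (this.tools.has(tool.name)) throw new Error(`duplicate tool: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }
}

const imageAnalysis: ToolDefinition = {
  name: "image_analysis",
  description: "General-purpose image analysis",
  async execute(params, signal, onUpdate) {
    onUpdate(`Reading ${params["imagePath"]}`);
    // Honor cancellation before doing any expensive work.
    if (signal.aborted) throw new Error("cancelled before the request was sent");
    // A real tool would send the image to the vision model here.
    onUpdate("Analyzing");
    return `stub result for prompt: ${params["prompt"]}`;
  },
};

const host = new InMemoryHost();
host.registerTool(imageAnalysis);
```

Calling `imageAnalysis.execute({ imagePath: "error.png", prompt: "What's wrong?" }, { aborted: false }, console.log)` would log the two progress messages and resolve with the stub result. Because the tool runs in-process, no separate server process or protocol layer sits between the host and the tool.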

Development

# Test with pi
pi -e ./extensions/index.ts

# From a local path
pi install ./pi-zai-vision

License

MIT
