Licence
MIT
Version
0.1.2
pi-zai-vision
Pi extension providing Z.AI GLM-4.6V vision tools: a native pi integration of the official Z.AI vision MCP server capabilities.
Tools
General Vision
| Tool | Description |
|---|---|
| image_analysis | General-purpose image analysis: describe, identify, extract, answer |
Specialized Image Tools
| Tool | Description |
|---|---|
| extract_text_from_screenshot | OCR text extraction with formatting preservation |
| diagnose_error_screenshot | Error diagnosis with root cause analysis and fix guidance |
| understand_technical_diagram | Architecture/UML/flowchart/ER diagram interpretation |
| analyze_data_visualization | Chart/graph/dashboard analysis: trends, anomalies, recommendations |
| ui_diff_check | Compare two UI screenshots for visual regression testing |
| ui_to_artifact | Convert a UI screenshot to code (HTML/CSS), a spec, a prompt, or a description |
Video
| Tool | Description |
|---|---|
| video_analysis | Video content analysis: scenes, actions, events (MP4/MOV/M4V/AVI/WebM/WMV) |
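The format list above can be turned into a quick pre-flight check before a file is handed to video_analysis. This is a minimal sketch, not the extension's own code: the helper name isSupportedVideo is hypothetical, and matching on the file extension alone is an assumption about how a caller might filter inputs.

```typescript
// Extensions taken from the formats listed for video_analysis.
const VIDEO_EXTENSIONS = new Set(["mp4", "mov", "m4v", "avi", "webm", "wmv"]);

// Hypothetical helper: true when the path ends in one of the listed extensions.
function isSupportedVideo(path: string): boolean {
  // Keep only the final path segment (handles both "/" and "\" separators).
  const fileName = path.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  // No extension, or a dotfile such as ".mp4" with no base name.
  if (dot <= 0) return false;
  return VIDEO_EXTENSIONS.has(fileName.slice(dot + 1).toLowerCase());
}
```

A check like this only catches obviously wrong inputs early; the size limit and the actual container format are still up to the tool and the API.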
Prerequisites
- A Zhipu / Z.AI API key with GLM-4.6V access
- Pi installed (npm install -g @ghoulm370/pi-web)
Quick Start
1. Install
# From local path (development)
pi install /path/to/pi-zai-vision
# From npm (once published)
pi install npm:@ghoulm370/pi-zai-vision
2. Configure
Option 1: config file (recommended), ~/.pi/agent/pi-zai-vision.json
{
  "apiKey": "your-api-key",
  "mode": "ZAI"
}
Option 2: environment variables (take precedence)
export Z_AI_API_KEY="your-api-key"
export Z_AI_MODE="ZAI" # ZHIPU | ZHIPU_CODING | ZAI | ZAI_CODING
Precedence: environment variables > config file > defaults
| Setting | Environment variable | Config file key | Default |
|---|---|---|---|
| API key | Z_AI_API_KEY | apiKey | (required) |
| Platform | Z_AI_MODE | mode | ZAI |
| Model | Z_AI_MODEL | model | glm-4.6v |
| Thinking | Z_AI_THINKING | thinking | true |
| Timeout (seconds) | Z_AI_TIMEOUT | timeoutSecs | 300 |
| Max image size (MB) | Z_AI_MAX_IMAGE_SIZE_MB | maxImageSizeMB | 5 |
| Max video size (MB) | Z_AI_MAX_VIDEO_SIZE_MB | maxVideoSizeMB | 8 |
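The precedence rule (environment variables, then the config file, then defaults) can be sketched as a small resolver. This is an illustration under stated assumptions, not the extension's actual loader: the VisionConfig type and the resolveConfig, envNumber, and envBool helpers are hypothetical names, and parsing Z_AI_THINKING as the strings "true" and "false" is a guess.

```typescript
// Hypothetical shape; field names mirror the config-file keys in the table.
interface VisionConfig {
  apiKey: string;
  mode: string;
  model: string;
  thinking: boolean;
  timeoutSecs: number;
  maxImageSizeMB: number;
  maxVideoSizeMB: number;
}

type Env = Record<string, string | undefined>;

// Defaults as listed in the table; apiKey has none because it is required.
const DEFAULTS: Omit<VisionConfig, "apiKey"> = {
  mode: "ZAI",
  model: "glm-4.6v",
  thinking: true,
  timeoutSecs: 300,
  maxImageSizeMB: 5,
  maxVideoSizeMB: 8,
};

// Parse a numeric variable; unset, empty, or non-numeric values are ignored.
function envNumber(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  const n = Number(value);
  return Number.isFinite(n) ? n : undefined;
}

// Assumption: only "true" and "false" (case-insensitive) are recognised.
function envBool(value: string | undefined): boolean | undefined {
  if (value === undefined) return undefined;
  const v = value.trim().toLowerCase();
  if (v === "true") return true;
  if (v === "false") return false;
  return undefined;
}

// Resolve each setting as: environment variable > config file > default.
function resolveConfig(env: Env, file: Partial<VisionConfig> = {}): VisionConfig {
  const apiKey = env.Z_AI_API_KEY || file.apiKey;
  if (!apiKey) {
    throw new Error("Missing API key: set Z_AI_API_KEY or apiKey in the config file");
  }
  return {
    apiKey,
    mode: env.Z_AI_MODE ?? file.mode ?? DEFAULTS.mode,
    model: env.Z_AI_MODEL ?? file.model ?? DEFAULTS.model,
    thinking: envBool(env.Z_AI_THINKING) ?? file.thinking ?? DEFAULTS.thinking,
    timeoutSecs: envNumber(env.Z_AI_TIMEOUT) ?? file.timeoutSecs ?? DEFAULTS.timeoutSecs,
    maxImageSizeMB:
      envNumber(env.Z_AI_MAX_IMAGE_SIZE_MB) ?? file.maxImageSizeMB ?? DEFAULTS.maxImageSizeMB,
    maxVideoSizeMB:
      envNumber(env.Z_AI_MAX_VIDEO_SIZE_MB) ?? file.maxVideoSizeMB ?? DEFAULTS.maxVideoSizeMB,
  };
}
```

In a Node.js process this would typically be called as resolveConfig(process.env, parsedFile), where parsedFile is the parsed contents of ~/.pi/agent/pi-zai-vision.json.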
3. Use
pi
Then in your pi session:
> Describe this screenshot: image_analysis(imagePath="error.png", prompt="What's wrong?")
> Extract the code from this screenshot: extract_text_from_screenshot(imagePath="code.png", programmingLanguage="typescript")
> Diagnose this error: diagnose_error_screenshot(imagePath="stacktrace.png", context="Running cargo build")
Platform Endpoints
Coding Plan users: use ZHIPU_CODING on Zhipu and ZAI_CODING on Z.AI. For the OpenAI Chat Completion endpoints that Coding Plan supports, see the official documentation.
| Mode | Endpoint | Applies to |
|---|---|---|
| ZHIPU | https://open.bigmodel.cn/api/paas/v4/ | Zhipu (standard / pay-as-you-go) |
| ZHIPU_CODING | https://open.bigmodel.cn/api/coding/paas/v4/ | Zhipu Coding Plan |
| ZAI | https://api.z.ai/api/paas/v4/ | Z.AI (standard / pay-as-you-go) |
| ZAI_CODING | https://api.z.ai/api/coding/paas/v4/ | Z.AI Coding Plan |
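The mode-to-endpoint mapping in the table can be expressed as a lookup. The sketch below is illustrative only: the Mode type and the isMode and chatCompletionsUrl helpers are hypothetical names, and appending chat/completions to the base URL assumes the OpenAI-style Chat Completion path, which the table itself does not spell out.

```typescript
type Mode = "ZHIPU" | "ZHIPU_CODING" | "ZAI" | "ZAI_CODING";

// Base URLs copied from the Platform Endpoints table.
const ENDPOINTS: Record<Mode, string> = {
  ZHIPU: "https://open.bigmodel.cn/api/paas/v4/",
  ZHIPU_CODING: "https://open.bigmodel.cn/api/coding/paas/v4/",
  ZAI: "https://api.z.ai/api/paas/v4/",
  ZAI_CODING: "https://api.z.ai/api/coding/paas/v4/",
};

// Narrow an arbitrary string (for example the value of Z_AI_MODE) to a known mode.
function isMode(value: string): value is Mode {
  return Object.prototype.hasOwnProperty.call(ENDPOINTS, value);
}

// Hypothetical helper: build the request URL for a mode.
// Assumption: the OpenAI-compatible path is "chat/completions" under the base URL.
function chatCompletionsUrl(mode: string): string {
  if (!isMode(mode)) {
    throw new Error(`Unknown mode: ${mode}`);
  }
  // Every base URL in the table ends with "/", so plain concatenation is safe.
  return ENDPOINTS[mode] + "chat/completions";
}
```

Rejecting unknown modes up front gives a clearer error than a failed HTTP request against a malformed URL.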
Architecture
This package ports the official @z_ai/mcp-server vision tools into pi-native custom tools using pi.registerTool(). System prompts are sourced directly from the official MCP server (verified via glm-vision-rs).
Unlike an MCP server (which requires a separate npx process and MCP protocol), pi-native tools:
- Appear in pi's system prompt and tool list automatically
- Support pi's native cancellation signals
- Work in all pi modes: TUI, RPC, print
- Show streaming progress via onUpdate
Development
# Test with pi
pi -e ./extensions/index.ts
# From a local path
pi install ./pi-zai-vision
License
MIT