Bitmap-frame context compression for vision-capable LLMs
Sogni SDK - AI image, video & audio generation plus LLM chat with vision via the Sogni Supernet (Stable Diffusion, Flux, WAN 2.2, LTX-2, Seedance, Qwen VLM)
Give every opencode model multimodal capabilities by routing attachments to a fallback multimodal model. Configure everything via the /multimodal command.
Pi package for agent-assisted electronics/PCB image inventory with Cloudflare Workers AI vision and datasheet enrichment.
Claude Code skill: analyze images and videos via Fmode API vision models (api.fmode.cn). Single-pass and multi-pass focused analysis with structured JSON output. Auto-reads token from FMODE_API_TOKEN, ~/.fmode/config.json, project .fmode/config.json, or A
Turn any image into agent-friendly JSON — local macOS OCR & image understanding via Apple's Vision framework. No model, no uploads, no per-call cost.
Four-source epistemic dimensional indexing for the Medicine Wheel Developer Suite — Land, Dream, Code, Vision traversal with cross-dimensional mapping and spiral depth metrics
Pi Agent extension that adds a describe_image tool, letting non-multimodal models delegate image analysis to a vision-capable model (like Qwen VL)
Vision analysis CLI + MCP server backed by Seed 2.0 via Volcano Ark or any OpenAI-compatible endpoint
Universal MCP server giving any MCP client (Claude Desktop, claude.ai, Cursor, Cline, custom apps) native access to Replicate's full catalog: image, video, audio, music, speech, LLM, vision, upscale, inpaint, segment, transcription, embeddings, voice clon
Permissionless communication supercharger MCP server — 40+ Lightning-paid tools: AI phone calls in any language, voice in 602 languages, translation across 119, fax, SMS, transcription, audiobooks, and more. No signup, no API keys, no KYC.
High-performance TensorFlow Lite library for React Native
Vision CLI - Command-line interface for vision analysis with profile-based configuration
Pi extension: Z.AI GLM-4.6V vision tools — image analysis, OCR, error diagnosis, diagram reading, UI diff, UI-to-code, video analysis
n8n community node for SiliconFlow (硅基流动). Zero runtime dependencies. Provides a SiliconFlow action node (Chat / Vision / Embeddings / Image / Rerank / Audio TTS+ASR / Video) and a LangChain-compatible Chat Model node for AI Agents. Installs cleanly witho
CLI for Z.AI capabilities: vision analysis, web search, web reader, and GitHub repo exploration. Patched fork with socket-leak, timeout, retry, and count fixes.
n8n nodes for Zihin AI - Chat Model with Tool Calling, Image Analysis, Audio Transcription, Document Parsing
Deterministic UI extraction, screenshot diffing, and design-language/style transfer for AI coding agents