A Node.js vision-model toolkit: detection/tracking with zone triggers, visual question answering, captioning, and image embeddings (joint and vision-only), importable directly with no HTTP required.