An easy way to run AI models in React Native with ExecuTorch
音字 — realtime speech ⇄ text. Node/Bun bindings for the otoji Rust crate (SenseVoice ASR + multilingual polish + TTS).
Official JavaScript SDK for Transcribe API.
React Native SDK for Deepgram's AI-powered speech-to-text, real-time transcription, and text intelligence APIs. Supports live audio streaming, file transcription, sentiment analysis, and topic detection for iOS and Android.
Runs voice conversations with live transcription, chat orchestration, and spoken replies.
Browser SDK for SpekoAI — real-time voice conversations
Plasius AI functions providing chatbot, text-to-speech, speech-to-text, and AI-generated images and videos
n8n node for AssemblyAI speech-to-text transcription models.
Official Speko TypeScript SDK — one API, every voice provider
Official JavaScript/TypeScript SDK for WIIL Platform - AI-powered conversational services for intelligent customer interactions, voice processing, real-time translation, and business management
RunAPI ElevenLabs MCP server for audio generation: create tasks, poll results, and check pricing across 6 model variants from Claude Code, Codex, Cursor, and VS Code.
React hook for Cheetah Web SDK
Local stdio bridge for the hosted SpekoAI MCP server
PI extension for push-to-talk speech-to-text using the ElevenLabs Scribe API
Permissionless communication supercharger MCP server — 40+ Lightning-paid tools: AI phone calls in any language, voice in 602 languages, translation across 119, fax, SMS, transcription, audiobooks, and more. No signup, no API keys, no KYC.
Picovoice Cheetah Node.js binding
CLI tool to transcribe audio/video files to SRT format using OpenAI Whisper API
LiveKit adapter for Speko — run STT/LLM/TTS routing inside a LiveKit agent worker
Generic word-error-rate evaluation package
JavaScript Web API for Text-to-Speech and Speech-to-Text.
n8n community node for SiliconFlow (硅基流动). Zero runtime dependencies. Provides a SiliconFlow action node (Chat / Vision / Embeddings / Image / Rerank / Audio TTS+ASR / Video) and a LangChain-compatible Chat Model node for AI Agents. Installs cleanly witho
Local MCP server for the Voice API (Chatterbox TTS + Whisper STT). Runs on your machine, reads local audio files, and streams them to the HTTP API — so large files never pass through the model's context.
Self-hostable, OpenAI-compatible Whisper speech-to-text API server you can run anywhere with npx. Local inference via whisper.cpp or ONNX (transformers.js).