@nathandevx/excoder
ExCoder
AI coding assistant for any LLM — OpenAI, Gemini, DeepSeek, Ollama, Groq, and 200+ models.
ExCoder is a powerful terminal-based AI coding assistant. Plug in GPT-4o, DeepSeek, Gemini, Llama, Mistral, Groq, or any model that speaks the OpenAI chat completions API. It also supports the ChatGPT Codex backend for codexplan and codexspark.
All tools work — bash, file read/write/edit, grep, glob, agents, tasks, MCP — just powered by whatever model you choose.
Install
Option A: npm (recommended)
npm install -g @nathandevx/excoder
Option B: From source (requires Bun)
# Clone the repo
git clone https://github.com/ExCoder-ai/ExCoder.git
cd ExCoder
# Install dependencies
bun install
# Build
bun run build
# Link globally (optional)
npm link
Option C: Run directly with Bun (no build step)
git clone https://github.com/ExCoder-ai/ExCoder.git
cd ExCoder
bun install
bun run dev
Quick Start
1. Set environment variables
OpenAI:
export EXCODER_USE_OPENAI=1
export OPENAI_API_KEY=sk-your-key-here
export OPENAI_MODEL=gpt-4o
Groq (OpenAI-compatible):
export EXCODER_USE_OPENAI=1
export EXCODER_PROVIDER=groq # sets base URL automatically
export GROQ_API_KEY=gsk-your-key-here
export OPENAI_MODEL=llama-3.3-70b-versatile
# Or explicitly: export OPENAI_BASE_URL=https://api.groq.com/openai/v1
ScrapeGoat (OpenAI-compatible):
export EXCODER_USE_OPENAI=1
export EXCODER_PROVIDER=scrapegoat # sets base URL automatically
export SCRAPEGOAT_API_KEY=sk-your-key-here
export OPENAI_MODEL=spacelabs/scrapegoat-pro-max
# Or explicitly: export OPENAI_BASE_URL=https://scrapegoat.pro/api/v1
Other OpenAI-compatible providers
| Provider | Base URL | Key env |
|---|---|---|
| ScrapeGoat | https://scrapegoat.pro/api/v1 | SCRAPEGOAT_API_KEY |
| OpenRouter | https://openrouter.ai/api/v1 | OPENROUTER_API_KEY |
| DeepSeek | https://api.deepseek.com/v1 | DEEPSEEK_API_KEY |
| Together | https://api.together.xyz/v1 | TOGETHER_API_KEY |
| Mistral | https://api.mistral.ai/v1 | MISTRAL_API_KEY |
| Generic | any /v1 URL | OPENAI_API_KEY or EXCODER_API_KEY |
| Local (Ollama) | http://localhost:11434/v1 | none required |
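As a rough illustration of what "sets base URL automatically" could mean, the sketch below maps an EXCODER_PROVIDER preset to one of the base URLs listed in this README. The helper name `resolveBaseUrl`, the `Env` type, and the rule that an explicit OPENAI_BASE_URL wins over a preset are assumptions made for the example, not ExCoder's actual implementation.

```typescript
// Hypothetical helper: URLs are copied from this README; names and
// precedence are assumptions for illustration only.
type Env = Record<string, string | undefined>;

const PRESET_BASE_URLS = new Map<string, string>([
  ["groq", "https://api.groq.com/openai/v1"],
  ["scrapegoat", "https://scrapegoat.pro/api/v1"],
  ["openrouter", "https://openrouter.ai/api/v1"],
  ["deepseek", "https://api.deepseek.com/v1"],
  ["together", "https://api.together.xyz/v1"],
  ["mistral", "https://api.mistral.ai/v1"],
]);

// Documented default when no base URL is configured.
const DEFAULT_BASE_URL = "https://api.openai.com/v1";

function resolveBaseUrl(env: Env): string {
  // Assumption: an explicit OPENAI_BASE_URL overrides any preset.
  const explicit = env.OPENAI_BASE_URL;
  if (explicit) return explicit;

  const preset = env.EXCODER_PROVIDER;
  const presetUrl = preset ? PRESET_BASE_URLS.get(preset.toLowerCase()) : undefined;
  return presetUrl ?? DEFAULT_BASE_URL;
}
```

Passing the environment in as a plain object keeps the function easy to test; a real caller would presumably hand it the process environment.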
Any OpenAI-compatible API (custom base URL):
export EXCODER_USE_OPENAI=1
export OPENAI_BASE_URL=https://openrouter.ai/api/v1 # provider /v1 URL
export OPENROUTER_API_KEY=sk-or-... # or OPENAI_API_KEY
export OPENAI_MODEL=your-model-id
2. Run it
# If installed via npm
ExCoder
# If built from source
bun run dev
# or after build:
node dist/cli.mjs
That's it. The tool system, streaming, file editing, and multi-step reasoning all work through the model you picked.
Voice (speech-to-text)
Push-to-talk uses batch transcription (not streaming). Configure one of:
- OpenAI-compatible (default): EXCODER_STT_PROVIDER=openai with OPENAI_API_KEY or GROQ_API_KEY (Groq Whisper). Optional: EXCODER_STT_MODEL (e.g. whisper-1, or whisper-large-v3-turbo on Groq).
- Hugging Face: EXCODER_STT_PROVIDER=huggingface, HUGGINGFACE_API_KEY, and EXCODER_STT_MODEL (default nvidia/parakeet-tdt-0.6b-v2).
- Local / self-hosted: EXCODER_STT_PROVIDER=local and EXCODER_STT_ENDPOINT, the base URL of an OpenAI-compatible /v1/audio/transcriptions endpoint.
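The three speech-to-text options above can be read as a small configuration resolver. The following is only a sketch under stated assumptions: `resolveSttConfig` and `SttConfig` are hypothetical names, and raising an error when the local endpoint is missing is a guess at sensible behavior rather than something the README specifies.

```typescript
type Env = Record<string, string | undefined>;

interface SttConfig {
  provider: "openai" | "huggingface" | "local";
  model: string | undefined;        // undefined: leave the choice to the provider
  endpointBase: string | undefined; // only set for the local provider
}

// Hypothetical resolver mirroring the documented variables.
function resolveSttConfig(env: Env): SttConfig {
  const provider = env.EXCODER_STT_PROVIDER || "openai"; // documented default
  const model = env.EXCODER_STT_MODEL || undefined;

  if (provider === "openai") {
    return { provider: "openai", model, endpointBase: undefined };
  }
  if (provider === "huggingface") {
    // Documented default model for Hugging Face.
    return {
      provider: "huggingface",
      model: model ?? "nvidia/parakeet-tdt-0.6b-v2",
      endpointBase: undefined,
    };
  }
  if (provider === "local") {
    const endpointBase = env.EXCODER_STT_ENDPOINT;
    // Assumption: a local provider without an endpoint is a configuration error.
    if (!endpointBase) {
      throw new Error("EXCODER_STT_ENDPOINT is required for the local provider");
    }
    return { provider: "local", model, endpointBase };
  }
  throw new Error(`Unknown EXCODER_STT_PROVIDER: ${provider}`);
}
```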
Text-to-speech hooks are reserved via EXCODER_TTS_* (see src/services/voice/ttsProvider.ts).
Official MCP registry
Override the default registry URL with EXCODER_MCP_REGISTRY_URL if needed (defaults to the bundled commercial MCP catalog endpoint).
Reflex search MCP (auto-enabled)
Reflex ships a stdio MCP server (rfx mcp). ExCoder registers it automatically when the binary is found under previous_version/ or on your PATH.
Build the binary (from this repo):
cd previous_version/reflex && cargo build --release
Or run postinstall (downloads a prebuilt rfx when possible):
node previous_version/scripts/postinstall.mjs
Optional overrides:
- REFLEX_BIN: absolute path to rfx if it is not on PATH
- EXCODER_DISABLE_REFLEX_MCP=1: opt out of auto-registration
Run /doctor to see whether the binary was found and whether a ./.reflex index directory exists in the current project.
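The discovery rules for the Reflex binary might be sketched as follows. This is an illustration only: `findReflexBinary` is a hypothetical name, the candidate paths are supplied by the caller, and the order in which the opt-out, the REFLEX_BIN override, and the candidates are checked is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// `fileExists` is injected so the sketch needs no filesystem imports.
function findReflexBinary(
  env: Env,
  candidates: string[],
  fileExists: (path: string) => boolean,
): string | null {
  // Opt-out flag: skip auto-registration entirely.
  if (env.EXCODER_DISABLE_REFLEX_MCP === "1") return null;

  // Assumption: an explicit REFLEX_BIN is authoritative, with no fallback.
  const override = env.REFLEX_BIN;
  if (override) return fileExists(override) ? override : null;

  // Otherwise take the first candidate that exists, e.g. a build output
  // under previous_version/ followed by PATH entries (caller-provided).
  for (const candidate of candidates) {
    if (fileExists(candidate)) return candidate;
  }
  return null;
}
```

A caller would pass whatever locations it considers plausible; the paths used in any real lookup are not documented here.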
RLMgw context gateway
RLMgw is a local Python sidecar that selects relevant repo files and injects them into upstream LLM requests. The MCP shim (ExCoder_rlmgw_status) is always registered; the Python gateway runs when EXCODER_USE_RLMGW=1.
# Install Python gateway (requires Python 3.11+)
cd previous_version/rlmgw && python3.11 -m pip install -e ".[gw]"
# Enable sidecar routing
export EXCODER_USE_RLMGW=1
export EXCODER_USE_OPENAI=1
MCP tools: rlmgw_status, rlmgw_healthz, rlmgw_readyz, rlmgw_context_latest. Opt out of the MCP shim: EXCODER_DISABLE_BUNDLED_RLMGW_MCP=1.
Kimi WebBridge browser MCP
Browser automation uses Kimi WebBridge (replaces kimi-webbridge). Install the daemon and Chrome extension:
curl -fsSL https://kimi-web-img.moonshot.cn/webbridge/install_skill.sh | bash -s -- -y
When the Kimi WebBridge daemon is installed (~/.kimi-webbridge/bin/kimi-webbridge), ExCoder registers the kimi_webbridge MCP automatically. v1.10+ daemon binaries do not include mcp, so ExCoder falls back to npx -y kimi-webbridge mcp. Opt out: EXCODER_DISABLE_KIMI_WEBBRIDGE_MCP=1.
Provider Examples
OpenAI
export EXCODER_USE_OPENAI=1
export OPENAI_API_KEY=sk-...
export OPENAI_MODEL=gpt-4o
Codex via ChatGPT auth
codexplan maps to GPT-5.4 on the Codex backend with high reasoning.
codexspark maps to GPT-5.3 Codex Spark for faster loops.
If you already use the Codex CLI, ExCoder will read ~/.codex/auth.json
automatically. You can also point it elsewhere with CODEX_AUTH_JSON_PATH or
override the token directly with CODEX_API_KEY.
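One way to picture how these Codex credential sources could be combined is the sketch below. The function name `resolveCodexAuth`, the `CodexAuth` type, and the precedence (token, then explicit path, then CODEX_HOME, then ~/.codex) are assumptions for illustration; the README only states that each source is supported.

```typescript
type Env = Record<string, string | undefined>;

type CodexAuth =
  | { kind: "token"; token: string }
  | { kind: "file"; path: string };

// Hypothetical resolver; `homeDir` is passed in to avoid OS-specific imports.
function resolveCodexAuth(env: Env, homeDir: string): CodexAuth {
  // A direct token override is assumed to take priority.
  const token = env.CODEX_API_KEY;
  if (token) return { kind: "token", token };

  // Next, an explicit path to an auth.json file.
  const explicitPath = env.CODEX_AUTH_JSON_PATH;
  if (explicitPath) return { kind: "file", path: explicitPath };

  // Finally, auth.json inside CODEX_HOME, or ~/.codex by default.
  const codexHome = env.CODEX_HOME || `${homeDir}/.codex`;
  return { kind: "file", path: `${codexHome}/auth.json` };
}
```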
export EXCODER_USE_OPENAI=1
export OPENAI_MODEL=codexplan
# optional if you do not already have ~/.codex/auth.json
export CODEX_API_KEY=...
ExCoder
DeepSeek
export EXCODER_USE_OPENAI=1
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=https://api.deepseek.com/v1
export OPENAI_MODEL=deepseek-chat
Google Gemini (via OpenRouter)
export EXCODER_USE_OPENAI=1
export OPENAI_API_KEY=sk-or-...
export OPENAI_BASE_URL=https://openrouter.ai/api/v1
export OPENAI_MODEL=google/gemini-2.0-flash
Ollama (local, free)
ollama pull llama3.3:70b
export EXCODER_USE_OPENAI=1
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_MODEL=llama3.3:70b
# no API key needed for local models
LM Studio (local)
export EXCODER_USE_OPENAI=1
export OPENAI_BASE_URL=http://localhost:1234/v1
export OPENAI_MODEL=your-model-name
Together AI
export EXCODER_USE_OPENAI=1
export OPENAI_API_KEY=...
export OPENAI_BASE_URL=https://api.together.xyz/v1
export OPENAI_MODEL=meta-llama/Llama-3.3-70B-Instruct-Turbo
Groq
export EXCODER_USE_OPENAI=1
export OPENAI_API_KEY=gsk_...
export OPENAI_BASE_URL=https://api.groq.com/openai/v1
export OPENAI_MODEL=llama-3.3-70b-versatile
Mistral
export EXCODER_USE_OPENAI=1
export OPENAI_API_KEY=...
export OPENAI_BASE_URL=https://api.mistral.ai/v1
export OPENAI_MODEL=mistral-large-latest
Azure OpenAI
export EXCODER_USE_OPENAI=1
export OPENAI_API_KEY=your-azure-key
export OPENAI_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment/v1
export OPENAI_MODEL=gpt-4o
Environment Variables
| Variable | Required | Description |
|---|---|---|
| EXCODER_USE_OPENAI | Yes | Set to 1 to enable the OpenAI-compatible provider shim |
| EXCODER_PROVIDER | No | Preset: groq, scrapegoat, openrouter, spacelabs, deepseek, together, mistral (sets default base URL + key env) |
| OPENAI_API_KEY | Yes* | Generic API key for OpenAI-compatible endpoints |
| GROQ_API_KEY | Groq | Used when EXCODER_PROVIDER=groq or base URL contains groq.com |
| SCRAPEGOAT_API_KEY | ScrapeGoat | Used when EXCODER_PROVIDER=scrapegoat or base URL contains scrapegoat.pro |
| OPENROUTER_API_KEY | OpenRouter | Used when EXCODER_PROVIDER=openrouter or base URL contains openrouter.ai |
| DEEPSEEK_API_KEY | DeepSeek | Used when EXCODER_PROVIDER=deepseek or base URL contains deepseek.com |
| TOGETHER_API_KEY | Together | Used when EXCODER_PROVIDER=together or base URL contains together.xyz |
| MISTRAL_API_KEY | Mistral | Used when EXCODER_PROVIDER=mistral or base URL contains mistral.ai |
| EXCODER_API_KEY | No | Alias for OPENAI_API_KEY on generic OpenAI-compatible providers |
| OPENAI_MODEL | Yes | Model name (e.g. gpt-4o, deepseek-chat, llama3.3:70b) |
| OPENAI_BASE_URL | No | API endpoint (defaults to https://api.openai.com/v1) |
| CODEX_API_KEY | Codex only | Codex/ChatGPT access token override |
| CODEX_AUTH_JSON_PATH | Codex only | Path to a Codex CLI auth.json file |
| CODEX_HOME | Codex only | Alternative Codex home directory (auth.json will be read from here) |
| EXCODER_DISABLE_REFLEX_MCP | No | Set to 1 to skip auto-registration of Reflex MCP (on by default when rfx is found) |
| REFLEX_BIN | No | Absolute path to the rfx binary if it is not on PATH |
| EXCODER_USE_RLMGW | No | Set to 1 to start the RLMgw Python sidecar and route API requests through it |
| EXCODER_RLMGW_DIR | No | Path to the rlmgw Python package directory (default: previous_version/rlmgw) |
| EXCODER_DISABLE_BUNDLED_RLMGW_MCP | No | Set to 1 to skip the bundled RLMgw MCP shim |
| EXCODER_DISABLE_KIMI_WEBBRIDGE_MCP | No | Set to 1 to skip Kimi WebBridge MCP auto-registration |
| KIMI_WEBBRIDGE_BIN | No | Override path to kimi-webbridge binary (default: ~/.kimi-webbridge/bin/kimi-webbridge) |
| MCP_TIMEOUT | No | MCP connect timeout in ms (default 120000 for npx servers); use 180000 if claude-flow cold-starts slowly |
| EXCODER_AUTONOMY_DIR | No | Override path to the autonomy package root (previous_version/) |
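The key-selection rules in the table ("used when EXCODER_PROVIDER=... or base URL contains ...") could be expressed roughly as below. `resolveApiKey` is a hypothetical name, and the fallback from a provider-specific key to OPENAI_API_KEY and then EXCODER_API_KEY is an assumed order, not confirmed behavior.

```typescript
type Env = Record<string, string | undefined>;

// [preset name, base-URL substring, key variable], as listed in the table.
const PROVIDER_KEYS: Array<[string, string, string]> = [
  ["groq", "groq.com", "GROQ_API_KEY"],
  ["scrapegoat", "scrapegoat.pro", "SCRAPEGOAT_API_KEY"],
  ["openrouter", "openrouter.ai", "OPENROUTER_API_KEY"],
  ["deepseek", "deepseek.com", "DEEPSEEK_API_KEY"],
  ["together", "together.xyz", "TOGETHER_API_KEY"],
  ["mistral", "mistral.ai", "MISTRAL_API_KEY"],
];

function resolveApiKey(env: Env): string | undefined {
  const preset = env.EXCODER_PROVIDER || "";
  const baseUrl = env.OPENAI_BASE_URL || "";

  for (const [name, domain, keyVar] of PROVIDER_KEYS) {
    // A provider matches by preset name or by its domain in the base URL.
    if (preset === name || baseUrl.includes(domain)) {
      const key = env[keyVar];
      if (key) return key;
    }
  }
  // Assumed fallback: the generic key first, then its alias.
  return env.OPENAI_API_KEY || env.EXCODER_API_KEY || undefined;
}
```

Returning `undefined` rather than throwing leaves room for local endpoints such as Ollama, which the README says need no key.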
ruv-swarm / better-sqlite3: plugin:claude-flow:ruv-swarm needs native better-sqlite3 bindings. On Node 23+, ExCoder auto-spawns it with Node 20 when found at /usr/local/opt/node@20/bin/node (override with EXCODER_MCP_NODE). If it still fails, rebuild once:
cd ruflo/v2/node_modules/.pnpm/better-sqlite3@*/node_modules/better-sqlite3
rm -rf build/node_gyp_bins build
/usr/local/opt/node@20/bin/node $(npm root -g)/npm/node_modules/node-gyp/bin/node-gyp.js rebuild --release
Or disable ruv-swarm in /mcp (optional server).
You can also use SCRAPEGOAT_MODEL to override the model name. OPENAI_MODEL takes priority.
Runtime Hardening
Use these commands to keep the CLI stable and catch environment mistakes early:
# quick startup sanity check
bun run smoke
# validate provider env + reachability
bun run doctor:runtime
# print machine-readable runtime diagnostics
bun run doctor:runtime:json
# persist a diagnostics report to reports/doctor-runtime.json
bun run doctor:report
# full local hardening check (typecheck + smoke + runtime doctor)
bun run hardening:check
# strict hardening (includes project-wide typecheck)
bun run hardening:strict
Notes:
- doctor:runtime fails fast if EXCODER_USE_OPENAI=1 with a placeholder key (SUA_CHAVE) or a missing key for non-local providers.
- Local providers (for example http://localhost:11434/v1) can run without OPENAI_API_KEY.
- Codex profiles validate CODEX_API_KEY or the Codex CLI auth file and probe POST /responses instead of GET /models.
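The first two notes describe checks that can be sketched in a few lines. This is a simplified illustration, not the actual doctor:runtime code: `checkProviderEnv` and `isLocalBaseUrl` are hypothetical names, only OPENAI_API_KEY is inspected, and treating localhost and 127.0.0.1 as the only "local" hosts is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: "local" means a localhost or 127.0.0.1 base URL.
function isLocalBaseUrl(baseUrl: string): boolean {
  return /^https?:\/\/(localhost|127\.0\.0\.1)(:\d+)?(\/|$)/.test(baseUrl);
}

// Returns a list of problems; an empty list means the checks passed.
function checkProviderEnv(env: Env): string[] {
  const problems: string[] = [];
  if (env.EXCODER_USE_OPENAI !== "1") return problems; // shim disabled, nothing to check

  const baseUrl = env.OPENAI_BASE_URL || "https://api.openai.com/v1";
  const key = env.OPENAI_API_KEY;

  if (key === "SUA_CHAVE") {
    problems.push("OPENAI_API_KEY is still the placeholder SUA_CHAVE");
  } else if (!key && !isLocalBaseUrl(baseUrl)) {
    problems.push("OPENAI_API_KEY is missing for a non-local provider");
  }
  return problems;
}
```

Collecting problems in a list instead of throwing on the first one makes it straightforward to emit either a human-readable or a JSON report.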
Provider Launch Profiles
Use profile launchers to avoid repeated environment setup:
# one-time profile bootstrap (auto-detect ollama, otherwise openai)
bun run profile:init
# codex bootstrap (defaults to codexplan and ~/.codex/auth.json)
bun run profile:codex
# openai bootstrap with explicit key
bun run profile:init -- --provider openai --api-key sk-...
# ollama bootstrap with custom model
bun run profile:init -- --provider ollama --model llama3.1:8b
# codex bootstrap with a fast model alias
bun run profile:init -- --provider codex --model codexspark
# launch using persisted profile (.ExCoder-profile.json)
bun run dev:profile
# codex profile (uses CODEX_API_KEY or ~/.codex/auth.json)
bun run dev:codex
# OpenAI profile (requires OPENAI_API_KEY in your shell)
bun run dev:openai
# Ollama profile (defaults: localhost:11434, llama3.1:8b)
bun run dev:ollama
dev:openai, dev:ollama, and dev:codex run doctor:runtime first and only launch the app if checks pass.
For dev:ollama, make sure Ollama is running locally before launch.
What Works
- All tools: Bash, FileRead, FileWrite, FileEdit, Glob, Grep, WebFetch, WebSearch, Agent, MCP, LSP, NotebookEdit, Tasks
- Streaming: Real-time token streaming
- Tool calling: Multi-step tool chains (the model calls tools, gets results, continues)
- Images: Base64 and URL images passed to vision models
- Slash commands: /commit, /review, /compact, /diff, /doctor, etc.
- Sub-agents: AgentTool spawns sub-agents using the same provider
- Memory: Persistent memory system
What's Different
- No thinking mode: Provider extended-thinking modes may be disabled depending on the model
- No prompt caching: Provider-specific cache headers are skipped when unsupported
- No beta features: Legacy beta headers are ignored for OpenAI-compatible providers
- Token limits: Defaults to 32K max output — some models may cap lower, which is handled gracefully
How It Works
The shim (src/services/api/openaiShim.ts) sits between ExCoder and the LLM API:
ExCoder Tool System
|
v
Internal message format (ScrapeGoat-shaped, duck-typed)
|
v
openaiShim.ts <-- translates formats
|
v
OpenAI Chat Completions API
|
v
Any compatible model
It translates:
- Internal message blocks → OpenAI messages
- tool_use/tool_result → OpenAI function calls
- OpenAI SSE streaming → internal stream events
- System prompt arrays → OpenAI system messages
The model backend is abstracted away from the tool system.
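To make the tool_use/tool_result translation concrete, here is a minimal sketch of the message conversion. The internal block shapes (`Block`, `InternalMessage`) and the function name `toOpenAIMessages` are assumptions for illustration and are not taken from openaiShim.ts; the output follows the OpenAI Chat Completions message format (`tool_calls` on assistant messages, `role: "tool"` for results). System prompts and streaming are left out.

```typescript
// Assumed internal block shapes (duck-typed in the real code).
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface InternalMessage {
  role: "user" | "assistant";
  content: Block[];
}

interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

type OpenAIMessage =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: OpenAIToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string };

function toOpenAIMessages(messages: InternalMessage[]): OpenAIMessage[] {
  const out: OpenAIMessage[] = [];
  for (const msg of messages) {
    // Concatenate all text blocks of the message.
    const texts: string[] = [];
    for (const b of msg.content) if (b.type === "text") texts.push(b.text);
    const text = texts.join("\n");

    if (msg.role === "assistant") {
      // Each tool_use block becomes one function-style tool call.
      const toolCalls: OpenAIToolCall[] = [];
      for (const b of msg.content) {
        if (b.type === "tool_use") {
          toolCalls.push({
            id: b.id,
            type: "function",
            function: { name: b.name, arguments: JSON.stringify(b.input) },
          });
        }
      }
      out.push(
        toolCalls.length > 0
          ? { role: "assistant", content: text || null, tool_calls: toolCalls }
          : { role: "assistant", content: text || null },
      );
    } else {
      // Tool results go first so they directly follow the assistant's tool calls.
      for (const b of msg.content) {
        if (b.type === "tool_result") {
          out.push({ role: "tool", tool_call_id: b.tool_use_id, content: b.content });
        }
      }
      if (text) out.push({ role: "user", content: text });
    }
  }
  return out;
}
```

Note that OpenAI expects tool call arguments as a JSON string, which is why the structured `input` is serialized with `JSON.stringify`.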
Model Quality Notes
Not all models are equal at agentic tool use. Here's a rough guide:
| Model | Tool Calling | Code Quality | Speed |
|---|---|---|---|
| GPT-4o | Excellent | Excellent | Fast |
| DeepSeek-V3 | Great | Great | Fast |
| Gemini 2.0 Flash | Great | Good | Very Fast |
| Llama 3.3 70B | Good | Good | Medium |
| Mistral Large | Good | Good | Fast |
| GPT-4o-mini | Good | Good | Very Fast |
| Qwen 2.5 72B | Good | Good | Medium |
| Smaller models (<7B) | Limited | Limited | Very Fast |
For best results, use models with strong function/tool calling support.
License
MIT License. See LICENSE file for details.