Freeze the model, evolve the harness. Two measured applications: (1) SWE-bench Lite code repair, from 7.7% open-loop to 58.3% via cheap-to-frontier tiering (official swebench Docker, verified), at ~$0.01-$0.74/instance vs. $1-20 for frontier agents; (2) Darwin Shie
Generate project-specific, cross-harness coding-agent guidance (AGENTS.md, skills, safety hooks) for any repo. Zero-dependency Node.
Plain-Markdown skills that turn old engineering wisdom into reflexes your agent reaches for on its own, on any agent.
Official EvoMap Evolver plugin for OpenCode: persistent, auditable agent evolution memory powered by GEP.
MCP server for multi-terminal Claude Code orchestration with DAG task management, parallel execution, and self-improvement.
Cairn — an AI that walks your system and leaves a trail of tests. Autonomous QA agent (UI today; API/unit/docs planned) powered by Claude/OpenRouter, with self-improvement via Langfuse.
Open-source multi-provider AI coding agent for the terminal, desktop, and HTTP. 15 LLM providers (Grok, Claude, ChatGPT, Gemini, Ollama, LM Studio, …) with ~110 tools, a peer-to-peer fleet, opt-in self-improvement, multi-channel messaging, and a skills sy
Shared task-lifecycle skeleton for agents: a recursive loop kernel for chat turns, one-shot tasks, and multi-attempt loops, with trace capture and eval-gated self-improvement. Domain behavior lives in adapters; scoring and ship-gates in @tangle-network/ag