self-improvement

@metaharness/darwin
Released: 5d ago
Version: 0.7.0
Freeze the model, evolve the harness. Two measured applications: (1) SWE-bench Lite code-repair — 7.7% open-loop -> 58.3% via cheap->frontier tiering (official swebench Docker, verified), ~$0.01-$0.74/instance vs $1-20 for frontier agents; (2) Darwin Shie
llm cost-optimization llm-optimizer cheap-llm compute-arbitrage agent-harness +6
@okasputra/agent-prime
Released: yesterday
Version: 1.9.0
Generate project-specific, cross-harness coding-agent guidance (AGENTS.md, skills, safety hooks) for any repo. Zero-dependency Node.
agents codex agent-guidance developer-tools skills agent-memory +4
paperthin
Released: yesterday
Version: 0.6.0
Plain-Markdown skills that turn old engineering wisdom into reflexes your agent reaches for on its own — on any agent.
agent-skills agent-agnostic ai-agents anti-slop single-source-of-truth ssot +12
evomap-opencode-plugin
Released: 4d ago
Version: 0.1.0
Official EvoMap Evolver plugin for OpenCode: persistent, auditable agent evolution memory powered by GEP.
opencode opencode-plugin evomap evolver gep agent-memory +1
ninja-terminals
Released: 6d ago
Version: 2.4.9
MCP server for multi-terminal Claude Code orchestration with DAG task management, parallel execution, and self-improvement.
claude claude-code ai terminal orchestrator agents +4
@plune-ai/cairn
Released: 5d ago
Version: 0.5.0
Cairn — an AI that walks your system and leaves a trail of tests. Autonomous QA agent (UI today; API/unit/docs planned) powered by Claude/OpenRouter, with self-improvement via Langfuse.
cairn qa testing test-generation playwright ui-testing +4
@phuetz/code-buddy
Released: 3d ago
Version: 1.6.1
Open-source multi-provider AI coding agent for the terminal, desktop, and HTTP. 15 LLM providers (Grok, Claude, ChatGPT, Gemini, Ollama, LM Studio, …) with ~110 tools, a peer-to-peer fleet, opt-in self-improvement, multi-channel messaging, and a skills sy
cli agent text-editor code-buddy ai coding-assistant +14
@tangle-network/agent-runtime
Released: 2d ago
Version: 0.79.2
Shared task-lifecycle skeleton for agents: a recursive loop kernel for chat turns, one-shot tasks, and multi-attempt loops, with trace capture and eval-gated self-improvement. Domain behavior lives in adapters; scoring and ship-gates in @tangle-network/ag