Modonome
The autonomous engineering loop that arms only on your command, sends every change through an independent checker, and keeps your tests at full strength.
When armed, it finds tech debt your team keeps deferring and proposes bounded pull requests, with a CI gate that keeps every test assertion intact. Maker, checker, and merge authority are structurally separate, enforced in CI. Off by default, and it runs without a central service.
Website · Quickstart · Adoption guide · Enterprise · Security · Governance · Compliance · Specification · AgentProof
Autonomous coding agents have a predictable failure mode: they weaken gates to go green (removing test assertions, adding skips, loosening type checks). Modonome blocks that in CI: the anti-gaming ratchet runs from a base-branch copy the agent's run does not control, and it rejects diffs that weaken a gate. We published the governed-autonomy spec, and Modonome is the reference implementation for agent gate integrity, scoring 25/25 on AgentProof (hardening against known gaming patterns, not a certificate of full autonomy governance).
Why businesses adopt Modonome
Engineering teams commonly report a large share of capacity going to tech debt work: test gaps, stale dependencies, dead branches, type safety holes, observability gaps. Modonome targets the bounded, provable portion of that backlog (Tier 1 and Tier 2 work). It is off by default and dry-run first. Once an owner arms it, every change is small, test-fenced, independently checked by a separate role, and gated before it can merge. It adopts your existing CI, code owners, and branch rules on day one, and adds no new platform or service.
Support for mainframe, SAP, Oracle, Salesforce, ServiceNow, low-code, and data estates is on the roadmap, not shipped today. See ENTERPRISE.md for the design and docs/CLAIMS-AUDIT-2026-06-25.md for what is enforced now.
Try it in 60 seconds (read-only)
npx modonome dry-run .Modonome reads your repo, detects your stack and gates, and prints the work it would propose. It writes nothing. When you are ready, scaffold the local state files (still disabled and dry-run):
npx modonome scaffold .This prints a preview. Add --write to apply the files:
npx modonome scaffold . --writeSee the walkthrough: one week on a real Node.js app. What the dry-run proposed, what the ratchet blocked, and what the end-of-week report showed. No setup required to read it.
Defaults that stay in your control
- Autonomy stays off until you arm it, through an owner-only step in your CI or environment.
- Auto-merge stays off; a separate merge authority lands changes only when every gate is green.
- Protected paths (CI, secrets, schemas, migrations, lockfiles, auth) wait for owner review.
- Model spend stays opt-in; local or already-paid models come first.
- Cross-repo sharing stays off until you enable it.
How it works
- Adopt. Read the host repo's instructions, CI, code owners, gates, and conventions, then defer to them.
- Dry-run. Propose bounded work as a queue, read-only.
- Make. A maker implements one tightly scoped packet with a failing test as the fence.
- Check. An independent checker, separate from the maker, runs the gates and reviews the diff.
- Gate. Deterministic gates and the anti-gaming ratchet run in CI, outside the agent.
- Owner. Protected paths and new claims wait for a human decision.
- Merge. A separate merge authority, distinct from the author, lands the change only when every gate is green.
- Learn. Real corrections become staged lessons that an owner promotes into durable rules.
Modonome is a prompt and a set of scripts. Running autonomously requires a harness: a coding agent, a CI job, or a human session that loads the prompt.
See ARCHITECTURE.md for the full picture and prompts/modonome.bundle.md for the engine definition.
How it learns and keeps up
When a gate fails, a reviewer corrects the engine, or a change gets reverted, a follower role
captures one generalized, evidence-backed lesson and stages it in .modonome/LEARNINGS.md.
An owner promotes durable lessons into canonical rules, config, or tests, then adds a
deterministic gate when one fits. The queue stays capped, dated, and owner-controlled. The
engine rewrites its own rules only with a human in the loop. Promoted lessons are validated in
CI for full traceability (scripts/check-learning-traceability.mjs) and are queryable with
npm run audit:learnings. A market-researcher role that watches for standards and dependency
shifts is on the roadmap, not yet implemented.
Why is this different from prompting an agent directly?
You can tell an agent to add tests. The agent can also remove assertions to make the tests pass faster. Modonome handles this structurally: the ratchet that catches assertion removal runs in CI from a base-branch copy the agent's run does not control. The arming levers live in environment variables, outside the agent's read scope. A prompt can be overridden by a cleverer prompt; a CI gate that runs outside the agent's write scope holds.
Why it is safe to run
The controls live in code that runs in CI. The anti-gaming ratchet and the house-style
linter run from a trusted base-branch copy; the drift guard, self-application conformance,
work-item validation, learning-traceability, promotion-readiness, and checker-engagement
checks also run in CI, and every enforcing script is protected by CODEOWNERS review. The
arming levers are gated by the MODONOME_ARMED environment variable, enforced at runtime:
with it unset, autonomy_enabled is forced to false no matter what the config file says. The
levers are read from your environment or CI, never from a file the engine can rewrite.
AgentProof proves this with 25 adversarial scenarios: assertion removal, skip injection, type escape, coverage removal, unsafe config combinations, identity collapse, raw code leakage, drift, protected-path bypass, Java and .NET ratchet coverage, prompt injection inertness, state-machine acyclicity, deterministic gate ordering, trust-boundary code loading, audit-trail integrity, model-family distinctness, concurrency safety, gate-dependency DAG validation, evidence secret screening, and resource-exhaustion caps. Modonome scores 25/25. Run it yourself:
node agentproof/runner.mjsRead SECURITY.md, GOVERNANCE.md, and GOVERNED-AUTONOMY-SPEC.md.
Embed it
- Reference: link to the prompt and keep your state local.
- Vendor: copy
prompts/,templates/,schemas/, andscripts/into your repo and pin a release tag. - Package: import the schemas and scripts, keep config and state local.
Upgrades preserve your config. New levers always arrive with safe defaults, so an update leaves an engine disarmed unless an owner arms it. See docs/VERSIONING.md.
Examples
- Demo app walkthrough: one week on a Node.js app. Dry-run, ratchet blocks, merges, and governance report. No setup required to read it.
- examples/node-typescript: Node and TypeScript service with dry-run transcript.
- examples/python-service: Python service with dry-run transcript.
Two products, one repo
Modonome Guard (v0.1, shipped today) is the guardrail layer any team can adopt in minutes:
- Anti-gaming ratchet: blocks assertion removal, skip injection, type escape, coverage removal across JS/TS, Python, Java, .NET
- AgentProof: 25/25 HARDENED adversarial benchmark for gate integrity
- Validators: config, work-item, drift, self-application, evidence, learning traceability
- CLI:
dry-run,scaffold,validate,report
Add just the ratchet to any CI pipeline in one step:
- name: Anti-gaming ratchet
run: node scripts/guard-ratchet.mjsModonome Autonomy (v0.2, roadmap) is the governed maker/checker loop. The machinery is fully
wired (modonome-auto.yml, run-cycle.mjs) and proven end-to-end on the demo app
(examples/demo-app/runs/2026-06-26T11-46-00Z/):
Haiku maker, Sonnet checker, distinct model families, checker approved with one question raised.
It has not yet run in armed mode on a live production repository. That is v0.2.
Alpha limitations (v0.1-alpha)
Modonome is in public alpha. The ratchet, CLI, MCP server, report command, and the CI governance gates (drift, self-application, work-item validation, learning traceability, promotion readiness, checker engagement) are stable and machine-verified. The two-phase maker/checker loop is structurally defined and CI-enforced, but has not yet run in armed mode on a live repository. The following capabilities are on the roadmap but not yet shipped:
| Capability | Status | Planned |
|---|---|---|
| Live armed autonomy run (engine authors and a separate checker reviews on a real repo) | Not yet | v0.2 |
| Cross-repo knowledge network (transport, signing, import) | Design only (ADRs 014-019) | v0.2 |
| Multi-stack support beyond JS/TS, Python, Java, .NET (mainframe, SAP, Oracle, and so on) | Not yet | roadmap |
| Market-researcher role | Not yet | roadmap |
| Cryptographically signed work items (Ed25519) | Not yet | v0.2 |
| OpenTelemetry span emission for governance events | Not yet | v0.3 |
| Before/after tech debt measurement | Not yet | v0.2 |
| Multi-team estate metrics aggregation | Not yet | v0.3 |
State is stored as flat files in .modonome/. This suits single-repo, owner-supervised
runs today; compliance audit trails and multi-team estates arrive with the v0.2 additions.
See ROADMAP.md.
Cost model
Modonome's cost is entirely the LLM API you use. The tool itself is zero-cost (MIT, no telemetry, no service). There is no central service call.
| Run type | Turns | Approximate API cost |
|---|---|---|
| Dry-run sweep (read-only) | 2-4 | $0.01 - $0.05 |
| Tier 1 work item (docs, tests) | 6-10 | $0.05 - $0.20 |
| Tier 2 work item (scripts, schemas) | 10-20 | $0.20 - $1.00 |
| Full autonomous cycle (5 items) | 40-60 | $0.50 - $2.00 |
Figures assume Claude Sonnet pricing at June 2026 rates. Haiku runs Tier 1 items
at roughly one-fifth the cost. Opus is appropriate for security-critical Tier 2
items. See QUICKSTART.md for how to match model tier to work item tier.
If you run modonome via the Claude Code CLI with a Claude Pro or Teams subscription (not an API key), the cost is zero beyond your subscription. VS Code with the Claude Code extension uses the same subscription-based billing.
Local development
npm run verify # drift, style, hygiene, self-application, learning, promotion, work-item,
# and checker-engagement gates, plus tests and AgentProof. No network or secrets..modonome/ in this repo is Modonome's own governance state: its work queue, its promoted
learnings, and a metrics.example.jsonl sample (the live metrics.jsonl is written by the
engine at runtime and is not committed). Adopters should run npx modonome scaffold . --write
to start fresh with their own config and empty state.
License
MIT. See LICENSE.