vwf — Spec → Plan → Execute for Claude Code
vwf is the flagship plugin of the virajp-plugins marketplace — an
opinionated workflow that turns a vague feature request into shipped, reviewed
code through three disciplined phases:
- Spec — keep an always-current blueprint of the whole product.
- Plan — diff the spec against the real code for one slice, and write the delta to apply.
- Execute — implement the plan under strict TDD, then run code review and security review behind approval gates.
You drive it with slash commands. Claude does the work, asks one question at a time, and never writes until you approve.
The same marketplace also ships a handful of
supporting plugins (coding-standard skills + language
servers) and a statusline — all installable through one CLI,
@askviraj/ai-plugins.
Caveats
vwf is deliberately heavyweight. Know what you're signing up for before
adopting it.
Model & cost
- Built for Opus with the 1M context window. Every command runs on
opus, and the orchestrator holds a lot at once — the spec, the plan, the registry, and each subagent's output. Run Claude Code on Opus with the 1-million-token context (claude-opus-4-8[1m]); a smaller model or the standard window will degrade or overflow on a real cycle. - High token cost. opus everywhere, and each
executecycle spawns several subagents (coder, code review, security review) with fix loop-backs. Expect a meaningful spend per slice — this is not a cheap workflow.
Dependencies
- Hard external prerequisites.
rtkandgraphifymust be on yourPATH— thertk hook claudeBash hook fails withoutrtk. Dependency auto-install/enable needs Claude Code ≥ 2.1.143. See Prerequisites. - Memory degrades silently.
vwfrecalls and persists throughmempalace. If it's unavailable, every memory step is skipped by design — no error, but cross-cycle recall is lost (surfaced gaps still survive in the plan doc). - Leans on review engines.
execute's code- and security-review stages run on the/code-reviewand/security-reviewengines.
Fit
- High-touch by default — autonomy is opt-in. The default flow asks one
question at a time and stops at a mandatory approval gate at every stage; plan
for an interactive session.
/vwf:autopilottrades those gates for an unattended end-to-end run of a single plan — it still runs code, code review, and security review per step, but takes the decisions itself and stops only on a hard halt, a resource cap, an all-blocking gap, or an irreversible decision. Reach for it only when the plan is solid and you're ready to review a finished worktree. - Requires a testable project.
executeenforces non-negotiable TDD and a coverage gate. A project without a test runner and coverage tooling won't fit the execute stage. - Assumes a registry-described workspace.
planandexecutemap each slice to a project in the architecture registry and read its code (submodules included). You model the codebase with/vwf:architecturefirst; it won't operate on an ad-hoc folder. - Solo / small-team focus. It is highly opinionated — one workflow, one set of conventions. Great for a solo dev or small team; not a configurable framework for a large org.
Prerequisites
vwf shells out to a few external tools. Install them first — the installer
checks for each and prints the exact command for anything missing.
| Tool | Why | Install |
|---|---|---|
| mise | resolves the toolchain | brew install mise |
| node + pnpm | context7 MCP server; the npm→pnpm hook |
mise use -g node@latest pnpm@latest |
| Claude Code CLI | hosts the commands | mise use -g claude-code@latest |
| rtk | the rtk hook claude Bash hook |
brew install --formulae rtk |
| graphify | knowledge graph the commands rely on | mise use -g pipx:graphifyy@latest |
| uv | runs the mempalace memory server |
mise use -g uv@latest |
vwf also depends on four plugins — context7, markdown, mempalace, and
mise — all resolved from the same virajp-plugins marketplace. Claude Code
auto-installs and auto-enables them when you enable vwf (requires Claude
Code ≥ 2.1.143).
Install
# Installs vwf + its plugin dependencies, and wires up graphify
pnpx @askviraj/ai-plugins --user vwfInstalling outside a git repo works too: graphify install still runs, and its
repo-scoped post-commit hook is skipped automatically (with a note).
Restart Claude Code afterward so the commands, hooks, and dependencies load.
(The examples here use pnpx; if you don't use pnpm, swap in npx.)
The mental model
The three phases map to three questions:
- Spec answers what should the whole product be? — permanent, product-wide, organized by entity.
- Plan answers what changes for this one slice, and in what order? — a diff, not a re-spec, scoped to a single entity or section.
- Execute answers is it built, correct, and safe? — TDD, then review.
flowchart TD
A["1 · /vwf:architecture"] --> B["2 · /vwf:spec"]
B --> C["3 · /vwf:plan"]
C --> D["4 · /vwf:execute<br/>(gated)"]
C e1@-. "autonomous alternative" .-> F["4 · /vwf:autopilot<br/>(one worktree)"]
D --> E["5 · /vwf:archive"]
F e2@-. "you review & land" .-> E
D e3@-. "spec/plan gaps" .-> B
D e4@-. "spec/plan gaps" .-> C
F e5@-. "spec/plan gaps" .-> B
F e6@-. "spec/plan gaps" .-> C
e1@{ animate: true }
e2@{ animate: true }
e3@{ animate: true }
e4@{ animate: true }
e5@{ animate: true }
e6@{ animate: true }
architecture runs once to bootstrap; then you loop
spec → plan → execute → archive per slice — or swap execute for autopilot
to run the approved plan unattended (it commits into a dedicated worktree for
you to review and land, and never archives). Either way, when execution exposes
a hole in the spec or plan, vwf captures it and loops back to fix the source —
never silently working around it.
The documents it maintains
vwf keeps everything in version-controlled Markdown under docs/. The spec is
the desired state; the plans are the changes you apply to reach it.
docs/
├── specs/ # the always-current blueprint (desired state)
│ ├── architecture.md # system shape + machine-readable Project Registry
│ ├── conventions.md # cross-cutting decisions (auth, errors, …)
│ └── <entity>.md # one doc per entity (or an <entity>/ folder)
└── plans/ # per-cycle plans (the diff to apply)
├── <date>-<time>-<slice>.md
└── archived/ # retired, completed plans
Each entity doc holds the full-stack picture for that entity — stable product
intent at the top, volatile engineering detail (data model, API, jobs, screens)
below a marker. The Project Registry in architecture.md is a yaml block
that spec and plan parse to map an entity's sections to the right project by
type — so the workflow stays stack-agnostic.
Commands
| Command | Model | What it does |
|---|---|---|
/vwf:architecture |
opus | Bootstrap or update the system shape + Project Registry |
/vwf:spec [entity] |
opus | Maintain the full-product blueprint, one doc per entity |
/vwf:plan [slice] |
opus | Write a reviewable cycle plan — a diff of spec vs code |
/vwf:execute [mode] |
opus | Implement the plan under TDD, then code + security review |
/vwf:autopilot [plan] |
opus | Autonomously run one plan end to end — no per-stage gates |
/vwf:archive [plan] |
sonnet | Retire a completed plan into docs/plans/archived/ |
/vwf:handoff <name> |
opus | Capture the session so work resumes in a fresh one |
/vwf:recall <name> |
sonnet | Resume from a handoff in a fresh session |
/vwf:git-workflow |
— | Internal — worktree isolation, commits, merges |
/vwf:architecture
Run this first. It elicits your system's shape — projects, their types and
stacks, how they interconnect, where they deploy — and writes
docs/specs/architecture.md, including the machine-readable Project Registry
the other commands depend on. Re-run it any time the topology changes; it asks
only about genuine deltas, never re-eliciting what's confirmed.
This is the one doc that does name technologies and infrastructure — the spec deliberately doesn't.
/vwf:spec
Maintain the desired end state of the whole product, one entity at a time:
/vwf:spec order
spec reads the registry, works out which engineering surfaces apply to the
entity (data model, API, jobs, screens), and elicits the gaps with you. It then
writes docs/specs/order.md and updates conventions.md for any cross-cutting
decision raised.
A fresh reviewer subagent then checks the doc against a completeness
checklist and returns NO GAPS or a numbered list. Gaps loop back to you for
the specific open decisions, then re-review — until the doc passes. The spec is
permanent and product-wide; it is never feature-scoped.
/vwf:plan
Produce a reviewable plan for one slice of the spec:
/vwf:plan order
/vwf:plan order/api # just one section of the entity
A plan is a diff. plan reads the desired state (the spec slice +
conventions + registry) and the actual state (the real code the registry maps
the slice to), then writes only the delta — what exists, what's missing, what
changes, and the order to do it in — to docs/plans/<date>-<time>-<slice>.md.
Steps are ordered for TDD: each names the failing test that defines "done". If
the spec implies a surface the code lacks, plan flags it as drift rather than
quietly resolving it. You approve the plan before any code is written.
/vwf:execute
Implement an approved plan. Execution is mechanical from the plan, but strict:
/vwf:execute # full pipeline from the start
/vwf:execute review # jump to a stage (the prior stage must be complete)
It runs three stages, each in a fresh purpose-built subagent, each behind a mandatory approval gate:
| Stage | Model | What happens |
|---|---|---|
| code | sonnet | Implements the plan under TDD (RED → GREEN → REFACTOR) to the coverage gate |
| review | opus | Adversarial code review against the plan, spec, conventions, and stack |
| security | opus | Threat-models the change against the project's declared capabilities |
flowchart TD
C["code — TDD, coverage gate"] --> G1{"approve"}
G1 -->|yes| R["review — code review"]
G1 -->|findings| C
R --> G2{"approve"}
G2 -->|yes| S["security — security review"]
G2 -->|findings| C
S --> G3{"approve"}
G3 -->|yes| RC["reconcile + merge"]
G3 -->|findings| C
vwf never chains stages automatically — it pauses for your approval at every
gate. Review and security findings loop back to code to fix, then re-review.
When a stage exposes a gap (a hole in the spec or plan, not a code bug),
it's recorded — to the plan doc and to memory — and reconciled at the end of the
cycle, where vwf offers to fix the spec (/vwf:spec) or re-derive the plan
(/vwf:plan). It then reconciles the architecture registry and offers to
archive the plan.
/vwf:autopilot
The autonomous counterpart to /vwf:execute. It runs one approved plan to
completion without the per-stage approval gates — taking those decisions from a
fixed rule set and stopping only at a few defined pause points.
/vwf:autopilot # the single active plan
/vwf:autopilot 2026-06-26-1430-order.md
What it does, by rule:
- One plan, one worktree. Isolates all work in a dedicated git worktree and commits every step itself. It never merges, pushes, or archives — you get a finished worktree to review and land.
- Whole plan, dependencies first. Implements every step, ordered so prerequisites land before dependents.
- Full pipeline each step.
code → review → security, looping findings back to code. Security findings are always fixed; code-review findings loop up to 4 rounds, after which any residual is recorded as a gap — the spec/plan wasn't thorough enough. - Gaps don't stop it. Each gap is written to
docs/plans/<plan>.gap-report.mdand to memory, and the run continues.
It pauses only on: a hard halt (no plan/spec, a test harness that can't run,
an unresolvable git conflict); a resource cap — context > 65%, 5-hour > 90%,
or 7-day > 80% — where it hands off and stops (resume with /vwf:recall); a gap
that blocks all remaining work; or a decision the rules don't cover that is
irreversible. At the end, if any gaps remain it asks you to resolve them — and
never suggests archiving.
The resource-cap pause is delivered by the
statusline caps hook — a command can't measure its own
context window, so install the statusline (--statusline) before an autonomous
run or that pause won't fire.
Autopilot is the sharp tool: point it at a plan you trust, and review the worktree before you merge.
/vwf:archive
Move a finished plan out of the active set into docs/plans/archived/. It never
deletes. Run it manually, or accept the offer at the end of execute.
/vwf:archive
/vwf:handoff and /vwf:recall
Long sessions lose fidelity. When the context window grows beyond ~60%, capture the session so a fresh one can continue:
/vwf:handoff auth-refactor # write a handoff, file it to memory
handoff first tidies the tree — it checkpoints pending work everywhere
(the outer repo and any submodules) as wip: commits, updates any submodule
pointers in the outer repo, and removes only fully-merged worktrees (never one
with unmerged work). It does not push. Then it writes a structured handoff
document — goal, current state, key decisions, open next steps, and (when
there's a clear next action) a ready-to-paste next prompt — and stores it in
mempalace under your project. In a new session:
/vwf:recall auth-refactor # rebuild context, then optionally run the next prompt
recall retrieves the handoff, reads the files it points to, summarizes where
you left off, and offers to run the captured next prompt. If mempalace is
unavailable, handoff falls back to docs/handoffs/<name>.md and recall
reads it from there.
/vwf:git-workflow
Internal — you rarely invoke it directly. The other commands route all git
actions through it: it isolates work in a git worktree (always the outermost
superproject, never a submodule), initializes it with the repo's worktree:init
(or setup:all) mise task, commits with conventional messages, and ends a
worktree with full coverage — landing the branch (plus any submodule work and
pointer updates), then removing it. It never pushes without your explicit
request.
How it asks questions
vwf is deliberately conversational. architecture, spec, and plan share
one elicitation protocol:
- Explore first — read the docs, code, and recent commits before asking anything; never ask what the registry or code already answers.
- One decision per round — multiple-choice with an "Other" escape hatch; each answer shapes the next question.
- Only real decisions — if exactly one idiomatic answer exists, it proceeds without asking. It never guesses an open decision — it records it instead.
- Propose 2–3 approaches — with trade-offs and a recommendation, before settling a direction.
- Hard gate — it presents the shape and waits for your approval before writing anything, however small the change looks.
Memory
vwf uses the mempalace plugin as cross-session memory so each cycle builds
on the last instead of re-deriving it. It recalls prior decisions and findings
before working, and persists durable outcomes after. Memory is keyed by your
project (the wing) and split into rooms:
| Room | Holds |
|---|---|
decisions |
design/architecture decisions and the why |
problems |
review and security findings and how they were resolved |
planning |
plan rationale and deferred options |
gaps |
spec/plan holes surfaced during execution, and their fixes |
handoff |
session handoffs for /vwf:handoff and /vwf:recall |
Memory is best-effort: if mempalace is unavailable, vwf skips every memory
step and proceeds. Gaps are also mirrored into the plan doc, so they survive a
memory outage. See docs/mempalace.md.
A worked walkthrough
A first slice, end to end. Assume a backend service with an order entity.
# 1. Bootstrap the system shape and registry (once per workspace)
/vwf:architecture
# 2. Specify the order entity — answer the questions, approve the doc
/vwf:spec order
# → writes docs/specs/order.md, gated by the completeness reviewer
# 3. Plan the first slice — review the diff, approve it
/vwf:plan order
# → writes docs/plans/2026-06-26-1430-order.md (TDD-ordered steps)
# 4. Execute — approve at each gate
/vwf:execute
# → code (TDD) → [approve] → review → [approve] → security → [approve]
# → reconcile registry + any gaps → merge via git-workflow
# 5. Archive the completed plan
/vwf:archive
From here, loop steps 2–5 per slice. Update the spec when the product changes;
re-run architecture only when the system's shape changes.
vwf skills
Two skills back the workflow's quality. You don't invoke them directly — they inform how Claude writes and reviews:
karpathy-guidelines— behavioral rules that reduce common LLM coding mistakes: think before coding, keep changes surgical, surface assumptions, define verifiable success criteria.rest-api-design— technology-agnostic REST API principles (versioning, error formats, pagination, auth, OpenAPI), applied whenever the spec or plan touches an API surface.
Tips
- Run
architecturefirst.specandplanhalt without a registry. - Keep slices small. One entity or one section per plan/execute cycle keeps reviews sharp and the diff reviewable.
- Trust the gates. Read what each stage reports before approving — the approval is the point, not a formality.
- Hand off early. A handoff written at 60% context is worth far more than one squeezed out at 95%.
Supporting plugins
The marketplace ships additional plugins — opinionated coding-standard skills and language servers. Most auto-apply by file path; install only the ones for your stack. Each has a dedicated guide:
| Plugin | What it provides | Install |
|---|---|---|
| markdown | Always-on Markdown/documentation standards (auto-applies to **/*.md) |
--user markdown |
| typescript | Effect-TS coding standards (7 skills) + the TypeScript/JavaScript language server | --user typescript |
| flutter | Flutter/Dart (GetX) standards (8 skills) + bundled Dart/Kotlin/Swift language servers; project-scoped | --project flutter |
| mise | mise standards (the .config/ three-file split + task library) + a /mise:scaffold command |
--user mise |
| context7 | The Context7 MCP server — up-to-date library docs on demand | --user context7 |
| mempalace | AI memory system (external; also a vwf dependency) |
--user mempalace |
pnpx @askviraj/ai-plugins --user typescript --user markdownStatusline
A standalone, powerline-style statusline (main two-line bar + subagent panel),
fully data-driven from JSON and themeable across three config layers (defaults →
~/.config/statusline.json → <repo-root>/.config/statusline.json). It
installs through the same CLI — not the plugin marketplace — copying the script
to ~/.claude/scripts/ and writing the chosen key(s) into
~/.claude/settings.json. Requires a Nerd Font.
# install the statusline (both the main bar and the subagent panel)
pnpx @askviraj/ai-plugins --statuslineInstalling the statusline (--statusline) also wires a context & rate-limit
caps hook — it pauses long /vwf:autopilot runs at budget thresholds (context
over 65%, 5-hour over 90%, 7-day over 80%) by triggering a handoff.
See docs/statusline.md for setup and the full configuration reference.
The installer CLI
@askviraj/ai-plugins
drives the Claude Code CLI: it adds the virajp-plugins marketplace
(user-scoped) and installs each plugin at its scope. It only ever registers and
refreshes virajp-plugins — every plugin resolves from it alone.
# Everything: all user-scoped plugins + the statusline
pnpx @askviraj/ai-plugins --all --statusline
# Just the user-scoped plugins (no statusline)
pnpx @askviraj/ai-plugins --all
# Named plugins, at user or project scope (flutter is project-scoped)
pnpx @askviraj/ai-plugins --user vwf --project flutter
# Versions: CLI, statusline, and each plugin's installed-vs-latest (with scope)
pnpx @askviraj/ai-plugins --version
# Upgrade installed plugins + refresh the statusline
pnpx @askviraj/ai-plugins --upgrade
# Idempotent install + upgrade — safe to drop in a setup script
pnpx @askviraj/ai-plugins --all --statusline --upgrade
# Uninstall (mirrors the install flags)
pnpx @askviraj/ai-plugins --uninstall --user vwf
pnpx @askviraj/ai-plugins --uninstall --all --statuslineNotes:
--allacts on user-scoped plugins only.flutteris project-scoped — install it explicitly with--project flutterfrom within the project that needs it.- Scope is chosen by the flag:
--user <name>installs at user scope,--project <name>at project scope (you can mix both in one run). The marketplace add is always user-scoped. - The installer checks every required external tool for what you're installing and prints the install command for anything missing — it never installs a dependency for you.
Credits & acknowledgements
This project is a thin layer over a lot of excellent work. It would not exist — or would be far poorer — without these. Thank you to their authors and maintainers.
- Claude Code by Anthropic — the host these plugins, hooks, and statusline plug into.
- MemPalace — the AI memory system
that powers
vwf's cross-session recall (re-listed here as a dependency). - Context7 by
Upstash — the MCP docs server behind the
context7plugin. - mise by Jeff Dickey — resolves the toolchain the plugins and hooks depend on.
- pnpm — the package manager the
npm→pnpmhook andcontext7rely on. - typescript-language-server, the Dart SDK, kotlin-lsp, and SourceKit-LSP — the engines behind the language-server plugins.
- rtk (Rust Token Killer) — the
token-saving proxy
vwf's Bash hook shells out to (installed viabrew install --formulae rtk). - graphify — the knowledge-graph
tool
vwfintegrates with. - oclif — the framework this installer CLI is built on.
- Nerd Fonts — the glyphs that make the statusline render, and the Gruvbox palette it ships by default.