@davesheffer/hunch

Licence: Apache-2.0
Version: 0.38.3
Dependencies: 6
Size: 517 kB
Vulnerabilities: 0
Weekly downloads: 5.7K

Hunch — Architectural Conformance for AI code


A linter checks whether code matches a pattern. Hunch checks whether code still matches your architecture — and blocks the AI change that breaks it, citing the decision and the past bug it would reopen. The semantic invariants pattern-SAST can't express (layering, must-reach, dependency direction), enforced deterministically over a git-native graph of why — across any MCP assistant.

npm i -g @davesheffer/hunch
cd your-repo && hunch init

# record an architectural invariant — the kind Semgrep/SonarQube structurally can't express
hunch conform --add "controllers never reach the DB directly — go through the service layer" \
  --assert not-calls --subject listOrders --object dbQuery --why "the Mar-2025 N+1 meltdown"

hunch conform --strict     # ✅/⛔ deterministic gate — wire into CI; runs on every AI change

An AI "optimizes" the controller to query the DB directly. Semgrep: green. SonarQube: green. (It's a legitimate internal import, so there is no bad pattern to match.) Hunch: BLOCKED. "listOrders now reaches dbQuery — VIOLATED · why: the Mar-2025 N+1 meltdown · prevents recurrence of bug_0317." See demo/architectural-conformance.sh.
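To make the idea concrete, here is a minimal sketch of what a not-calls assertion could check. It is not Hunch's implementation: the CallGraph shape, the checkNotCalls helper, and the orderService node are all hypothetical, and the sketch assumes that only a direct call from subject to object counts as a violation (the source does not say whether transitive calls count).

```typescript
// Hypothetical call graph: each caller maps to the functions it calls directly.
type CallGraph = { [caller: string]: string[] | undefined };

interface NotCallsInvariant {
  subject: string; // the function that must not make the call
  object: string;  // the function it must not call
  why: string;     // the recorded reason, echoed in the verdict
}

// Returns a violation message, or null when the invariant holds.
// Assumption: only direct edges count, so going through a service passes.
function checkNotCalls(graph: CallGraph, inv: NotCallsInvariant): string | null {
  const callees = graph[inv.subject] || [];
  if (callees.indexOf(inv.object) === -1) return null;
  return inv.subject + " now reaches " + inv.object + " (VIOLATED; why: " + inv.why + ")";
}

const invariant: NotCallsInvariant = {
  subject: "listOrders",
  object: "dbQuery",
  why: "the Mar-2025 N+1 meltdown",
};

// Before the AI change: the controller goes through the (hypothetical) service.
const graphBefore: CallGraph = { listOrders: ["orderService"], orderService: ["dbQuery"] };
// After the AI "optimization": the controller queries the DB itself.
const graphAfter: CallGraph = { listOrders: ["dbQuery"], orderService: ["dbQuery"] };

const beforeResult = checkNotCalls(graphBefore, invariant); // null: invariant holds
const afterResult = checkNotCalls(graphAfter, invariant);   // a violation message
console.log(beforeResult, afterResult);
```

The check is a plain graph lookup with no model involved, which is what makes the same input always produce the same verdict.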

It works both ways — prevent and catch — and you need both:

  • Prevent — in a reproducible benchmark (bench/: n=90, Haiku/Sonnet/Opus, 3 invariant classes), the recorded invariant in context cut architectural violations 58% → 16% overall (Sonnet 67% → 0%). But prevention is necessary, not sufficient: even Opus ignored a layering rule 60% of the time when told. Each violation passes a linter clean.
  • Catch — which is exactly why the deterministic gate exists. hunch check --strict (the pre-commit hook + the hunch ci PR gate) blocks what the model ignores — with the receipt, no model in the gate. Injection helps; the gate is the guarantee.

Works with Claude Code, Cursor, Copilot, Windsurf & Google Antigravity from one shared, git-native graph.

Read the full documentation → hunch-pi.vercel.app/docs

The docs site is the complete reference — setup, every CLI command and MCP tool, the guards, troubleshooting. This README is the tour. Jump to: Install · MCP setup · Firmness · CLI reference · Troubleshooting

The problem

Every AI coding session starts from zero. The model re-reads your code, re-guesses the intent, and happily "fixes" the thing you deliberately did last month — because the reasoning behind the code lives in PRs, Slack, and people's heads, not in the repo.

Hunch captures that reasoning as a byproduct of normal work — commits and test failures — stores it as a git-tracked graph next to your code, and feeds it back to Claude Code so every session is grounded in the decisions, bugs, and invariants that came before. Local-first, no documentation toil, no SaaS.

How it works

   commit / test failure            .hunch/  (git-tracked JSON)              Claude Code
 ┌───────────────────────┐         ┌────────────────────────────┐         ┌───────────────┐
 │ post-commit hook   ───┼────────▶│ Decisions   (why a change) │────────▶│ MCP tools     │
 │ record-bug         ───┼────────▶│ Bugs        (root causes)  │  read   │ /hunch-* cmds │
 │ structured diff +     │  write  │ Constraints (invariants)   │◀────────│ CLAUDE.md     │
 │ Claude (or heuristic) │         │ Components / Symbols/Edges │         │ CLI           │
 └───────────────────────┘         └────────────────────────────┘         └───────────────┘
  • Index (no LLM): Hunch maps your repo — how functions, files, and components connect — so it can see the ripple effect of any change.
  • Learn: each commit becomes a structured Decision (an ADR); a failing test becomes a Bug with its likely cause; recurring or severe bugs are promoted into Constraints (do-not-break rules) and flag the riskiest parts of the code.
  • Ground: any MCP assistant reads it through an MCP server, an auto-maintained CLAUDE.md, and slash commands — every answer cites provenance (source + confidence + evidence), so nothing is a blind assertion.
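The Learn step above can be sketched as a small promotion rule. Everything here is an assumption made for illustration: the record shapes, the field names, the con_ id derivation, and the occurrence threshold are hypothetical and not Hunch's schema or policy. Only the general idea comes from the text: recurring or severe bugs become constraints, and each record carries provenance (source, confidence, evidence).

```typescript
// Hypothetical record shapes; field names are illustrative, not Hunch's schema.
interface Provenance {
  source: string;     // where the record came from, e.g. "test-failure"
  confidence: number; // 0..1
  evidence: string[]; // pointers a reader can check
}

interface BugRecord {
  id: string;
  summary: string;
  severity: "low" | "medium" | "high";
  occurrences: number;
  provenance: Provenance;
}

interface ConstraintRecord {
  id: string;
  rule: string;
  guardsAgainst: string; // id of the bug this rule exists to prevent
  provenance: Provenance;
}

// Assumed policy: promote a bug when it recurs often enough or is severe.
function promote(bugs: BugRecord[], minOccurrences: number): ConstraintRecord[] {
  return bugs
    .filter((bug) => bug.occurrences >= minOccurrences || bug.severity === "high")
    .map((bug) => ({
      id: "con_" + bug.id,
      rule: "do not reintroduce: " + bug.summary,
      guardsAgainst: bug.id,
      provenance: bug.provenance, // the constraint inherits the bug's evidence
    }));
}

const sampleBugs: BugRecord[] = [
  { id: "bug_a", summary: "recurring timeout", severity: "medium", occurrences: 3,
    provenance: { source: "test-failure", confidence: 0.7, evidence: ["failing test output"] } },
  { id: "bug_b", summary: "one-off typo", severity: "low", occurrences: 1,
    provenance: { source: "test-failure", confidence: 0.5, evidence: ["failing test output"] } },
  { id: "bug_c", summary: "data loss on save", severity: "high", occurrences: 1,
    provenance: { source: "test-failure", confidence: 0.9, evidence: ["failing test output"] } },
];

const promotedIds = promote(sampleBugs, 2).map((c) => c.id);
console.log(promotedIds); // bug_a recurs, bug_c is severe, bug_b stays a plain bug
```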

→ Concepts in depth: the reasoning graph · provenance · time-travel

Why Hunch is different

"Memory for coding agents" is getting crowded, but most of it is a server-side, ephemeral, single-vendor RAG cache over your current code. Hunch is the opposite on every axis — and that combination is the moat:

|  | Typical agent memory | Hunch |
| --- | --- | --- |
| Storage | server-side / a vendor's cloud | git-tracked JSON in your repo: diff it, review it in PRs, sync it over git push |
| Lifetime | the session; often auto-expiring | the lifetime of the codebase; non-destructive supersede/veto keeps the why-it-changed trail |
| Clients | one vendor's agent | client-agnostic: one .hunch/ graph serves Claude Code, Cursor, Copilot & Windsurf via MCP |
| What's stored | opaque extracted "facts" | structured ADRs: decisions with rejected alternatives, bug lineage, and invariants |
| Enforcement | advisory / just-in-time hints | fail-closed guards: no model in the block path; a commit fails only on a rule you've vouched for |
| Trust | take it on faith | provenance on every record (source + confidence + evidence) and a measurable retrieval signal (hunch eval) |

The short version: git tracks what changed; Hunch tracks why — locally, durably, and under your control, with guards that actually hold the line instead of just suggesting.

Getting started

npm install -g @davesheffer/hunch   # Node ≥ 20; puts `hunch` on your PATH
cd your-repo
hunch init                          # scaffold .hunch/, index, install hooks, wire up assistants
hunch backfill --since 90d          # cold start: seed decisions from recent git history
hunch why src/auth/session.ts       # …then ask your assistant: "why is X built this way?"

hunch init scaffolds .hunch/, indexes the repo, installs the git hooks, writes .mcp.json + slash commands + an auto-maintained CLAUDE.md, and wires up every detected assistant (Claude Code, Cursor, VS Code/Copilot, Windsurf, Codex, Google Antigravity) to the same graph — merging idempotently into existing files. Reload your assistant in the repo afterward to pick up the hunch_* tools. Each teammate runs hunch init once; the .hunch/ content is shared via git.

Synthesis is billed to your coding-assistant subscription (Claude/Codex/Cursor CLI), never a pay-per-token API key — and falls back to a deterministic heuristic if no CLI is present. Details: Synthesis & billing.

Deep Synthesis (backfill --deep / sync --deep) gathers several independent takes on a change and reconciles them into one more trustworthy note, trusting it more when they agree. Add --verify to fact-check the note against the commit and drop anything the commit doesn't support. The note stays advisory until you confirm it. Deep Synthesis is subscription-only and falls back to a single draft when only one assistant is available.

On Windows, prefer hunch init over a global claude mcp add; if the tools don't appear, run hunch doctor to repair the setup.

Full walkthrough → Getting started · MCP & assistants · MCP tools · slash commands · the full CLI reference

Enforcement: memory that holds the line

Hunch isn't just recall; it's a set of guards that stop the AI (and you) from undoing intentional design. All ride the same rails: the pre-edit hook, hunch check, the hunch_merge_verdict MCP tool, and the CI Constraint Guard. How hard they push is one committed knob: firmness (off / advisory / firm / strict).

Never Twice — corrections become enforced invariants

You tell the agent "no, never call the pay-per-token API here," it complies once, and next session it does it again — because the feedback was stored as advisory text. Hunch closes that loop: a correction is captured as a first-class Constraint (human_confirmed) via hunch_record_correction, and from then on the same hook + CI guard hold every assistant to it. → docs

Causal Merge Verdict — does this change re-open a closed bug?

A diff-only reviewer sees what changed; it can't see that the line you're deleting is the fix for an incident. Hunch can — hunch_merge_verdict replays a diff against the graph and returns a cited BLOCK / WARN / PASS:

VERDICT: ⛔ BLOCK — this change breaks a recorded invariant or re-opens a known bug.

⛔ pay() must verify the session before charging — con_pay
   🧠 why: "Charge must verify the session first" (dec_pay)
   🐞 guards against: Double-charge on unverified session (bug_…)

No model in the loop, so it's safe as a merge gate — it blocks only on a high-confidence rule you've confirmed, and warns on everything softer. → docs
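The blocking rule described above (block only on a confirmed, high-confidence rule; warn on anything softer) can be sketched as a pure function. The Finding shape and the 0.8 cut-off are assumptions for the example; the source does not document the actual threshold or data model.

```typescript
type Verdict = "BLOCK" | "WARN" | "PASS";

// Hypothetical finding: a recorded rule that the diff appears to break.
interface Finding {
  constraintId: string;
  humanConfirmed: boolean; // has a person vouched for this rule?
  confidence: number;      // 0..1
}

// Assumed cut-off for "high confidence"; the real value is not documented here.
const HIGH_CONFIDENCE = 0.8;

function mergeVerdict(findings: Finding[]): Verdict {
  if (findings.length === 0) return "PASS";
  const blocking = findings.some(
    (f) => f.humanConfirmed && f.confidence >= HIGH_CONFIDENCE
  );
  // Anything that is unconfirmed or low-confidence only warns.
  return blocking ? "BLOCK" : "WARN";
}

const confirmedRule: Finding = { constraintId: "con_pay", humanConfirmed: true, confidence: 0.95 };
const softRule: Finding = { constraintId: "con_soft", humanConfirmed: false, confidence: 0.95 };

console.log(mergeVerdict([]), mergeVerdict([softRule]), mergeVerdict([softRule, confirmedRule]));
```

Because the function is deterministic and takes no model output as input, the same diff and the same graph always yield the same verdict, which is the property a merge gate needs.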

Decision Guard (Veto) — re-introducing a rejected approach is blocked

The most expensive reversal is re-adding an approach a decision rejected (latency, a forbidden dependency) — code that never existed, so a diff reviewer is blind to it. A decision remembers what it rejected; re-introduce that approach and Hunch blocks it with the receipt of what you rejected and why. → docs
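A minimal sketch of the veto idea, assuming the rejected approach is a dependency. The DecisionRecord shape, the dec_dates id, and the package name are hypothetical; how Hunch actually represents rejected alternatives is not shown in this README.

```typescript
// Hypothetical shape of a decision that remembers what it turned down.
interface RejectedAlternative {
  dependency: string; // package the decision ruled out
  reason: string;
}

interface DecisionRecord {
  id: string;
  title: string;
  rejected: RejectedAlternative[];
}

// The "receipt": which decision rejected the approach, and why.
interface VetoReceipt {
  decisionId: string;
  dependency: string;
  reason: string;
}

// Returns one receipt per newly added dependency that a decision rejected.
function vetoCheck(decisions: DecisionRecord[], addedDeps: string[]): VetoReceipt[] {
  const receipts: VetoReceipt[] = [];
  for (const decision of decisions) {
    for (const alt of decision.rejected) {
      if (addedDeps.indexOf(alt.dependency) !== -1) {
        receipts.push({ decisionId: decision.id, dependency: alt.dependency, reason: alt.reason });
      }
    }
  }
  return receipts;
}

const recordedDecisions: DecisionRecord[] = [
  {
    id: "dec_dates",
    title: "Use the built-in Date APIs",
    rejected: [{ dependency: "heavy-date-lib", reason: "bundle size (hypothetical example)" }],
  },
];

const vetoReceipts = vetoCheck(recordedDecisions, ["heavy-date-lib", "tiny-helper"]);
console.log(vetoReceipts);
```

A diff-only reviewer has nothing to compare against here, since the rejected code never existed; the check works only because the decision record itself carries the rejection.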

Redundancy Guard — "this already exists"

An agent works from a local context window, so it re-implements a helper that already lives three modules over, or re-adds a dependency the codebase already has — sprawl a diff-only reviewer can't see, but Hunch's symbol graph can. Add a function or class already defined elsewhere and hunch check / the CI guard / hunch_merge_verdict flag it with the existing location. Advisory — it never blocks, and it's tuned to stay quiet so a refactor that just moves code isn't mistaken for a duplicate. → docs
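The duplicate check and its "a move is not a duplicate" exception can be illustrated with a toy symbol index. This is a sketch under simplifying assumptions: symbols are matched by name only, and the SymbolDef shape, file paths, and findDuplicates helper are hypothetical.

```typescript
// Hypothetical symbol index entry: where a function or class is defined.
interface SymbolDef {
  symbol: string;
  file: string;
}

interface DuplicateHint {
  symbol: string;
  addedIn: string;
  existingIn: string;
}

// Flags symbols added by a diff that are already defined in another file.
// A symbol that the same diff also removes is treated as a move and skipped.
function findDuplicates(
  index: SymbolDef[],   // definitions before the diff
  added: SymbolDef[],   // definitions the diff adds
  removed: SymbolDef[]  // definitions the diff removes
): DuplicateHint[] {
  const hints: DuplicateHint[] = [];
  for (const add of added) {
    if (removed.some((r) => r.symbol === add.symbol)) continue; // refactor, not sprawl
    let existingFile: string | null = null;
    for (const def of index) {
      if (def.symbol === add.symbol && def.file !== add.file) {
        existingFile = def.file;
        break;
      }
    }
    if (existingFile !== null) {
      hints.push({ symbol: add.symbol, addedIn: add.file, existingIn: existingFile });
    }
  }
  return hints;
}

const symbolIndex: SymbolDef[] = [{ symbol: "formatDate", file: "src/utils/date.ts" }];

// Case 1: an agent re-implements the helper in another module.
const reimplemented = findDuplicates(
  symbolIndex, [{ symbol: "formatDate", file: "src/orders/helpers.ts" }], []);

// Case 2: a refactor moves the helper; the old definition is removed in the same diff.
const movedOnly = findDuplicates(
  symbolIndex,
  [{ symbol: "formatDate", file: "src/shared/date.ts" }],
  [{ symbol: "formatDate", file: "src/utils/date.ts" }]);

console.log(reimplemented, movedOnly);
```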

Plus the Regression Guard (re-adding deliberately-retired code) and the CI Constraint Guard (hunch ci — a PR gate that comments the affected con_/dec_ ids and fails on a blocking one).

Name the actual violation — record-constraint "…" --scope "src/**" --severity blocking --forbid-dep "lodash" — and it blocks the real change across the file's whole life instead of relaxing to advisory after the file is edited again. The dep matcher reads the parsed import, so a comment or string naming the module can't false-positive and a submodule (lodash/groupBy) is still caught; a correction your assistant records gets the same matcher automatically. (--match <regex> remains a lint-grade textual fallback.) None of these are a bypass-proof boundary — deliberate indirection can still route around any rule.
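The submodule behaviour of the dependency matcher can be sketched in a few lines. This is an illustration, not the shipped matcher: matchesDep and forbiddenImports are hypothetical names, the list of specifiers stands in for the output of a real import parser, and treating lodash-es as a different package is an assumption.

```typescript
// True when an import specifier refers to `dep` itself or one of its submodules.
// With dep = "lodash": "lodash" and "lodash/groupBy" match; "lodash-es" does not
// (assumption: a differently named package is a different dependency).
function matchesDep(specifier: string, dep: string): boolean {
  return specifier === dep || specifier.indexOf(dep + "/") === 0;
}

// `importSpecifiers` stands in for what an import parser would extract.
// Comments and string literals never appear in that list, so they cannot match.
function forbiddenImports(importSpecifiers: string[], forbidden: string): string[] {
  return importSpecifiers.filter((spec) => matchesDep(spec, forbidden));
}

const parsedImports = ["react", "lodash/groupBy", "lodash-es", "./lodash"];
const forbiddenHits = forbiddenImports(parsedImports, "lodash");
console.log(forbiddenHits); // only "lodash/groupBy" matches
```

Matching on the parsed specifier rather than on raw text is the design choice that separates this from the --match regex fallback: a regex sees every occurrence of the word, while the matcher sees only real imports.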

Working as a team

The .hunch/ JSON is the source of truth — diffable, reviewable in PRs, synced for free over git push / pull. hunch init sets things up so concurrent edits from different teammates merge cleanly instead of throwing conflict markers, and it's OS-agnostic — Windows / macOS / Linux teammates share one memory with no per-machine fixups. → docs

Branches & worktrees

Memory follows you across every branch and git worktree, with no per-worktree setup — a fresh git worktree add on any branch sees the same decisions, bugs, and invariants. Create one already wired in with hunch worktree <path> [-b <branch>], or just run hunch init / hunch private once and every worktree picks it up. Parallel worktrees never corrupt or lose memory, and hunch doctor confirms a worktree is sharing.

Private memory (public repo, private context)

Open-source your code without open-sourcing your reasoning. hunch private sets up a separate private store in one command — Hunch unions it into every query and guard locally (MCP and the pre-edit hook see your sensitive decisions/bugs/constraints) while your public .hunch/ stays clean. It writes a gitignored .hunch/local.json so it's auto-detected — no env var, no shell-profile edit (and HUNCH_PRIVATE_DIR still overrides per-shell). Opt-in, default-off (no config → fully inert), and leak-safe by construction: committed files and the CI PR comment render public-only, so a private record can't reach a public surface. Record sensitive items with private: true (hunch_record_decision / hunch_record_correction); post-commit synthesis can route there too, and hunch private --auto-commit (opt-in) auto-commits + pushes each capture to the private repo — recursion-safe, staging only .hunch/.

Already published a repo with its .hunch/ memory and want it private after the fact? hunch private --repo <url> --migrate does it in one shot: it moves your existing public records into the overlay (union by id — nothing is lost), empties the public store, untracks + gitignores the .hunch/ memory tree, and regenerates the assistant grounding (CLAUDE.md, AGENTS.md, …) so the repo becomes code-only. It commits the private overlay for you and prints the one git command to commit the now-clean public repo. → docs
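The overlay described in this section boils down to two operations: a union by id for local queries and guards, and a public-only filter for anything that gets committed or posted. The sketch below assumes a flat MemoryRecord shape and assumes the private copy wins when both stores hold the same id; neither detail is confirmed by the source.

```typescript
// Hypothetical flat record; real records carry far more structure.
interface MemoryRecord {
  id: string;
  text: string;
  isPrivate: boolean;
}

// Union by id: every record from both stores survives exactly once.
// Assumption: when both stores hold the same id, the private copy wins.
function unionById(publicStore: MemoryRecord[], privateStore: MemoryRecord[]): MemoryRecord[] {
  const merged: MemoryRecord[] = [];
  // Prototype-free lookup table so ids like "constructor" behave as plain keys.
  const position: { [id: string]: number | undefined } = Object.create(null);
  for (const rec of publicStore.concat(privateStore)) {
    const at = position[rec.id];
    if (at === undefined) {
      position[rec.id] = merged.length;
      merged.push(rec);
    } else {
      merged[at] = rec; // later (private) copy replaces the earlier one
    }
  }
  return merged;
}

// Public surfaces (committed files, PR comments) render public records only.
function renderPublic(records: MemoryRecord[]): MemoryRecord[] {
  return records.filter((rec) => !rec.isPrivate);
}

const publicRecords: MemoryRecord[] = [
  { id: "dec_1", text: "use the service layer", isPrivate: false },
];
const privateRecords: MemoryRecord[] = [
  { id: "dec_2", text: "sensitive vendor context", isPrivate: true },
];

const localView = unionById(publicRecords, privateRecords); // what guards and MCP see
const publicView = renderPublic(localView);                 // what may leave the machine
console.log(localView.length, publicView.length);
```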

Continuous learning (CI)

The decision half of the loop is automatic (the post-commit hook). Light up the bug/constraint half by wrapping your test run — it captures failures as Bugs (recurrences auto-promote Constraints) and resolves fixed ones, preserving the runner's exit code:

hunch test                  # runs `npm test`; any runner: hunch test -- pytest -q

Drop npx hunch test into CI, and hunch ci to scaffold the PR merge gate. → docs

Semantic search (optional)

hunch query uses fast keyword search out of the box. For recall on paraphrases, opt into local embeddings (npm i -g @huggingface/transformers && hunch embed) — local, free, and opt-in, and it never drifts from your committed memory.

VS Code

A companion VS Code extension (on Open VSX — VS Code / Cursor / Windsurf / VSCodium) brings the graph into the editor: a tree of decisions / invariants / bugs / bug-lineage / fragility / stale records, CodeLens summaries, hover with bug history, invariants in the Problems panel, an interactive component graph, and a status-bar invariant counter. It reads the committed .hunch/ JSON directly; writes delegate to the hunch CLI.

Architecture

Everything lives under .hunch/ as plain git-tracked JSON — the source of truth; a fast local index is built from it and is throwaway. Subscription-billed synthesis (never a pay-per-token API key) with a no-LLM fallback, and atomic writes so an interrupted write can't corrupt your memory. → the docs for the conceptual model.

Develop

Hunch is open source — pure TypeScript ESM, Node ≥ 20, licensed Apache-2.0. Contributions welcome — see CONTRIBUTING.md and the repo.
