slopbrick
The AI-coding-agent drift detector. Stops the zustand + redux in the same project, the three modal systems, the off-scale p-[13px], the hardcoded sk-... keys, and the expect(x).toBeDefined() tests that no one wrote and no one will run. Run npx slopbrick and get a single Repository Coherence score (0–100) with per-rule precision/recall so you know which findings to fix and which to ignore. Add the MCP server and your AI agent reads your existing patterns before it writes new ones.
The problem. AI coding assistants write logic well, but they drift. Every project ends up with three button variants, a hardcoded API key, inline styles next to Tailwind utilities, and a test file full of expect(x).toBeDefined(). The drift isn't the agent's fault — it's that the agent doesn't know your conventions. Existing linters catch syntax; nothing catches "you just invented a fourth modal system when this repo already has three."
What this does. slopbrick extracts the canonical patterns from your codebase (state lib, form lib, modal system, API client, data-fetching), enforces a declared Constitution at PR time, and exposes the pattern inventory to AI agents via MCP (slop_suggest) so they reuse what's there instead of inventing new patterns. The headline Repository Coherence score (0–100) is a proof that the Constitution is being followed — but the actual moat is the Constitution + Pattern Inventory itself, not the number.
What an AI agent gets from slop_suggest (the primary entry point)
The MCP tool slop_suggest returns, in one call:
- Existing patterns: the canonical modal, button, API client, state library, data-fetching library the project already uses.
- Do-not-create list: explicit constitution.forbidden packages + canonical patterns not to duplicate.
- Top issues by rule: what to fix first in the changed files.
- Hot files by issue count: where the slop is concentrated.
- Composite Repository Health: the headline 0-100 score.
Call this BEFORE writing new code so the agent reuses existing patterns instead of duplicating them.
src/mcp/tools.ts:69-81. The MCP server is one command: slopbrick mcp.
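As a rough illustration of how an agent might consume that response, here is a minimal sketch. Only the doNotCreate name appears elsewhere in this README; every other field name, and the blockedPackages helper, is a hypothetical stand-in and not the actual MCP schema.

```typescript
// Hypothetical response shape: only `doNotCreate` is a name taken from the
// README; all other field names are assumptions made for this sketch.
interface SlopSuggestResult {
  existingPatterns: Record<string, string[]>;
  doNotCreate: string[];
  topIssuesByRule: { ruleId: string; count: number }[];
  hotFiles: { path: string; issueCount: number }[];
  repositoryHealth: number; // headline 0-100 score
}

// Hypothetical helper: which of the packages the agent wants to add
// are on the do-not-create list?
function blockedPackages(result: SlopSuggestResult, proposed: string[]): string[] {
  return proposed.filter((pkg) => result.doNotCreate.indexOf(pkg) !== -1);
}

// Made-up sample data, not real tool output.
const sample: SlopSuggestResult = {
  existingPatterns: { stateManagement: ["zustand"], dataFetching: ["react-query"] },
  doNotCreate: ["redux", "moment"],
  topIssuesByRule: [{ ruleId: "logic/zombie-state", count: 3 }],
  hotFiles: [{ path: "src/store.ts", issueCount: 2 }],
  repositoryHealth: 83,
};

console.log(blockedPackages(sample, ["redux", "zod"]));
```

With this sample, redux is flagged and zod passes through, which is the decision the agent needs before it writes an import.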
Headline 5-bucket score (compressed from 13 subscores)
The full diagnostic surface is 13 subscores. The user-facing surface is 5 buckets:
| Bucket | What it measures | One-line question | v4.1 calibration |
|---|---|---|---|
| Architecture Consistency | Cross-file pattern duplication + token usage + component reuse | "Are components and patterns consistent?" | 8 USEFUL rules, top signal: Pattern Fragmentation |
| AI Slop Signal | Ghost defensive code, debug logs, unused state, missing auth, hardcoded secrets | "Does this look like AI wrote it?" | 18 USEFUL rules with measured P/R/FPR |
| Security | Hardcoded secrets, dangerous CORS, unsafe HTML, SQL concat, missing auth | "Are there security holes?" | 4 USEFUL, 3 INVERTED |
| Delivery Quality | Test quality, business-logic coherence, docs freshness | "Can the team ship safely?" | 4 PASS, 3 INVERTED |
| Codebase Health | DB schema, design-token drift, product consistency | "Will this codebase hold up at scale?" | 5 INVERTED/DORMANT — calibration pending |
Repository Health (the composite) = 0.25 × Architecture + 0.30 × AI Slop + 0.25 × Security + 0.10 × Delivery + 0.10 × Codebase.
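The composite can be sketched as a plain weighted sum. The weights below are the ones stated above; the type and function names are illustrative assumptions, and the input scores are made up.

```typescript
type HealthBucket = "architecture" | "aiSlop" | "security" | "delivery" | "codebase";

// Weights as stated in the README; they sum to 1.0.
const BUCKET_WEIGHTS: Record<HealthBucket, number> = {
  architecture: 0.25,
  aiSlop: 0.30,
  security: 0.25,
  delivery: 0.10,
  codebase: 0.10,
};

// Weighted average of the five bucket scores (each 0-100).
function repositoryHealth(scores: Record<HealthBucket, number>): number {
  let total = 0;
  for (const bucket of Object.keys(BUCKET_WEIGHTS) as HealthBucket[]) {
    total += BUCKET_WEIGHTS[bucket] * scores[bucket];
  }
  return total;
}

// Made-up bucket scores: 20 + 27 + 25 + 6 + 5, roughly 83.
const health = repositoryHealth({
  architecture: 80,
  aiSlop: 90,
  security: 100,
  delivery: 60,
  codebase: 50,
});
console.log(health.toFixed(1));
```

Because AI Slop carries the largest weight, a ten-point drop there costs three points of Repository Health, against one point for the same drop in Delivery or Codebase.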
The 13-subscore diagnostic surface remains available behind slopbrick scan --format json and --format detailed for the calibration / power-user audience.
Why this works (calibration evidence, v4.1)
The rules are calibrated against a balanced 1:1 corpus:
- 95,467 human-written frontend files: 39 production repos (mui 16k, supabase 6.8k, antd 5.5k, storybook 3.5k, react-spectrum 3.3k, refine 6.3k, appsmith 5.5k, heroui 2.1k, …) + 54,980 from ai-slop-baseline.
- 76,981 AI-generated frontend files: 50 existing repos + 100 new shallow-cloned vibe-coded repos (Claude Code, Cursor, Lovable, Bolt, gpt-pilot, v0, BloopAI, tldraw) in corpus-expansion/positive/vibe-coded/.
The form engineers actually trust, per-rule (full table in docs/research/v4-per-rule-pr-fpr.md):
security/missing-auth-check fires on 0.63% of AI files and 0.04% of human files. When it fires, 92% of the time the file is AI.
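To show how fire rates turn into a precision figure, here is a small sketch using the corpus sizes and the rounded rates quoted above. It assumes lift means the AI fire rate divided by the human fire rate, which this README does not define explicitly. Because the inputs are rounded, the results land near the published 92.47% and 15.3× without matching them exactly.

```typescript
interface FireRates {
  aiRate: number;     // fraction of AI files the rule fires on
  humanRate: number;  // fraction of human files the rule fires on
  aiFiles: number;
  humanFiles: number;
}

// Precision: of all files the rule fires on, what share is AI-generated?
// Lift (assumed definition): AI fire rate relative to human fire rate.
function precisionAndLift(r: FireRates): { precision: number; lift: number } {
  const aiHits = r.aiRate * r.aiFiles;
  const humanHits = r.humanRate * r.humanFiles;
  return {
    precision: aiHits / (aiHits + humanHits),
    lift: r.aiRate / r.humanRate,
  };
}

const missingAuth = precisionAndLift({
  aiRate: 0.0063,
  humanRate: 0.0004,
  aiFiles: 76981,
  humanFiles: 95467,
});
console.log((missingAuth.precision * 100).toFixed(1) + "%", missingAuth.lift.toFixed(2) + "x");
```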
Top 5 AI signals (v4.1 P/R/FPR):
| Rule | Precision | Lift | Verdict |
|---|---|---|---|
| logic/ghost-defensive (dead if (x) return guards) | 94.74% | 22.5× | USEFUL |
| security/missing-auth-check (auth bypass) | 92.47% | 15.3× | USEFUL |
| logic/math-console-log-storm (debug logs everywhere) | 89.84% | 11.0× | USEFUL |
| logic/zombie-state (unused useState) | 83.33% | 6.2× | USEFUL |
| test/duplicate-setup (beforeEach copy-paste) | 70.97% | 3.1× | USEFUL |
The 18 USEFUL rules have per-file P/R/FPR thresholds in tests/integration/calibration-expanded.test.ts that fail CI when the signal drifts. See docs/research/calibration-report-2026.md for the full calibration trajectory (v1 ratio → v3 ratio → v4 ratio → v4.1 P/R/FPR).
For humans
- slopbrick scan in CI gates PRs on slopIndex and the Constitution.
- slopbrick pr scores the PR itself, weighted by severity.
- slopbrick architecture is the headline consistency number.
- slopbrick security is the categorical security gate.
- slopbrick test / business-logic / patterns / db are specialised subcommands.
- slopbrick drift exits 1 on any Constitution violation.
- slopbrick --format json exposes the 13-subscore diagnostic surface + per-rule P/R/FPR for the calibration audience.
For AI agents
Install the MCP server (@slopbrick/mcp), then call slop_suggest before writing new code. The agent never has to guess what's already in the codebase.
What it does not do
It does not detect whether a human or AI wrote the code. It surfaces patterns that AI generates disproportionately (4 modal systems, exposed NEXT_PUBLIC_OPENAI_API_KEY, if (NODE_ENV === 'development') return true), and enforces the constitution the project has declared.
13-score diagnostic surface (for the calibration audience)
The headline 5-bucket score is a compression of 13 subscores. The full surface is what slopbrick scan --format json returns, and what the tests/integration/calibration-expanded.test.ts test guards against drift.
Tier 1 — Deterministic (Constitution enforcement):
| Score | Shape | Use it for |
|---|---|---|
| Slop Index | 0–100 | Frontend lint quality (composite) |
| Architecture Consistency | 0–100 | Cross-file pattern duplication |
| AI Security Risk | low / medium / high / critical | AI-induced security failures |
| Constitution drift | pass / fail | Imports that violate the declared stack |
| Design-token drift | inline violations | Spacing/radius off declared scales |
| Pattern Fragmentation | 0–100 | How many competing patterns per category (modal / auth / state / api / …) — the input to slop_suggest's doNotCreate list |
| PR Slop Score | integer | One number per PR, weighted by severity |
Tier 2 — Heuristic (specialised subcommands):
| Score | Shape | Use it for |
|---|---|---|
| Test Quality | 0–100 | AI test smells |
| Business Logic Coherence | 0–100 | Pricing precision, validation completeness |
| Documentation Freshness | 0–100 | Stale READMEs, drift between docs and code |
| Database Health | 0–100 | Missing indexes, N+1, soft-delete inconsistencies |
Tier 3 — Derived (dashboards):
| Score | Shape | Use it for |
|---|---|---|
| Repository Health | 0–100 | Weighted average of the 5 buckets |
| AI Maintenance Cost | $/month | $ cost of fixing the issues, given velocity |
| AI Debt band | A / B / C / D / F | Letter grade from the above two |
What's new in 0.7.0
Repository Constitution Engine for AI Coding Agents. The moat is the Constitution. Everything else is a score that proves it's being followed.
0.7.0 is the Constitution-first release — three new specialised subcommands (test, business-logic, patterns), a one-number-per-PR score (pr), the forbidden deny-list for explicit "never use this", and the rename from conventions to constitution everywhere.
The 0.6.x series built the foundation. The 0.7.x series turns the Constitution from a config field into a working contract — AI agents that read it via MCP (slop_suggest) and humans that enforce it via PR gates.
What landed in 0.7.0:
- slopbrick pr: one weighted number per PR, gates on --threshold
- slopbrick test: Test Quality score (0–100) + 4 test/* rules
- slopbrick business-logic: Business Logic Coherence score (0–100) + 8 rules
- slopbrick patterns: Pattern Fragmentation score (0–100), 8 categories
- constitution.forbidden: explicit deny-list (e.g. ['moment', 'lodash'])
- conventions → constitution rename across config, types, MCP tool names, report columns
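A deny-list check of the kind constitution.forbidden implies could look roughly like this. The Constitution interface is a trimmed-down assumption, and forbiddenImports is a hypothetical helper; the real drift check may match imports differently.

```typescript
// Trimmed-down, assumed shape: only `stateManagement` and `forbidden`
// are field names that appear in this README.
interface Constitution {
  stateManagement?: string[];
  forbidden?: string[];
}

const constitution: Constitution = {
  stateManagement: ["zustand"],
  forbidden: ["moment", "lodash"],
};

// Hypothetical helper. Matches the package itself and its subpaths
// ("lodash/debounce"), but not look-alike names such as "lodash-es".
function forbiddenImports(importSources: string[], c: Constitution): string[] {
  const denied = c.forbidden ?? [];
  return importSources.filter((source) =>
    denied.some((pkg) => source === pkg || source.indexOf(pkg + "/") === 0)
  );
}

console.log(forbiddenImports(["react", "lodash/debounce", "lodash-es", "moment"], constitution));
```

Matching on an exact name or a subpath prefix is a deliberate choice in this sketch: a plain substring test would also flag lodash-es, which is a different package.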
0.6.0 – 0.6.4 recap (the foundation)
0.6.0 was the engine re-architecture. The 0.6.1 – 0.6.4 patch series shifts the framing from "AI slop detector" to repository coherence engine — the same scanner, now with three new scores, MCP tools so AI agents check before they PR, and eight security rules for AI-induced failures.
Five orthogonal scores, all in slopbrick scan:
| Score | Shape | Use it for |
|---|---|---|
| Slop Index | 0–100 | Frontend lint quality |
| Architecture Consistency | 0–100 | Cross-file pattern duplication (0.6.3) |
| AI Security Risk | low / medium / high / critical | AI-induced security failures (0.6.4) |
| Constitution drift | pass / fail | Imports that violate declared stack (0.6.2) |
| Design-token drift | inline violations | Spacing/radius off declared scales (0.6.3) |
The slop detector is still here — but the bigger lever is coherence: one modal system, one state library, one fetch lib, a declared constitution that the AI agent checks before PR.
What landed in 0.6.1 – 0.6.4
0.6.4 — AI Security Risk (new score) + 8 Tier-1/Tier-2 security rules
NOT a security scanner — Semgrep / GHAS / CodeQL / Gitleaks own that. This is a categorical score (low | medium | high | critical) for security failures that AI generates disproportionately. Independent of slopIndex — security failures do not get diluted into "good slop score" territory.
- security/hardcoded-secret: provider prefixes (sk-, sk-ant-, AKIA, ghp_, sk_live_, AIza, xox[abprs]-) + sensitive-name literals.
- security/exposed-env-var: NEXT_PUBLIC_* / VITE_* / etc. with secret names, inlined into every browser build.
- security/dangerous-cors: wildcard Access-Control-Allow-Origin: * + cors({ origin: '*' }) + reflective cors({ origin: true }).
- security/missing-auth-check: Next.js route.ts / pages/api / Express handlers with no auth primitive.
- security/unsafe-html-render: dangerouslySetInnerHTML fed a non-literal value.
- security/fail-open-auth: if (NODE_ENV === 'development') return true / next().
- security/sql-construction: template-literal / concat SQL queries (use parameterized queries).
- security/public-admin-route: routes under /admin, /internal, /debug, /staff, /manage, /private, etc. without an additional role check.
New slopbrick security [--format pretty|json] [--strict] subcommand. --strict exits 1 on high/critical (CI gate).
0.6.3 — Architecture Consistency Score (the headline metric) + design-token enforcement
One 0–100 number that reflects how consistent a repository's patterns are. Subtracts from 100 for each pattern-duplication finding: -12 per extra modal system, -8 per extra button variant, -10 per extra API client module, -15 per extra state library (highest), -10 per extra data-fetching library, -1 per 5 off-scale spacing values, -1 per 5 off-scale radius values. A project with 1 modal, 1 button, 1 api client, 1 state lib, 1 fetch lib lands at 100. A project with 3 modal systems + 4 button variants + 2 state libs lands at 37.
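The deductions above can be sketched as follows. The penalty values come from the paragraph; the field names, the rounding down of off-scale counts to whole groups of five, and the floor at 0 are assumptions.

```typescript
interface PatternCounts {
  modalSystems: number;
  buttonVariants: number;
  apiClients: number;
  stateLibraries: number;
  dataFetchingLibraries: number;
  offScaleSpacing: number;
  offScaleRadius: number;
}

// Only patterns beyond the first one in a category are penalised.
const extra = (count: number): number => Math.max(0, count - 1);

function architectureConsistency(c: PatternCounts): number {
  const penalty =
    12 * extra(c.modalSystems) +
    8 * extra(c.buttonVariants) +
    10 * extra(c.apiClients) +
    15 * extra(c.stateLibraries) +
    10 * extra(c.dataFetchingLibraries) +
    Math.floor(c.offScaleSpacing / 5) + // assumed: whole groups of five
    Math.floor(c.offScaleRadius / 5);
  return Math.max(0, 100 - penalty);    // assumed: score never drops below 0
}

// The example from the text: 3 modal systems + 4 button variants + 2 state libs.
const fragmented = architectureConsistency({
  modalSystems: 3,
  buttonVariants: 4,
  apiClients: 1,
  stateLibraries: 2,
  dataFetchingLibraries: 1,
  offScaleSpacing: 0,
  offScaleRadius: 0,
});
console.log(fragmented); // 100 - (24 + 24 + 15) = 37
```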
Two new rules turn design tokens from docs into enforceable contracts:
- visual/spacing-scale-violation: flags p-[13px], gap-[1.75rem] etc. off the declared spacingScale.
- visual/radius-scale-violation: flags rounded-[7px], rounded-tl-[2rem] etc. off the declared radiusScale.
Both emit auto-fix candidates so slopbrick scan --fix rewrites p-[13px] → p-1.
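For intuition, a spacing-scale check could be approximated with a regular expression over class names. This is a sketch and not the rule's implementation: it assumes spacingScale is a list of pixel values, that 1rem equals 16px, and it covers only padding, margin, and gap utilities.

```typescript
// Assumed scale in pixels; the real spacingScale format may differ.
const spacingScale: number[] = [0, 4, 8, 12, 16, 24, 32];

// Returns arbitrary-value spacing utilities whose value is not on the scale.
function offScaleSpacing(className: string, scale: number[]): string[] {
  // Matches e.g. p-[13px], mx-[2rem], gap-x-[1.75rem] at a class boundary.
  const pattern = /(?:^|\s)((?:[pm][xytrbl]?|gap(?:-[xy])?)-\[(\d*\.?\d+)(px|rem)\])/g;
  const hits: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(className)) !== null) {
    const px = Number(match[2]) * (match[3] === "rem" ? 16 : 1);
    if (scale.indexOf(px) === -1) hits.push(match[1]);
  }
  return hits;
}

console.log(offScaleSpacing("flex p-[13px] gap-[1.75rem] m-[16px]", spacingScale));
```

In this sample p-[13px] and gap-[1.75rem] (28px) are reported, while m-[16px] is on the scale and passes.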
0.6.2 — Repository governance for AI coding agents
The single feature most projects asked for. New top-level constitution field in slopbrick.config.mjs:
export default {
constitution: {
stateManagement: ['zustand'],
dataFetching: ['react-query'],
uiLibrary: ['shadcn', 'radix'],
forms: ['react-hook-form', 'zod'],
styling: ['tailwind'],
routing: ['next'],
},
};

Auto-detected from package.json when unset; user declarations always win.
- slopbrick drift: CLI command, exits 1 on any violation (CI-friendly).
- slop_suggest MCP tool: project-wide inventory of existing patterns; AI agents call before writing new code.
- slop_check_constitution MCP tool: per-file constitution diff.
- slop_architecture_score MCP tool: Architecture Consistency Score via MCP.
0.6.1 — bug fixes + small refinements
- slopbrick trend --format markdown now actually emits markdown (the local flag was being shadowed by the global scan --format; renamed to --render).
- Calibration test surfaces stderr/stdout on chunk failures instead of swallowing them.
- v1.x working-tree labels stripped.
CLI surface summary (post-0.8.0)
| Command | Purpose |
|---|---|
| slopbrick scan | Main scan: runs all rules + computes all 8 scores |
| slopbrick architecture | Architecture Consistency Score only |
| slopbrick security | AI Security Risk only |
| slopbrick drift | Constitution-violation scanner |
| slopbrick pr | PR slop score (single weighted number per PR) |
| slopbrick test | Test Quality score (4 test/* rules) |
| slopbrick business-logic | Business Logic Coherence score (8 rules) |
| slopbrick patterns | Pattern Fragmentation score (input to slop_suggest) |
| slopbrick maintenance-cost | AI Maintenance Cost (categorical low/medium/high/critical + $/month) (0.8.0) |
| slopbrick docs | Documentation Freshness (4 docs/* rules) (0.8.0) |
| slopbrick db | Database Health (6 db/* rules, Postgres-only) (0.8.0) |
| slopbrick mcp | MCP server (slop_scan_file, slop_explain_rule, slop_list_rules, slop_suggest, slop_check_constitution, slop_architecture_score) |
| slopbrick trend | Slop Index trend over time |
| slopbrick flywheel | Aggregated scan telemetry |
| slopbrick init | Interactive setup wizard |
What 0.6.x did not change
- No new competitor overlap. We did not add a general security scanner, dependency vulnerability scanner, formatter, type checker, or coverage tool.
- No breaking CLI changes. Existing scan commands, JSON / SARIF / HTML output formats, and public-API exports are unchanged.
The full release history is in CHANGELOG.md.
Why this matters (research-backed)
The 0.7.0 release sits on top of an industry that's converging fast on AI-generated-code debt. The numbers below are from 2024–2026 studies and explain why the Constitution, not the Slop Index, is the moat.
- AI slows experienced developers. METR's July 2025 RCT (16 experienced open-source devs, 246 tasks on repos averaging 22k stars / 1M LoC) found AI tools produced a 19% slowdown — developers had expected a 24% speedup. (METR, 2025)
- AI-generated code carries 1.7× more issues per PR (10.83 vs 6.45) and a higher share of critical/major issues. (CodeRabbit, 2025)
- Refactoring is collapsing. GitClear's 211M-line analysis of Google/Microsoft/Meta repos shows "refactored" lines fell from 25% → <10% and "copy-pasted" lines rose from 8.3% → 12.3% between 2021–2024. (GitClear, 2025)
- PR size is up 51%, bugs/PR up 28%, incidents/PR up 3×, code churn up 10× across 22k developers in 2026. (Faros AI, 2026)
- Trust in AI accuracy dropped from 40% → 29% in one year; 66% of devs spend more time debugging AI output. (Stack Overflow 2025 Developer Survey)
- Staleness between the code surface and the doc surface is an open hole. No shipped tool (Docusaurus, Mintlify, GitBook, mkdocstrings, TypeDoc) cross-references package.json against the README, or exported names against markdown inline code. The 2026 state-of-the-art is F1 = 96.73% on a single analog task (description-code inconsistency, arXiv 2606.04769). That's what slopbrick docs ships against in 0.8.0.
- Schema-quality static analysis is an open hole for Drizzle. The official eslint-plugin-drizzle has exactly 2 rules. Prisma has 8+ Prisma-Lens rules but they target per-file linting, not schema-wide drift. Squawk owns migration safety; nobody owns schema quality. That's the wedge for slopbrick db in 0.8.0.
- The canonical AI-coding-agent failure is the AWS Kiro outage (Dec 2025). An agentic coding tool autonomously deleted a production environment; 13-hour outage in a China region. (Docker blog, 2026) The post-mortem: "predictable given unchecked AI permissions." The preventive: a Constitution the agent checks before it acts, with a $306,000/yr/MLoC baseline for what the debt costs when it isn't checked. (Sonar, 2025)
The full research notes for each 0.8.0 phase are in docs/research/.
Mathematical foundations — the peer-reviewed methods behind every threshold:
docs/research/math-foundations-for-slopbrick.md maps 8 published results (Halstead 1977, Hindle 2012, Rissanen 1978, Kullback-Leibler 1951, Blondel 2008, Fiedler 1973, McCabe 1976, Adams-MacKay 2007) to the slopbrick rules and composite scores that cite them. v0.9.3+ ships the highest-leverage ones (Halstead, Code Naturalness, MDL composite) to replace heuristic thresholds with closed-form citations.
v0.10 implementation plan — the credibility-milestone roadmap with dependency graph, effort estimates per phase, and readiness checklist: docs/research/v0.10-implementation-plan.md. Phases 1–5 (~4 working days) ship v0.10; Phases 6–11 land the far-horizon graph-theoretic, Repository Memory, --diff, find_similar_function, BRICK, and SARIF work.
Roadmap
| Version | Themes | Status |
|---|---|---|
| 0.5.x | Engine re-architecture, Slop Index, framework support | Shipped |
| 0.6.x | Constitution, Architecture Consistency, AI Security Risk, design-token enforcement | Shipped |
| 0.7.0 | Constitution rename + forbidden deny-list, pr subcommand, Test / Business-Logic / Patterns subcommands | Shipped 2026-06-25 |
| 0.8.0 | docs (Doc Drift), db (Database Health, Postgres-static), maintenance-cost ($/month categorical) | Shipped 2026-07-15 |
| 0.9.0 | Repository Coherence Scanner reframe, default-off INVERTED + NOISY rules, expanded slop_suggest, new slop_governance MCP tool | Shipped 2026-08-15 |
| 0.10 | Credibility milestone: per-rule P/R/FPR + peer-reviewed thresholds (Halstead, McCabe, Hindle, Rissanen, Kullback-Leibler); MDL composite replaces heuristic weights | In flight — see docs/research/v0.10-implementation-plan.md |
| 1.0 | Stability commitment — 6+ months post-v0.10 empirical feedback; freezes the surface, no new scores | Far horizon |
Per-version research notes:
- Phase 6 — Doc Drift
- Phase 8 — DB Health
- Memo #4 — AI Maintenance Cost
- Math foundations — peer-reviewed methods for v0.9.3+ rules
- v0.10 implementation plan — credibility milestone roadmap
Installation
Run once without installing:
npx slopbrick

Add to a project as a dev dependency:

pnpm add -D slopbrick

Quick start
Initialize a config in the project root:

npx slopbrick init

Scan the current workspace:

npx slopbrick scan

Or scan specific paths:

npx slopbrick scan src app

On first run, slopbrick auto-detects your framework, styling solution, UI libraries (Tailwind, Tamagui, shadcn/ui, MUI, etc.), and workspace packages. Framework presets automatically disable or downgrade rules that are idiomatic for React Native, Expo, or Tamagui.
Don't want to write a config from scratch?
Four ready-to-use starter configs live in examples/:
- examples/basic/: sensible defaults for most projects
- examples/strict/: CI gating with noIncrease baseline
- examples/monorepo/: pnpm/turbo workspaces
- examples/ci/: JSON + SARIF output for code-scanning upload
cp examples/strict/slopbrick.config.mjs ./slopbrick.config.mjs
npx slopbrick validate-config # check it before running a scan

See examples/README.md for the full walkthrough.
Configuration
Config lives at slopbrick.config.mjs in the project root. It is an ES module that exports a default object.
export default {
include: ['src/**/*', 'app/**/*', 'pages/**/*', 'components/**/*'],
exclude: [
'**/node_modules/**',
'**/*.test.{ts,tsx,js,jsx}',
'**/*.stories.{ts,tsx}',
'**/.next/**',
'**/dist/**',
'**/build/**',
'**/coverage/**',
],
// Per-category weight multiplier
categoryWeights: {
visual: 1.2,
logic: 1.0,
perf: 0.8,
typo: 0.5,
wcag: 1.0,
layout: 1.0,
component: 1.0,
arch: 1.0,
security: 1.0,
},
// CI threshold (Phase 2 §10: composite Slop Index only)
thresholds: {
meanSlop: 30,
},
// Rule severity overrides.
// 'auto' keeps the rule's natural severity; 'off' disables it.
rules: {
'visual/inline-style': 'auto',
'visual/hardcoded-color': 'low',
'logic/style-sheet-avoidance': 'medium',
},
// Boost or reduce scores for specific frameworks
frameworkMultipliers: {
astro: 0.8,
},
// Phase 2 §10: brick.config.json import paths. Defaults to common
// shadcn-style paths. Imports from `@/components/*` not matching
// these prefixes are flagged by `context/import-path-mismatch`.
allowedImports: [
'@/components/ui/',
'@/components/',
'@/lib/',
'@/hooks/',
],
};

Key options
| Option | What it does |
|---|---|
| include | File patterns to scan (default: all source files) |
| exclude | File patterns to skip |
| categoryWeights | Make certain issue types count more or less |
| thresholds | CI gates (see "Thresholds" below) |
| rules | Turn specific rules off or change their severity |
| frameworkMultipliers | Boost/reduce scores for specific frameworks |
| arbitraryValueAllowlist | Tailwind values that are OK to use |
| allowedImports | brick.config.json import paths (Phase 2 §10) |
| wcag | Accessibility-specific settings |
Composite Slop Index (Phase 2 §10)
slopbrick produces a single composite score that prioritizes structural integrity over minor visual escapes:
S = (0.40 × S_boundary) + (0.35 × S_context) + (0.25 × S_visual)
Each subscore is min(100, severityPoints / componentCount), where severityPoints is the sum of severity weights for issues in that bucket.
Bucket weights:
| Bucket | Weight | What it measures |
|---|---|---|
| Boundary | 40% | Structural integrity: file-size limits, multiple components per file, direct API calls in UI |
| Context | 35% | Prop correctness, imports, state management |
| Visual | 25% | CSS, layout, typography, accessibility |
Rule → Subscore mapping:
- Boundary (40%): logic/boundary-violation, component/giant-component, component/multiple-components-per-file
- Context (35%): component/shadcn-prop-mismatch, arch/astro-island-leak, context/import-path-mismatch, most logic/*
- Visual (25%): all visual/*, layout/*, typo/*, wcag/*, perf/*
CLI reference
Usage: slopbrick [options] [command]
Options:
-V, --version output the version number
--framework <name> framework multiplier to apply
--include <glob> include pattern (repeatable)
--exclude <glob> exclude pattern (repeatable)
--ai-only only report AI-specific issues
--human-only only report human-facing issues
--ignore-wcag22 ignore WCAG 2.2 related issues
--format <pretty|json|sarif|html> output format (default: "pretty")
--threads <n> number of worker threads
--since <ref> only scan files changed since git ref
--workspace <path> workspace/project path
--tighten tighten baseline allowances
--fix apply auto-fixes
--dry-run preview fixes without writing any files
--diff print unified diff of proposed auto-fixes
--doctor run diagnostics
--watch watch files and re-run
--suggest print remediation advice
--heatmap print migration ROI heatmap
--quiet suppress non-error output
--strict exit 2 if any high-severity issue remains
--no-increase exit 2 if slop index increased since last run
--auto-disable-noisy-rules downgrade rules whose measured precision < 0.5
or recall < 0.1 by one severity step
--baseline save a baseline after this scan
--trend [n] print a sparkline of the last n runs
--json [path] write JSON report to path or stdout
--html [path] write HTML report to path or stdout
--staged scan only staged files
--changed scan working-tree changes
--incremental skip unchanged files via content-hash cache
--cache-path <path> path to the incremental cache (default: .slop-audit-cache.json)
--tokens <path> merge tokens.json layout values into the
arbitrary-value allowlist
--cache cache parsed AST results locally
--rule <ruleId> run a single rule by id, skip all others
-h, --help display help for command
Commands:
init [options] create a slopbrick config file
install install the git pre-commit hook
uninstall uninstall the git pre-commit hook
badge print a shields.io slop-index badge
suggest print remediation advice
flywheel [options] summarize aggregated scan telemetry
scan [paths...] scan files for slop
explain <ruleId> print rationale, pattern, and remediation for a single rule
tokens <path> ingest a W3C DTCG tokens.json and summarize it by category
report <path> re-render a saved JSON report (from --json path.json)
doctor check your setup, config, and environment for common problems
rules [--category <name>] [--ai-only] [--json]
list all built-in rules with descriptions
mcp run an MCP server (for AI agents)
help [command] display help for command
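The --auto-disable-noisy-rules description in the help text above reads as a simple threshold check. A minimal sketch follows, assuming a high → medium → low severity ladder and that a rule already at low stays at low; neither detail is stated in the help output.

```typescript
type RuleSeverity = "high" | "medium" | "low";

interface RuleStats {
  precision: number; // measured precision, 0..1
  recall: number;    // measured recall, 0..1
}

// Assumed ladder; what happens below "low" is not documented, so this
// sketch leaves "low" unchanged.
const ONE_STEP_DOWN: Record<RuleSeverity, RuleSeverity> = {
  high: "medium",
  medium: "low",
  low: "low",
};

// Downgrade by one step when precision < 0.5 or recall < 0.1.
function adjustSeverity(current: RuleSeverity, stats: RuleStats): RuleSeverity {
  const noisy = stats.precision < 0.5 || stats.recall < 0.1;
  return noisy ? ONE_STEP_DOWN[current] : current;
}

console.log(adjustSeverity("high", { precision: 0.4, recall: 0.5 }));
console.log(adjustSeverity("high", { precision: 0.9, recall: 0.3 }));
```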
pr — score a pull request
slopbrick pr runs the engine over the files changed between two
git refs and returns a single weighted slop score. The score is
sum(SEVERITY_WEIGHTS[issue.severity]) + constitution_violations
per file, summed across the diff. With the default threshold of
20 (configurable via prScoreThreshold or --threshold), a PR
can introduce up to 6 medium-severity issues (6 × 3 = 18) or
4 high-severity issues (4 × 5 = 20) before it fails.
slopbrick pr [--base <ref>] [--head <ref>]
             [--format text|json|markdown]
             [--threshold <n>] [--max-files <n>]

Defaults: --base main (falls back to master then the first
commit), --head HEAD, --format text, --threshold 20,
--max-files 500. The diff is computed with three-dot syntax
(git diff --name-only base...head), which matches GitHub's PR
view (merge-base comparison).
$ slopbrick pr
PR score: 4 (threshold: 20) — PASS
Base: main Head: HEAD
Files changed: 1
src/store.ts issues=1 constitution=1 score=4
[medium ] security/public-admin-route — line 1
[forbidden] Constitution violation: … imports 'redux' (canonical: 'redux').
────────────────────────────────────────────
PR score: 4 / 20 threshold — PASS
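The arithmetic behind that output can be sketched as below. The severity weights are the ones listed under "How scoring works"; counting each Constitution violation as one point is an assumption inferred from the sample (3 for the medium issue + 1 = 4), and the type and function names are illustrative.

```typescript
type IssueSeverity = "high" | "medium" | "low";

const SEVERITY_WEIGHTS: Record<IssueSeverity, number> = { high: 5, medium: 3, low: 1 };

interface FileFindings {
  path: string;
  severities: IssueSeverity[];      // one entry per issue in the file
  constitutionViolations: number;   // assumed to count 1 point each
}

// Sum of severity weights plus constitution violations, across all changed files.
function prScore(files: FileFindings[]): number {
  let score = 0;
  for (const file of files) {
    for (const severity of file.severities) score += SEVERITY_WEIGHTS[severity];
    score += file.constitutionViolations;
  }
  return score;
}

// The gate fails only when the score is strictly above the threshold.
function prPasses(score: number, threshold: number = 20): boolean {
  return score <= threshold;
}

const sampleScore = prScore([
  { path: "src/store.ts", severities: ["medium"], constitutionViolations: 1 },
]);
console.log(sampleScore, prPasses(sampleScore) ? "PASS" : "FAIL");
```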
Use it as a CI gate:
- run: npx slopbrick pr --threshold 10
  # exits 1 when PR score > 10

Exit codes
| Code | Meaning |
|---|---|
| 0 | Composite Slop Index at or under the meanSlop threshold |
| 1 | Composite Slop Index exceeds threshold |
| 2 | High-severity issues with --strict, score regression with --no-increase, or hook install failure |
| 3 | Unexpected scan error (parser crash, worker failure after retries) |
Example terminal output
$ npx slopbrick scan
Scanned 312 files, 501 components, 1423 issues (high: 48, medium: 321, low: 1054)
Slop Index: 14.2 / 100 [PASS]
(Phase 2 §10 composite: 0.40 × Boundary + 0.35 × Context + 0.25 × Visual)
├─ Boundary Slop: 12.5 (Weighted: 5.0)
├─ Context Slop: 4.0 (Weighted: 1.4)
└─ Visual Slop: 31.0 (Weighted: 7.8)
Top offending components
72.3 src/app/(tabs)/keepsakes.tsx
65.1 src/app/(tabs)/search.tsx
58.0 src/app/child/[id]/edit.tsx
...
Thresholds
Composite Slop Index 14.2 ≤ 30 pass
All thresholds passed.
Issues (1423)
[HIGH ] logic/boundary-violation · src/app/(tabs)/keepsakes.tsx:91:22
Data layer mixed with UI component
→ Move fetch/state into a server action or hook.
How scoring works
Severity weights
| Severity | Weight |
|---|---|
| high | 5 |
| medium | 3 |
| low | 1 |
(The critical tier was removed during the scoring-model refactor to prevent scoring inflation.)
Per-file scoring
For each file, the engine:
- Parses the source (SWC for TS/JS, regex for HTML, dedicated parsers for Vue/Svelte/Astro).
- Walks the AST to extract facts: imports, JSX elements, class names, inline styles, hooks, state bindings, etc.
- Runs each of the 42 registered rules against the facts.
- Each rule returns 0+ issues with severity, line, column, and optional fix suggestions.
Project scoring (Phase 2 §10)
- Bucket every issue into one of three subscores (boundary, context, visual) using the rule-to-bucket map.
- Sum severity weights per bucket: bucketPoints[b] = Σ SEVERITY_WEIGHTS[issue.severity].
- Normalize: subscore[b] = min(100, bucketPoints[b] / componentCount × 100).
- Composite: slopIndex = 0.40 × boundary + 0.35 × context + 0.25 × visual.
- Health: assemblyHealth = max(0, 100 - slopIndex).
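The project-scoring steps above translate into a few lines of arithmetic. This sketch follows the normalization exactly as written in that list; the function names and the zero-component guard are assumptions. The subscores in the usage example are the ones from the sample terminal output (12.5, 4.0, 31.0).

```typescript
type SlopBucket = "boundary" | "context" | "visual";

const SLOP_WEIGHTS: Record<SlopBucket, number> = { boundary: 0.40, context: 0.35, visual: 0.25 };

// Normalize one bucket: severity points per component, scaled and capped at 100.
function subscore(bucketPoints: number, componentCount: number): number {
  if (componentCount <= 0) return 0; // assumed guard, not described in the README
  return Math.min(100, (bucketPoints / componentCount) * 100);
}

// Weighted composite of the three subscores.
function slopIndex(subscores: Record<SlopBucket, number>): number {
  return (
    SLOP_WEIGHTS.boundary * subscores.boundary +
    SLOP_WEIGHTS.context * subscores.context +
    SLOP_WEIGHTS.visual * subscores.visual
  );
}

function assemblyHealth(index: number): number {
  return Math.max(0, 100 - index);
}

// Subscores taken from the sample terminal output.
const index = slopIndex({ boundary: 12.5, context: 4.0, visual: 31.0 });
console.log(index.toFixed(1), assemblyHealth(index).toFixed(1));
```

The weighted terms are 5.0, 1.4, and 7.75, which sum to about 14.15 and round to the 14.2 shown in the sample output.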
Threshold
A single threshold (meanSlop) gates the exit code: slopIndex > meanSlop → exit 1.
Architecture
High-level pipeline
┌────────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ CLI bin/ │───▶│ discover │───▶│ parser │───▶│ visitor │───▶│ rules │
│ slopbrick │ │ discover │ │ engine/ │ │ engine/ │ │ rules/ │
│ .js │ │ .ts │ │ parser.ts│ │ visitor.ts│ │ *.ts │
└────────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
│
▼
┌────────────┐ ┌──────────────┐ ┌────────────────┐ ┌─────────────────┐
│ Output │◀───│ aggregate │◀───│ report │◀───│ ProjectReport │
│ report/ │ │ metrics │ │ ProjectReport│ │ (per-issue) │
│ *.ts │ │ metrics.ts │ │ │ │ │
└────────────┘ └──────────────┘ └────────────────┘ └─────────────────┘
End-to-end flow (slopbrick scan)
- CLI entry (bin/slopbrick.js)
  Loads dist/index.js (the bundled CLI), parses command-line flags with Commander, and resolves cwd from --workspace or process.cwd().
- Config loading (src/config.ts)
  Walks up from cwd looking for slopbrick.config.{js,mjs,cjs,ts}, merges the user config with DEFAULT_CONFIG (which sets thresholds, rules, allowedImports), and validates the merged result against the schema.
- File discovery (src/discover.ts)
  Uses globby to expand include patterns and minimatch to apply exclude. For files without an extension, it reads the first 512 bytes and sniffs the content type (TSX/TS/JSX/JS/Vue/Svelte/Astro/HTML). It de-duplicates by basename when both extension-less and proper-extension versions exist, then filters by SOURCE_EXTENSIONS (.ts, .tsx, .js, .jsx, .vue, .svelte, .astro, .html).
- Per-file scanning (src/engine/worker.ts, src/engine/parser.ts, src/engine/visitor.ts)
  Runs in worker threads (configurable via --threads).
  The parser dispatches based on file extension: SWC for TS/JS, dedicated handlers for Vue/Svelte/Astro, regex for HTML. Extension-less files try TSX → TS → JSX → JS in order.
  The visitor walks the AST and extracts a ScanFacts summary, including:
  - imports[]: {source, importedNames, line, column}
  - interactiveElements[]: JSX <button>, <a>, <input>, etc. with attributes
  - staticClassNames[]: className string literals
  - styleProps[]: inline style={{...}} props
  - componentSizes[]: per-component line count + JSX branch count
  - propBindings[], stateBindings[], hooks[], logicalExpressions[], etc.
  Rule execution iterates the registered rules (built-ins + user overrides) and calls each rule's analyze(facts, context) method, collecting Issue[].
- Aggregation (src/engine/metrics.ts)
  Sums severity points per subscore bucket (boundary / context / visual), normalizes each bucket by component count capped at 100, and computes slopIndex = 0.40 × boundary + 0.35 × context + 0.25 × visual. Returns the structured ProjectReport with all subscores, severity counts, and per-file scores.
- Threshold check (src/cli/threshold.ts)
  Calls thresholdExceeded(report, config), which compares report.slopIndex against config.thresholds.meanSlop. Returns true → exit code 1. Also checks per-category thresholds (categoryThresholds) if configured.
- Output rendering (src/report/)
  One module per format:
  - pretty.ts: human-readable terminal output with the composite breakdown tree
  - json.ts: serialized ProjectReport (full data)
  - sarif.ts: SARIF 2.1.0 for IDE/editor integration
  - html.ts: self-contained HTML report with score cards
  - markdown.ts: Markdown report for PR comments
  - heatmap.ts: migration ROI heatmap (top files by score)
  - unified-diff.ts: unified diff of the report
  - advice.ts: remediation suggestions
  - flywheel.ts: telemetry summary
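The de-duplication step in file discovery is easy to misread, so here is one possible interpretation as a sketch: an extension-less file is dropped when a sibling with the same path stem and a recognised source extension exists. Keying on the full path stem and not only the bare filename is an assumption, and both helper names are hypothetical.

```typescript
const SOURCE_EXTENSIONS: string[] = [".ts", ".tsx", ".js", ".jsx", ".vue", ".svelte", ".astro", ".html"];

// Splits "src/Button.tsx" into { stem: "src/Button", ext: ".tsx" }.
// Dotfiles such as ".env" are treated as having no extension.
function splitExt(path: string): { stem: string; ext: string } {
  const slash = path.lastIndexOf("/");
  const dot = path.lastIndexOf(".");
  return dot > slash + 1
    ? { stem: path.slice(0, dot), ext: path.slice(dot) }
    : { stem: path, ext: "" };
}

// Drops extension-less paths that duplicate a file with a source extension.
function dedupeExtensionless(paths: string[]): string[] {
  const stemsWithSourceExt: string[] = [];
  for (const p of paths) {
    const { stem, ext } = splitExt(p);
    if (SOURCE_EXTENSIONS.indexOf(ext) !== -1) stemsWithSourceExt.push(stem);
  }
  return paths.filter((p) => {
    const { stem, ext } = splitExt(p);
    return ext !== "" || stemsWithSourceExt.indexOf(stem) === -1;
  });
}

console.log(dedupeExtensionless(["src/Button", "src/Button.tsx", "src/Card"]));
```

In this sample src/Button is dropped in favour of src/Button.tsx, while src/Card has no sibling with an extension and is kept for content sniffing.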
File layout
src/
├── index.ts # Public facade (re-exports from ./cli/)
├── config.ts # Public config facade (re-exports from ./config/)
├── config/ # config/{defaults,presets,detect,load,init}
├── cli/ # CLI surface (Commander wiring + scan + init engines)
│ ├── program.ts # runCli — Commander setup, per-command .action() callbacks
│ ├── scan.ts # runScan, scanProject, watchProject, renderOutput
│ ├── init.ts # runInitWizard, runDoctor, init prompts
│ ├── options.ts # CLI option parsers (parseThreads, collectGlob, …)
│ ├── render.ts # colorForSlop, formatBadge, formatSparkline, …
│ └── threshold.ts # thresholdExceeded, stagedGating, filterIssues, …
├── engine/
│ ├── parser.ts # SWC/Vue/Svelte/Astro/HTML dispatch + extension-less fallback
│ ├── visitor.ts # AST walker → ScanFacts extraction (1313 lines — largest file)
│ ├── worker.ts # Per-file scan worker thread
│ ├── metrics.ts # Composite Slop Index aggregation
│ ├── logger.ts # Test-aware logging
│ ├── pool.ts # WorkerPool with work-stealing + retry
│ ├── executor.ts # Inline scan path for small file counts
│ ├── cache.ts # .slop-audit-cache.json + baseline.json
│ ├── memory.ts # run-history.json (--trend, --no-increase)
│ ├── telemetry.ts # Flywheel payloads
│ └── trend.ts # --trend sparkline builder
├── rules/ # Rule modules (42 built-in rules across 9 categories)
│ ├── arch/ # 1 rule — astro-island-leak
│ ├── component/ # 3 rules — giant-component, multiple-components-per-file,
│ │ shadcn-prop-mismatch
│ ├── context/ # 1 rule — import-path-mismatch
│ ├── layout/ # 4 rules — gap-monopoly, math-element-uniformity,
│ │ math-grid-uniformity, spacing-grid
│ ├── logic/ # 11 rules — boundary-violation, ghost-defensive,
│ │ key-prop-missing, math-any-density,
│ │ math-console-log-storm, math-gini-class-usage,
│ │ math-variable-name-entropy, optimistic-no-rollback,
│ │ qwik-hook-leak, reactive-hook-soup, zombie-state
│ ├── perf/ # 2 rules — cls-image, css-bloat
│ ├── typo/ # 5 rules — calc-fontsize, calc-raw-px, clamp-offscale,
│ │ math-button-label-uniformity, math-cta-vocabulary
│ ├── visual/ # 11 rules — arbitrary-escape, clamp-soup, generic-centering,
│ │ inline-style-dominance, math-color-cluster,
│ │ math-default-font, math-font-entropy,
│ │ math-gradient-hue-rotation, math-rounded-entropy,
│ │ math-spacing-entropy
│ ├── wcag/ # 4 rules — dragging-movements, focus-appearance,
│ │ focus-obscured, target-size
│ ├── builtins.ts # Auto-generated registry (pnpm generate:rules)
│ ├── rule.ts # createRule + RuleDefinition types
│ ├── registry.ts # RuleRegistry (loadBuiltins, loadProjects)
│ ├── registry-loader.ts # shadcn/ui registry snapshot cache
│ ├── project.ts # Project-level rules (runProjectRules)
│ ├── signal-strength.ts # --show-signal-strength lookup
│ └── signal-strength.json # Per-rule precision/recall measurements
├── report/ # Output formatters (pretty, json, sarif, html, …)
│ ├── pretty.ts, json.ts, sarif.ts, html.ts, markdown.ts
│ ├── advice.ts # --suggest output
│ ├── unified-diff.ts # --diff output
│ ├── heatmap.ts # --heatmap output
│ ├── flywheel.ts # flywheel summary
│ └── html/ # html/{utils,sections,static}.ts
├── fix/ # Auto-fix codemods
│ ├── index.ts # applyFixes orchestrator
│ ├── visual-codemod.ts # Round-20 visual codemods entry point
│ └── visual-codemods/ # tailwind.ts, jsx.ts, source.ts
├── snippet.ts # AI agent rule snippet generators (facade)
├── snippet/ # snippet/{data,render,generators,targets}
├── flywheel.ts # Flywheel state machine
├── mcp/ # MCP server (src/mcp/server.ts + tools)
├── research/ # research/generate, analyze, candidates, calibrate
├── config-validation.ts # Static config schema validator
├── discover.ts # File discovery + extension sniffing
├── git.ts # --staged / --changed / --since git integration
├── installer.ts # install/uninstall git pre-commit hook
├── explain.ts # `slopbrick explain <ruleId>` output
├── tokens.ts # W3C DTCG tokens.json parser
├── types.ts # All public types (ProjectReport, ScanFacts, Issue, …)
└── bin/ # bin/slopbrick.js entry point
MCP server (for AI agents)
slopbrick ships a Model Context Protocol server so AI coding agents can call it directly:
slopbrick mcp # JSON-RPC 2.0 over stdio
Exposes three tools:

| Tool | Args | Returns |
|---|---|---|
| `slop_scan_file` | `{path, framework?}` | issues + Slop Index for one file |
| `slop_explain_rule` | `{ruleId}` | rule metadata + rationale + file location |
| `slop_list_rules` | `{category?}` | all rules with category / severity / aiSpecific |

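For readers wiring up a client by hand, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client would send to invoke `slop_scan_file`. The `tools/call` method and the `{name, arguments}` params shape come from the MCP specification. The helper names and the file path are hypothetical. A real client also performs the `initialize` handshake first, which most MCP client libraries handle for you.

```typescript
// JSON-RPC 2.0 request envelope for an MCP tool call.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build a tools/call request for a named tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } };
}

// The stdio transport sends one JSON message per line.
function encodeMessage(request: ToolCallRequest): string {
  return JSON.stringify(request) + '\n';
}

// The path below is a made-up example file.
const scanRequest = buildToolCall(1, 'slop_scan_file', { path: 'src/components/Button.tsx' });
const wire = encodeMessage(scanRequest);
```

The resulting `wire` string is what would be written to the stdin of the `slopbrick mcp` process.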
Add to your MCP client config:
{
  "mcpServers": {
    "slopbrick": {
      "command": "npx",
      "args": ["slopbrick", "mcp"],
      "cwd": "/path/to/your/project"
    }
  }
}

AI agent rule snippets
Generate directive snippets that teach your AI agent the slop rules BEFORE it writes code:
slopbrick init --matrix # print the matrix table
slopbrick init --yes --agents-md # Codex / opencode / Pi / Cline / Gemini
slopbrick init --yes --claude-md # Claude Code
slopbrick init --yes --all # all targets at once

| Flag | File | Agent |
|---|---|---|
| `--cursor` | `.cursor/rules/slopbrick.mdc` | Cursor (new format) |
| `--cursorrules` | `.cursorrules` | Cursor (legacy format, deprecated) |
| `--agents-md` | `AGENTS.md` | OpenAI Codex / opencode / Pi / Cline / Continue / Gemini |
| `--claude-md` | `CLAUDE.md` | Claude Code (takes precedence over AGENTS.md) |
| `--aider` | `CONVENTIONS.md` | Aider |
| `--windsurf` | `.windsurfrules` | Windsurf (Cascade) |
| `--cline` | `.clinerules/AGENTS.md` | Cline (folder-based) |
| `--gemini` | `.gemini/GEMINI.md` | Gemini CLI |
| `--copilot` | `.github/copilot-instructions.md` | GitHub Copilot |

Content is generated live from the rule registry — always matches what slopbrick actually checks.
Adding new rules
Rule modules live in src/rules/<category>/<rule>.ts. Each module must export a const ending in Rule and a matching default export:
import { createRule } from '../rule';
import type { Issue, Rule, RuleContext, ScanFacts } from '../../types';
export const myRule = createRule<RuleContext>({
id: 'category/my-rule',
category: 'visual',
severity: 'medium',
aiSpecific: true,
description: 'Short one-line description used in `slopbrick rules` output.',
create(context) {
return context;
},
analyze(_context, facts: ScanFacts): Issue[] {
const issues: Issue[] = [];
// ... analyze facts ...
return issues;
},
});
export default myRule satisfies Rule<RuleContext>;
Run `pnpm generate:rules` to regenerate `src/rules/builtins.ts`. This runs automatically before `pnpm build` and `pnpm test`.
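To make the `// ... analyze facts ...` placeholder more concrete, here is a standalone sketch of what an `analyze` body might do with `staticClassNames`. It deliberately avoids importing `createRule` so it runs on its own. `FactsSketch`, `IssueSketch`, and `analyzeArbitraryPx` are hypothetical stand-ins, and treating `staticClassNames` as plain strings is an assumption, so check `src/types.ts` for the real shapes.

```typescript
// Hypothetical stand-ins for the types exported from src/types.ts.
interface IssueSketch {
  ruleId: string;
  message: string;
}

interface FactsSketch {
  staticClassNames: string[]; // assumed: one entry per className literal
}

// Matches Tailwind-style arbitrary pixel values such as p-[13px] or mt-[7px].
const ARBITRARY_PX = /\b[a-z-]+-\[\d+px\]/g;

function analyzeArbitraryPx(facts: FactsSketch): IssueSketch[] {
  const issues: IssueSketch[] = [];
  for (const literal of facts.staticClassNames) {
    // String.prototype.match returns null when nothing matches.
    for (const match of literal.match(ARBITRARY_PX) ?? []) {
      issues.push({
        ruleId: 'category/my-rule',
        message: `Arbitrary pixel value: ${match}`,
      });
    }
  }
  return issues;
}

const found = analyzeArbitraryPx({
  staticClassNames: ['flex p-[13px] gap-4', 'mt-2'],
});
```

In a real rule module the loop body would sit inside the `analyze` callback passed to `createRule`, and each issue would carry whatever fields the actual `Issue` type requires.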
Calibration against held-out human code
The tool ships with an automated calibration test (tests/integration/calibration.test.ts) that scans both corpora and asserts every AI-signal rule has a recall/FP ratio ≥ its threshold. It runs as part of pnpm test and exits non-zero on regression.
The corpora live at /Users/cheng/ai-slop-baseline/extracted/:
- `positive/`: 6,142 AI-generated samples (vibe-coded React apps).
- `negative/`: 54,980 human-written samples (shadcn/ui, calcom, dub, mantine, excalidraw, lobehub, etc.).
A rule with a recall/FP ratio above 1.0× is a useful AI tell. A ratio below 1.0× is an anti-signal — tighten it, scope-restrict it, or drop it.
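As a rough illustration of the ratio, the sketch below assumes it is defined as recall on the positive corpus divided by the false-positive rate on the negative corpus. The README does not spell the formula out, so treat that definition, the `RuleHits` shape, and the hit counts as assumptions. Only the corpus sizes come from the text above.

```typescript
// Hypothetical per-rule counts gathered from scanning both corpora.
interface RuleHits {
  positiveHits: number;  // AI-generated samples the rule flagged
  positiveTotal: number; // size of the positive corpus
  negativeHits: number;  // human-written samples the rule flagged
  negativeTotal: number; // size of the negative corpus
}

// Assumed definition: recall divided by false-positive rate.
function recallFpRatio(hits: RuleHits): number {
  const recall = hits.positiveHits / hits.positiveTotal;
  const fpRate = hits.negativeHits / hits.negativeTotal;
  if (fpRate === 0) {
    // No false positives: infinitely discriminating if anything was caught.
    return recall > 0 ? Infinity : 0;
  }
  return recall / fpRate;
}

// Made-up hit counts against the corpus sizes quoted above.
const exampleRatio = recallFpRatio({
  positiveHits: 614,
  positiveTotal: 6142,
  negativeHits: 1100,
  negativeTotal: 54980,
});
const isUsefulTell = exampleRatio > 1.0;
```

Under this definition a rule that fires on roughly 10% of AI samples but only 2% of human samples lands near 5×, comfortably above the 1.0× line.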
Glossary
- Slop Index: 0–100 composite score per Phase 2 §10. Lower is better. Weighted average of boundary (40%), context (35%), and visual (25%) subscores.
- Assembly Health: Inverse of Slop Index. Higher is better.
- Composite Slop Index: Phase 2 §10's weighted three-bucket formula.
- AI-specific rule: Rule that catches patterns AI defaults to but humans rarely do (e.g. `bg-violet-500`, "Get started today", badge-above-h1 layout).
- General rule: Catches real bugs or code-quality issues regardless of author.
- brick.config.json: Project config (in `slopbrick.config.mjs`) listing allowed import paths for `context/import-path-mismatch`.
- RSC / Server component: React Server Component. Runs on the server, can't use `useState`/`useEffect`. The fix is `'use client'`.
- Memoization: React skips re-renders if inputs haven't changed. Inline handlers break memoization because they're new functions on every render.
- Astro island: Interactive component inside an otherwise static Astro page. Without a `client:*` directive, clicks won't fire.
- DTCG tokens: W3C Design Token Community Group JSON format. `slopbrick tokens <path>` reads these.
- MCP: Model Context Protocol. JSON-RPC 2.0 over stdio for AI agent integration.
Development
pnpm install
pnpm typecheck
pnpm test
pnpm build
Adding a rule? Update `tests/integration/calibration.test.ts` to add a calibration entry. The corpus test will verify your rule discriminates AI from human code.