feat(agents): PR review engine — scope filter, schema, audit trail by 0xbulma · Pull Request #716 · morpho-org/sdks

0xbulma · 2026-05-21T20:05:15Z

Motivation

The in-repo .agents/ PR review automation (caller commands + shared lib/pr-review-base.md + 9 personas, anchored to AGENTS.md §10) has been useful but had structural gaps that produced reviewer noise:

Findings on lines the diff didn't touch — the scope filter only dropped by file, so agents flagging pre-existing untyped errors / missing JSDoc on unchanged lines of touched files got through.
Free-form descriptions reviewers can't act on — descriptions drifted between "this is a bug" and "consider X."
Silent drops — when the scope filter dropped a finding, the user had no signal.
Markdown documentation false-positives — agents regularly flagged example code inside .md fenced blocks.
Determinism scattered across English prose — diff math and edge cases (renames, deletion-only hunks) re-derived from English at every run.
One coarse conditional persona (ci-release-security) fired whenever any of CI / release / dependency files were touched.
No versioning on engine or personas.

TIB-2026-05-21 frames the gaps and the absorption plan, with the explicit goal of keeping the Morpho-specific persona inventory (morpho-protocol, web3-security, etc.) and AGENTS.md §X anchorage intact — the specificity that makes this reviewer useful and that an off-the-shelf generic agent set cannot replicate.

Solution

All five phases of the TIB land here (ship-them-all-in-one PR per request):

Phase 1 — version: semver frontmatter on every persona + full engine frontmatter (kind: engine, disable-model-invocation: true) on pr-review-base.md.
Phase 2 — Two TypeScript scripts (Node stdlib only — no tsx, no Python):
- .agents/lib/scripts/build-changed-lines.ts — parses git diff --unified=0 into a CHANGED_LINES map. Handles deletion-only hunks, pure renames, file deletions.
- .agents/lib/scripts/validate-findings.ts — applies WHAT/FIX schema check + file-out-of-scope drop + ±15-line tolerance window + Markdown fenced-block FP filter, emits {kept, dropped, failed, counts}.
- .agents/references/{changed-lines,scope-filter,calibration}.md — authoritative documentation for the edge cases and the tolerance rationale.
Phase 3 — WHAT: / FIX: description schema enforced; malformed routes to FAILED_AGENTS (never silently dropped). <DROPPED_FINDINGS> in the output contract; rendered as a collapsible <details> block in pr-review-gh, terminal counters in pr-review-local, and /tmp/ JSON for pr-review-ci (keeps the formal review body tight).
Phase 4 — ci-release-security split into three narrowly-triggered conditional personas mirroring §10 sub-sections: ci-security (<HAS_WORKFLOWS>), release-integrity (<HAS_RELEASE>), dependencies (<HAS_DEPS>). <HAS_CI_RELEASE> retired. <EXCLUDE_AGENTS> added to the caller→engine contract as a forward-compat hook. AGENTS.md §10 inventory tables updated; rules themselves unchanged.
Phase 5 — .agents/test/pr-review-engine.test.ts (Node stdlib only) asserts five structural invariants: persona frontmatter shape, conditional-trigger ↔ engine-flag wiring, engine frontmatter, script parse-ability, AGENTS.md §10 ↔ on-disk persona match. Wired as pnpm test:agents.

Preserved (the "specificity & anchorage"):

8 Morpho-specific baseline personas unchanged
applies: AGENTS.md §X anchorage on every persona
<HAS_PROTOCOL_SURFACE> + Step 4 protocol source-of-truth context (ABI / address / constant excerpts)
AGENTS.md §10 rules

Not adopted (out of scope for a TS SDK monorepo): React/Next/Tailwind/styling/accessibility/AI-SDK/runtime-validation personas, SessionStart rubric installer, Claude Code plugin marketplace layout.

Two TIB addenda document operational changes that surfaced during execution: dropped tsx devDep in favour of native Node 24 type stripping; deferred Phase-2 unit tests in favour of the Phase-5 invariant test + dev-time end-to-end smoke runs.

Verification

$ pnpm test:agents
✓ persona frontmatter (11 files)
✓ conditional triggers wired (3 flags)
✓ engine frontmatter (pr-review-base.md)
✓ bundled scripts parse (2 files)
✓ AGENTS.md §10 ↔ on-disk personas match (11 files)

All PR review engine invariants OK.

Biome clean on the new TS files; strict-TS typecheck clean; end-to-end smoke runs on this PR's actual diff confirm the scope filter behaves correctly (out-of-scope file → dropped, line outside ±15 → dropped with distance_to_nearest_changed_line, missing WHAT: clause → routed to failed[]).

No published-package contracts affected — no changeset required per AGENTS.md §7.

….agents/ Adopts CHANGED_LINES + ±15 line tolerance, WHAT/FIX schema, DROPPED_FINDINGS audit trail, bundled TS scripts, references/ directory, ci-release-security persona split, versioned frontmatter, and an invariant test — while preserving the Morpho-specific persona inventory and AGENTS.md §X anchorage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- .agents/lib/pr-review-base.md: full frontmatter with kind: engine, version: 0.1.0, disable-model-invocation: true. - Every .agents/personas/*.md: version: 1.0.0 (initial baseline). No behavior change. Sets up the version-tracking contract that .agents/test/pr-review-engine.test.sh (Phase 5) will assert on. TIB: TIB-2026-05-21 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…eferences - .agents/lib/scripts/build-changed-lines.ts: parses `git diff --unified=0` from stdin, emits CHANGED_LINES map. Handles deletion-only hunks, pure renames (empty array), and file deletions (absent from map). - .agents/lib/scripts/validate-findings.ts: applies WHAT/FIX schema check, file-out-of-scope drop, ±15-line tolerance window, and Markdown fenced-code-block FP filter in one pass. Emits {kept,dropped,failed,counts}. - .agents/references/{changed-lines,scope-filter,calibration}.md: authoritative documentation for the edge cases and the tolerance rationale. Linked from pr-review-base.md and from the per-agent envelope. - .agents/lib/pr-review-base.md: Step 3 now pipes `git diff --unified=0` into build-changed-lines.ts; Step 6.1 routes findings through validate-findings.ts; Step 5 envelope contract injects CHANGED_LINES. - biome.json: narrow `!**/lib` exclude to `!packages/**/lib` so the engine scripts under .agents/lib/scripts/ are linted (build outputs in packages/*/lib remain excluded). Both scripts: Node stdlib only, no new deps. Strict TS, NodeNext, zero `any`. Native Node 24 strips types — no tsx, no transpiler. TIB: TIB-2026-05-21 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- pr-review-base.md Step 5: per-agent contract now mandates `WHAT:` and `FIX:` clauses in every finding's `description`. Missing clauses route to `failed[]` (counted in <FAILED_AGENTS>) — never silently dropped. - pr-review-base.md Step 5: `line` schema spelled out — positive integer inside CHANGED_LINES for the file, or within ±15 lines. - pr-review-base.md Step 6.2: agent-failure list now includes the schema failures (missing WHAT/FIX, malformed shape) routed via Step 6.1. - pr-review-base.md output contract: adds <DROPPED_FINDINGS> (with drop_reason and distance_to_nearest_changed_line) and <DROPPED_COUNTS>. - pr-review-gh.md: renders dropped findings as a collapsible <details> audit-trail block when DROPPED_FINDINGS is non-empty; writes full JSON to /tmp/pr-review-gh-<PR>-dropped.json regardless. - pr-review-local.md: prints a terminal audit-trail block (counters only) between findings and sentinel; writes full JSON to /tmp/. - pr-review-ci.md: writes dropped JSON to /tmp/ for artifact upload; does NOT render in the formal GitHub review body — keeps the verdict tight. TIB: TIB-2026-05-21 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…personas ci-release-security.md was a single combined persona firing whenever any of CI / release / dependency files were touched. Split into three narrowly-triggered personas, each anchored to a discrete §10 sub-rule cluster: - ci-security.md (trigger: <HAS_WORKFLOWS>) — workflow injection, pull_request_target + PR-head checkout, ACL gating, action pinning, permissions: scopes, secret exposure. - release-integrity.md (trigger: <HAS_RELEASE>) — publish-flow integrity (--provenance, org tokens), release-commit signing & write-token hardening, Changesets / release-bot wiring. - dependencies.md (trigger: <HAS_DEPS>) — lockfile drift, new-dep hygiene (postinstall, typosquats, unpinned ranges), .npmrc and pnpm-workspace.yaml hardening. Engine prose: - pr-review-base.md Step 4: <HAS_CI_RELEASE> retired; replaced with <HAS_WORKFLOWS>, <HAS_RELEASE>, <HAS_DEPS>. - pr-review-base.md inputs table: adds <EXCLUDE_AGENTS> (forward-compat hook for future orchestrators). - pr-review-base.md Step 5: new sub-step 3 applies the EXCLUDE_AGENTS filter before launching personas in parallel. - pr-review-base.md Step 5 inventory: three new conditional rows replace the single ci-release-security row. AGENTS.md §10: - Inventory table: three rows replace the single conditional row. - Header note: persona-split + bundled-scripts + references/ directory. - Section title and trailing "Applied by personas" pointer updated to name all three personas. - The §10 rules themselves are UNCHANGED — only the inventory tables move. Cross-persona out-of-scope references in code-quality, style-conventions, test-coverage, web3-security, and pr-fix.md are updated to point at the new persona names (no rule changes). TIB: TIB-2026-05-21 (Phase 4 of 5) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Asserts five structural invariants the rest of the engine relies on: 1. Every persona file has parseable frontmatter with the required fields (name, kind, version, applies). `version` is semver. `kind` is baseline or conditional. Conditionals also declare `trigger:`. 2. Every conditional persona's `trigger:` flag is computed in pr-review-base.md Step 4. Catches the easy-to-miss drift where someone adds a new conditional persona but forgets to wire the <HAS_*> flag in the engine. 3. The engine (.agents/lib/pr-review-base.md) carries the `disable-model-invocation: true` + `kind: engine` + semver `version` frontmatter — the contract that hides it from the slash-command surface. 4. Both bundled scripts (.agents/lib/scripts/{build-changed-lines, validate-findings}.ts) exist and parse-execute under native Node. Cheap smoke test that catches obvious regressions without spawning the scripts for real-world output. 5. AGENTS.md §10 persona-inventory tables ↔ on-disk personas match in both directions. No persona orphaned in either place. Wired as `pnpm test:agents`. Bash-free, Node stdlib only, no test runner — exits non-zero on any violation. Advisory initially; flip to a blocking CI step once stable. TIB: TIB-2026-05-21 (Phase 5 of 5) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two operational updates that surfaced during execution of TIB-2026-05-21 (decisions unchanged — TypeScript scripts, deterministic scope filter, audit trail, etc. all land as originally accepted): 1. Drop tsx devDep in favour of native Node 24 type stripping. Node's built-in --experimental-strip-types (stable since 22.6) handles the scripts' subset; no transpiler dep needed. Engine prose, references, and Phase-5 test all invoke `node <script>.ts`. 2. Skip Phase-2 colocated unit tests; rely on the Phase-5 invariant test and dev-time end-to-end smoke runs. Scripts are thin CLI wrappers; adding vitest project-glob wiring for a 2-file directory wasn't worth the config surface. Follow-up TIP can add them if scripts grow. TIB: TIB-2026-05-21 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@0xbulma

Per maintainer request, remove citations to the specific external Claude Code review-engine plugin (its GitHub repo URL was named throughout). The architectural learnings stand on their own — we describe them as patterns adopted from "a separate Claude Code review-engine plugin" without naming the source repo. - Rename: TIB-2026-05-21-pull-claude-skills-learnings-... → TIB-2026-05-21-pr-review-engine-scope-filter-and-audit-trail.md. - H1 + body prose: drop the repo namedrops and URL links. - References section: remove external-repo URLs; add pointers to the in-repo artifacts the TIB produced (scripts/, references/). - @0xbulma as author handle is normal attribution and stays. Decision and content of the TIB are unchanged. Only the framing prose that identified the external repo by name was scrubbed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Group the engine, agents, references, scripts, and test under a single directory matching the Anthropic Skills convention (without going full plugin). Decision unchanged — same engine, same agents, same scope-filter contract; only the on-disk layout moves. Old → new: .agents/lib/pr-review-base.md → .agents/pr-review-engine/SKILL.md .agents/personas/ → .agents/pr-review-engine/agents/ .agents/references/ → .agents/pr-review-engine/references/ .agents/lib/scripts/ → .agents/pr-review-engine/scripts/ .agents/test/ → .agents/pr-review-engine/test/ Within the engine: - pr-review-base.md → SKILL.md - pr-review-engine.test.ts → invariants.test.ts (more accurate name — it asserts engine invariants, not a generic unit-test) Cross-reference updates: - AGENTS.md §10 backlinks (24 path occurrences) → new paths. - .agents/commands/pr-review-{ci,gh,local}.md and pr-fix.md → new paths. - Internal links in agents/*.md (was personas/) — `../../AGENTS.md` → `../../../AGENTS.md` (one extra level up). - references/*.md — `../lib/scripts/` → `../scripts/`, `../lib/pr-review-base.md` → `../SKILL.md` (now sibling under pr-review-engine/). - invariants.test.ts internal constants: PERSONAS_DIR → AGENTS_DIR, BASE_PATH → SKILL_PATH, plus updated REPO_ROOT depth and the regex that scans AGENTS.md persona-link URLs. - package.json test:agents → node .agents/pr-review-engine/test/invariants.test.ts - documentation.md persona body → references new paths. Engine frontmatter: name pr-review-base → pr-review-engine, version 0.1.0 → 0.2.0. Persona vocabulary kept in prose (matches AGENTS.md's "Applied by personas: …" backlinks); only the directory rename adopts the convention name "agents". Verification: $ pnpm test:agents ✓ persona frontmatter (11 files) ✓ conditional triggers wired (3 flags) ✓ engine frontmatter (SKILL.md) ✓ bundled scripts parse (2 files) ✓ AGENTS.md §10 ↔ on-disk personas match (11 files) Biome clean on the 3 TS files in the new location. TIB addendum (2026-05-21) added to document the post-hoc layout change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per maintainer request: every script gets a colocated test. Adds 99 cases across three vitest files, all running under the new pr-review-engine project glob in vitest.config.ts: scripts/build-changed-lines.test.ts (12 cases) - Empty diff → {} ; single + multi hunks → sorted-deduped lists. - Default count=1 when omitted in the hunk header. - Deletion-only hunks (+0) → file in map with []. - Pure renames + rename-with-content → new path captured. - File deletions (+++ /dev/null) → absent from map. - New files (post-image lines starting at 1) → captured. - Multiple files → separate keys. - DiffParseError class is exported, named, and round-trips offendingLine. scripts/validate-findings.test.ts (32 cases) - findFencedBlocks: no fence / matched ``` / matched ~~~ / unclosed / multiple / indented (≤3 spaces). - Schema rejections: missing WHAT, missing FIX, invalid severity, non-positive line, non-integer line, missing description, missing or empty file, non-object findings. All route to failed[]. - Scope filter: kept / file_out_of_scope / line_pre_existing (with distance_to_nearest_changed_line) / ±15 adjacent tolerance / pure-rename short-circuit / path normalization (./ a/ b/ prefixes). - Markdown fence FP filter: line inside fence / outside / on delimiter line / non-.md skip / missing-file fallback. - Counts: severity-by-severity tally, dropped findings excluded. - FindingsParseError class is exported and named. test/invariants.test.ts (55 cases — many test.each over 11 personas) - Converted from a custom node:test-style script to vitest. - Persona frontmatter: required fields, valid kind, semver version, conditional iff trigger. - Conditional trigger flags exist in SKILL.md Step 4 (parameterised). - Engine frontmatter: kind=engine, disable-model-invocation=true, semver version, parseable. - Bundled scripts import cleanly and expose their public surface (stronger than the prior `node --check` parse-only smoke). - AGENTS.md §10 ↔ on-disk personas: bidirectional inventory match. Plumbing: - vitest.config.ts: new `pr-review-engine` project entry. - package.json test:agents: `vitest run --project=pr-review-engine` (was: a one-shot node script invocation). Bare `pnpm test` now also picks the engine tests up automatically. TIB addendum (2026-05-21) updated: the prior "tests deferred" addendum is superseded by this one — colocation rule from AGENTS.md §5 is met. Verification: 99 / 99 passing, ~200ms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

0xbulma and others added 7 commits May 21, 2026 15:39

0xbulma added the enhancement New feature or request label May 21, 2026

0xbulma self-assigned this May 21, 2026

0xbulma changed the title ~~feat(agents): pull claude-skills learnings into PR review engine~~ feat(agents): PR review engine — scope filter, schema, audit trail May 21, 2026

0xbulma and others added 2 commits May 21, 2026 16:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(agents): PR review engine — scope filter, schema, audit trail#716

feat(agents): PR review engine — scope filter, schema, audit trail#716
0xbulma wants to merge 10 commits into
mainfrom
tib-pr-review-skill-update

0xbulma commented May 21, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

0xbulma commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Solution

Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

0xbulma commented May 21, 2026 •

edited

Loading