feat(skills): v9 Pattern Catalog + applyPaths auto-load#808
feat(skills): v9 Pattern Catalog + applyPaths auto-load#808suryaiyer95 wants to merge 2 commits into
Conversation
Skill content (the body of this change):
- .opencode/skills/dbt-troubleshoot/SKILL.md
- .opencode/skills/dbt-troubleshoot/references/pattern-catalog.md
- .opencode/skills/dbt-develop/SKILL.md
- .opencode/skills/dbt-develop/references/pattern-catalog.md
- packages/opencode/src/session/prompt/anthropic.txt
Adds a pattern catalog with six concrete recipes for the recurring
failure shapes in ADE-Bench, plus a pre-finish hard-stop checklist:
P1: dbt_utils.surrogate_key deprecation (rename + treat_nulls var,
triggers on build warnings too)
P2: Missing periods in time-series (date spine + LEFT JOIN)
P3: Source-direct refactor (register sources.yml)
P4: Create model from column list (no formula given)
P4-extra: \"Add details\" / underspecified joins (SELECT * EXCLUDE)
P5: Package upgrade type errors (override at project level)
P6: Rolling N-day windows (warm-up NULL for first N-1 rows)
Step 5d Pre-finish Hard-Stop Checklist requires the agent to echo,
before declaring done, per-imperative coverage + row-count probe
(Step 5b) + ref/source audit (Step 5c) + ERROR=0.
Anthropic.txt adds: turn-1 TodoWrite rule, anti-dismissal rule
(never blame data/test/seed), anti-loop circuit-breaker.
Mechanism (enables auto-load of these skills):
- packages/opencode/src/skill/skill.ts: alwaysApply + applyPaths frontmatter
- packages/opencode/src/session/system.ts: collectAutoLoadedSkills
Skills with applyPaths matching files in the worktree are inlined
into the system prompt at session start. dbt-troubleshoot and
dbt-develop are auto-loaded when dbt_project.yml is present.
Benchmark results (ADE-Bench upstream snowflake, single attempt):
v6 baseline (sonnet, no catalog): 21/43 (49%)
v7 (sonnet, clean Iron Rules): 24/43 (56%)
v9 (this PR): 29/43 (67%) on common 43
48/69 (70%) on full upstream
Cost: $55.82, 107.8M tokens, 1763 turns. Caching reduced effective
cost by ~7x (cache_read ~99.998% of tokens).
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
📝 WalkthroughWalkthroughThis PR implements skill auto-loading—allowing skills to inject themselves into the system prompt when marked ChangesSkill Auto-loading with dbt Pattern Catalogs
Sequence Diagram(s)The two diagrams embedded in the review stack artifact (dbt-develop workflow and dbt-troubleshoot verification) visualize the multi-step verification and pattern-driven agent behaviors introduced by this PR. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes The PR combines code changes (skill schema extension, system prompt auto-load wiring) with extensive documentation updates (two skill files and two pattern catalogs). Code complexity is moderate—glob scanning, zod schema updates, and prompt injection logic are straightforward but require careful review of the auto-load precedence and error handling. Documentation is dense and prescriptive, introducing six detailed patterns per skill with step-by-step playbooks; reviewers must verify pattern consistency between dbt-develop and dbt-troubleshoot catalogs and confirm that all patterns are actionable and non-contradictory. No major algorithmic or architectural changes; primary effort is in reading/validating the new guidance content.
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
1 similar comment
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/opencode/src/session/system.ts (1)
74-80:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAuto-load matching is currently gated by env-based skill filtering.
When
env_fingerprint_skill_selectionis on, applicablealwaysApply/applyPathsskills can be excluded before auto-load runs, which breaks the auto-load contract.Suggested fix
- const autoLoaded = await collectAutoLoadedSkills(filtered) + const autoLoaded = await collectAutoLoadedSkills(list)Also applies to: 99-99
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opencode/src/session/system.ts` around lines 74 - 80, The env-based filtering currently removes skills before auto-load, breaking the auto-load contract; when cfg.experimental?.env_fingerprint_skill_selection is true, ensure skills that require auto-loading are preserved by merging back any skills from the original list that have alwaysApply or applyPaths into the filtered result. Specifically, after calling selectSkillsWithLLM(list, Fingerprint.get()) (or if you instead run select first), augment the resulting "filtered" set with any entries from the original "list" where Skill.Info.alwaysApply is truthy or Skill.Info.applyPaths is non-empty so auto-load behavior remains intact while still applying env-based selection.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.opencode/skills/dbt-develop/SKILL.md:
- Around line 45-53: The fenced checklist block in SKILL.md is missing a
language tag which triggers MD040; update the triple-backtick fence that wraps
the checklist to include a language identifier (e.g., add "text" so it becomes
```text) for that specific checklist block so the linter recognizes the fence
language; ensure you only modify the opening fence and leave the checklist lines
and closing fence intact.
In @.opencode/skills/dbt-troubleshoot/SKILL.md:
- Around line 34-35: Clarify that Rule 4's "stop at ERROR=0" applies to general
warnings, but exceptions listed as mandatory (e.g., P1 surrogate_key
deprecation) must still be acted on even when `dbt build` reports `ERROR=0`;
update SKILL.md to state that deprecation/migration patterns like
`surrogate_key` are treated as actionable regardless of ERROR=0 and should
trigger the required fix/migration, and add one sentence noting to document any
judgment call in a one-line SQL comment when applying such mandatory migrations.
- Around line 254-263: The fenced checklist block in SKILL.md is missing a
language tag (MD040); update the opening fence for that block (the
triple-backtick that precedes the checklist lines starting "- [imperative `#1`
from prompt] …") to include a language identifier such as text (e.g., ```text)
so the markdown linter recognizes the code fence; leave the checklist contents
unchanged and ensure the closing triple-backtick remains.
In `@packages/opencode/src/session/system.ts`:
- Around line 139-150: The glob-scan errors are being swallowed inside
anyMatchInWorktree so collectAutoLoadedSkills (the try/catch around
anyMatchInWorktree used in applyPaths and the similar block at lines 166-173)
never hits its catch and failures become silent; update anyMatchInWorktree to
stop converting Glob.scan errors into a silent empty result (either remove the
try/catch that returns [] or rethrow the caught error) so that callers like
collectAutoLoadedSkills can catch and log the real error, ensuring applyPaths
failures produce the existing autoLoadLog.warn with the actual error details.
---
Outside diff comments:
In `@packages/opencode/src/session/system.ts`:
- Around line 74-80: The env-based filtering currently removes skills before
auto-load, breaking the auto-load contract; when
cfg.experimental?.env_fingerprint_skill_selection is true, ensure skills that
require auto-loading are preserved by merging back any skills from the original
list that have alwaysApply or applyPaths into the filtered result. Specifically,
after calling selectSkillsWithLLM(list, Fingerprint.get()) (or if you instead
run select first), augment the resulting "filtered" set with any entries from
the original "list" where Skill.Info.alwaysApply is truthy or
Skill.Info.applyPaths is non-empty so auto-load behavior remains intact while
still applying env-based selection.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 605c2dfd-ea3b-4f31-83c2-df82eab47624
📒 Files selected for processing (7)
.opencode/skills/dbt-develop/SKILL.md.opencode/skills/dbt-develop/references/pattern-catalog.md.opencode/skills/dbt-troubleshoot/SKILL.md.opencode/skills/dbt-troubleshoot/references/pattern-catalog.mdpackages/opencode/src/session/prompt/anthropic.txtpackages/opencode/src/session/system.tspackages/opencode/src/skill/skill.ts
| ``` | ||
| - [imperative #1 from prompt] → [file created / column added] | ||
| - [imperative #2 from prompt] → ... | ||
| - [imperative #N from prompt] → ... | ||
| - Every named column in the spec is in the final SELECT: [yes/no — list any missing] | ||
| - Step 5b row-count probe on each created/modified model: [yes/no] | ||
| - New models registered in nearest schema.yml: [yes/no] | ||
| - Full `altimate-dbt build` reports ERROR=0: [yes/no] | ||
| ``` |
There was a problem hiding this comment.
Add a language tag to the fenced checklist block.
This fence is missing a language identifier, which triggers MD040 and can fail markdown linting in CI.
Proposed fix
-```
+```text
- [imperative `#1` from prompt] → [file created / column added]
- [imperative `#2` from prompt] → ...
- [imperative `#N` from prompt] → ...
- Every named column in the spec is in the final SELECT: [yes/no — list any missing]
- Step 5b row-count probe on each created/modified model: [yes/no]
- New models registered in nearest schema.yml: [yes/no]
- Full `altimate-dbt build` reports ERROR=0: [yes/no]</details>
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 45-45: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.opencode/skills/dbt-develop/SKILL.md around lines 45 - 53, The fenced
checklist block in SKILL.md is missing a language tag which triggers MD040;
update the triple-backtick fence that wraps the checklist to include a language
identifier (e.g., add "text" so it becomes ```text) for that specific checklist
block so the linter recognizes the fence language; ensure you only modify the
opening fence and leave the checklist lines and closing fence intact.
</details>
<!-- fingerprinting:phantom:triton:hawk -->
<!-- 4e71b3a2 -->
<!-- This is an auto-generated comment by CodeRabbit -->
| 4. **Done = green build, nothing more.** When the originally-reported error is resolved AND `dbt build` reports `ERROR=0`, the task is complete. Do not refactor unrelated models, clean up YAML warnings, remove orphaned doc entries, or add tests/docs that the user did not ask for. Warnings ≠ errors. Stop when the build is green. | ||
| 5. **Decide and act — never pause to ask the user.** When the data or the prompt is ambiguous (duplicate keys, multiple plausible interpretations, missing context about a business rule), you do not have an interactive user to consult — the original request is the only message you will receive. Make the most defensible call from what you can see: the prompt's explicit constraints first, then the project's existing patterns, then the data's actual shape (`column-values`, `count(*)`, `min/max`). Document the assumption in a one-line comment in the SQL if it's truly judgmental. Do **not** write "I'll ask the user" or "should I…" or "let me know if…" — those phrases waste the entire trial. The job is to ship a working, defensible fix; ambiguity gets resolved by you, not by waiting. |
There was a problem hiding this comment.
Resolve conflict between “warnings ≠ errors” and warning-triggered mandatory patterns.
Rule 4 says to stop at ERROR=0 and treat warnings as non-actionable, while P1 explicitly treats surrogate_key warnings as mandatory migration triggers. This ambiguity can cause the agent to skip required deprecation fixes in warning-only runs.
Proposed clarification
-4. **Done = green build, nothing more.** ... Warnings ≠ errors. Stop when the build is green.
+4. **Done = green build, nothing more (except mandatory pattern-catalog warning migrations).** ...
+General warnings are non-blocking, but warnings that match a mandatory pattern recipe (e.g., P1 surrogate_key deprecation) must be fully applied before done.Also applies to: 49-50
🧰 Tools
🪛 LanguageTool
[style] ~35-~35: Consider using polite language here.
Context: ... "I'll ask the user" or "should I…" or "let me know if…" — those phrases waste the entire t...
(INSERT_PLEASE)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.opencode/skills/dbt-troubleshoot/SKILL.md around lines 34 - 35, Clarify
that Rule 4's "stop at ERROR=0" applies to general warnings, but exceptions
listed as mandatory (e.g., P1 surrogate_key deprecation) must still be acted on
even when `dbt build` reports `ERROR=0`; update SKILL.md to state that
deprecation/migration patterns like `surrogate_key` are treated as actionable
regardless of ERROR=0 and should trigger the required fix/migration, and add one
sentence noting to document any judgment call in a one-line SQL comment when
applying such mandatory migrations.
| ``` | ||
| - [imperative #1 from prompt] → [file you edited / config you set] | ||
| - [imperative #2 from prompt] → [file you edited / config you set] | ||
| - [imperative #N from prompt] → [file you edited / config you set] | ||
| - Pattern matched from catalog: [P1/P2/P3/P4/P5 or "none"] | ||
| - If pattern matched, all recipe steps applied: [yes/no] | ||
| - Step 5b row-count probe on each named model: [yes/no — list models] | ||
| - Step 5c ref/source audit (if refactor): [yes/no/n.a.] | ||
| - Full `altimate-dbt build` reports ERROR=0: [yes/no] | ||
| ``` |
There was a problem hiding this comment.
Add a language tag to the checklist fenced block.
This fence is missing a language identifier (MD040).
Proposed fix
-```
+```text
- [imperative `#1` from prompt] → [file you edited / config you set]
- [imperative `#2` from prompt] → [file you edited / config you set]
- [imperative `#N` from prompt] → [file you edited / config you set]
- Pattern matched from catalog: [P1/P2/P3/P4/P5 or "none"]
- If pattern matched, all recipe steps applied: [yes/no]
- Step 5b row-count probe on each named model: [yes/no — list models]
- Step 5c ref/source audit (if refactor): [yes/no/n.a.]
- Full `altimate-dbt build` reports ERROR=0: [yes/no]</details>
<!-- suggestion_start -->
<details>
<summary>📝 Committable suggestion</summary>
> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 254-254: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.opencode/skills/dbt-troubleshoot/SKILL.md around lines 254 - 263, The
fenced checklist block in SKILL.md is missing a language tag (MD040); update the
opening fence for that block (the triple-backtick that precedes the checklist
lines starting "- [imperative `#1` from prompt] …") to include a language
identifier such as text (e.g., ```text) so the markdown linter recognizes the
code fence; leave the checklist contents unchanged and ensure the closing
triple-backtick remains.
| try { | ||
| const matched = await anyMatchInWorktree(globs) | ||
| if (matched) { | ||
| out.push(skill) | ||
| autoLoadLog.info("skill auto-loaded by applyPaths", { | ||
| skill: skill.name, | ||
| globs, | ||
| }) | ||
| } | ||
| } catch (err) { | ||
| autoLoadLog.warn("applyPaths glob scan failed", { skill: skill.name, err }) | ||
| } |
There was a problem hiding this comment.
Glob scan failures are swallowed before they can be logged.
anyMatchInWorktree catches Glob.scan errors and returns an empty list, so the warning path in collectAutoLoadedSkills is bypassed and applyPaths failures become silent false negatives.
Suggested fix
- const matches = await Glob.scan(g, {
+ const matches = await Glob.scan(g, {
cwd: root,
absolute: true,
include: "file",
dot: false,
symlink: false,
- }).catch(() => [] as string[])
+ })
if (matches.length > 0) return trueAlso applies to: 166-173
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/opencode/src/session/system.ts` around lines 139 - 150, The
glob-scan errors are being swallowed inside anyMatchInWorktree so
collectAutoLoadedSkills (the try/catch around anyMatchInWorktree used in
applyPaths and the similar block at lines 166-173) never hits its catch and
failures become silent; update anyMatchInWorktree to stop converting Glob.scan
errors into a silent empty result (either remove the try/catch that returns []
or rethrow the caught error) so that callers like collectAutoLoadedSkills can
catch and log the real error, ensuring applyPaths failures produce the existing
autoLoadLog.warn with the actual error details.
There was a problem hiding this comment.
Pull request overview
This PR introduces a dbt-focused “Pattern Catalog” for the dbt-troubleshoot and dbt-develop skills, and adds an applyPaths / alwaysApply mechanism so relevant skills can be auto-inlined into the system prompt when matching files exist in the project worktree.
Changes:
- Added
alwaysApply/applyPathsfrontmatter fields to the skill metadata schema and skill-loading pipeline. - Implemented system-prompt auto-loading of skill bodies based on unconditional (
alwaysApply) or worktree glob matches (applyPaths). - Expanded dbt skill content with a shared pattern catalog + pre-finish checklist and added new Anthropic prompt guidance (TodoWrite-first, anti-loop, etc.).
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
packages/opencode/src/skill/skill.ts |
Extends skill metadata schema to carry alwaysApply/applyPaths from frontmatter (including embedded builtin skills). |
packages/opencode/src/session/system.ts |
Adds auto-load selection + inlines matching skill bodies into the system prompt. |
packages/opencode/src/session/prompt/anthropic.txt |
Strengthens agent operating instructions (TodoWrite-first, anti-dismissal, anti-loop). |
.opencode/skills/dbt-troubleshoot/SKILL.md |
Adds applyPaths and expands troubleshooting rules, checklist, and pattern-catalog entrypoint. |
.opencode/skills/dbt-troubleshoot/references/pattern-catalog.md |
New detailed pattern catalog for common dbt failure shapes (P1–P6). |
.opencode/skills/dbt-develop/SKILL.md |
Adds applyPaths, pattern-catalog entrypoint, checklist, and additional development rules. |
.opencode/skills/dbt-develop/references/pattern-catalog.md |
New detailed pattern catalog for dbt development tasks (P1–P6). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| async function anyMatchInWorktree(globs: string[]): Promise<boolean> { | ||
| // Search from worktree root so a skill that wants `dbt_project.yml` | ||
| // catches the file no matter how deep the user's cwd is. | ||
| const root = Instance.worktree | ||
| for (const g of globs) { | ||
| const matches = await Glob.scan(g, { | ||
| cwd: root, | ||
| absolute: true, | ||
| include: "file", | ||
| dot: false, | ||
| symlink: false, | ||
| }).catch(() => [] as string[]) | ||
| if (matches.length > 0) return true |
| const autoLoaded = await collectAutoLoadedSkills(filtered) | ||
| const parts: string[] = [] | ||
| if (autoLoaded.length > 0) { | ||
| parts.push( | ||
| "The following skill(s) are auto-loaded because they apply to this project.", | ||
| "Treat their content as binding guidance for any related work — you do not need to", | ||
| "invoke the Skill tool again to access them.", | ||
| ) | ||
| for (const skill of autoLoaded) { | ||
| parts.push("") | ||
| parts.push(`<auto_loaded_skill name="${skill.name}">`) | ||
| parts.push(skill.content.trim()) | ||
| parts.push(`</auto_loaded_skill>`) | ||
| } | ||
| parts.push("") | ||
| } | ||
| parts.push( | ||
| "Skills provide specialized instructions and workflows for specific tasks.", | ||
| "Use the skill tool to load a skill when a task matches its description.", | ||
| // the agents seem to ingest the information about skills a bit better if we present a more verbose | ||
| // version of them here and a less verbose version in tool description, rather than vice versa. | ||
| // altimate_change start - use filtered skill list | ||
| Skill.fmt(filtered, { verbose: true }), | ||
| // altimate_change end | ||
| ].join("\n") | ||
| ) | ||
| // altimate_change end | ||
|
|
||
| return parts.join("\n") | ||
| } | ||
|
|
||
| // altimate_change start — helpers for auto-load skill selection | ||
| const autoLoadLog = Log.create({ service: "system-prompt-autoload" }) | ||
|
|
||
| async function collectAutoLoadedSkills(list: Skill.Info[]): Promise<Skill.Info[]> { | ||
| const out: Skill.Info[] = [] | ||
| for (const skill of list) { | ||
| if (skill.alwaysApply === true) { | ||
| out.push(skill) | ||
| continue | ||
| } | ||
| const globs = normalizeApplyPaths(skill.applyPaths) | ||
| if (globs.length === 0) continue | ||
| try { | ||
| const matched = await anyMatchInWorktree(globs) | ||
| if (matched) { | ||
| out.push(skill) | ||
| autoLoadLog.info("skill auto-loaded by applyPaths", { | ||
| skill: skill.name, | ||
| globs, | ||
| }) | ||
| } | ||
| } catch (err) { | ||
| autoLoadLog.warn("applyPaths glob scan failed", { skill: skill.name, err }) | ||
| } | ||
| } | ||
| return out |
There was a problem hiding this comment.
6 issues found across 7 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/opencode/src/session/system.ts">
<violation number="1" location="packages/opencode/src/session/system.ts:166">
P2: Auto-load matching rescans the filesystem every prompt build, causing avoidable latency in the session loop.</violation>
<violation number="2" location="packages/opencode/src/session/system.ts:172">
P2: `Glob.scan` errors are swallowed by the `.catch(() => [] as string[])` here, which means `anyMatchInWorktree` will never throw. The `catch (err) { autoLoadLog.warn("applyPaths glob scan failed", ...) }` in `collectAutoLoadedSkills` is therefore dead code — glob failures silently result in skills not being auto-loaded with no warning logged. Either remove the `.catch()` so the error propagates to the caller's warning log, or log the error here directly.</violation>
</file>
<file name=".opencode/skills/dbt-troubleshoot/SKILL.md">
<violation number="1" location=".opencode/skills/dbt-troubleshoot/SKILL.md:40">
P2: Iron Rule 10 references P1-P5 but the Pattern Catalog includes P6. Update the rule to say P1-P6 so agents do not skip the P6 recipe.</violation>
<violation number="2" location=".opencode/skills/dbt-troubleshoot/SKILL.md:258">
P2: Step 5d checklist lists patterns as P1/P2/P3/P4/P5 but the catalog includes P6. Add P6 to the checklist so agents can report P6 pattern matches correctly.</violation>
</file>
<file name="packages/opencode/src/session/prompt/anthropic.txt">
<violation number="1" location="packages/opencode/src/session/prompt/anthropic.txt:18">
P1: Benchmark/evaluation-specific language leaked into the production system prompt. References to "the benchmark" and "The grader's expectations are frozen" are internal testing concepts that should not be in user-facing prompts. In production sessions there is no grader — this confuses the agent and exposes evaluation infrastructure details to end users.</violation>
</file>
<file name=".opencode/skills/dbt-develop/SKILL.md">
<violation number="1" location=".opencode/skills/dbt-develop/SKILL.md:50">
P3: The checklist references "Step 5b" but the Core Workflow only has Steps 1–4. Either rename to the correct section or remove the step number to avoid confusion for agents following the workflow.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| // catches the file no matter how deep the user's cwd is. | ||
| const root = Instance.worktree | ||
| for (const g of globs) { | ||
| const matches = await Glob.scan(g, { |
There was a problem hiding this comment.
P2: Auto-load matching rescans the filesystem every prompt build, causing avoidable latency in the session loop.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/opencode/src/session/system.ts, line 166:
<comment>Auto-load matching rescans the filesystem every prompt build, causing avoidable latency in the session loop.</comment>
<file context>
@@ -78,14 +82,97 @@ export namespace SystemPrompt {
+ // catches the file no matter how deep the user's cwd is.
+ const root = Instance.worktree
+ for (const g of globs) {
+ const matches = await Glob.scan(g, {
+ cwd: root,
+ absolute: true,
</file context>
| - [imperative #1 from prompt] → [file you edited / config you set] | ||
| - [imperative #2 from prompt] → [file you edited / config you set] | ||
| - [imperative #N from prompt] → [file you edited / config you set] | ||
| - Pattern matched from catalog: [P1/P2/P3/P4/P5 or "none"] |
There was a problem hiding this comment.
P2: Step 5d checklist lists patterns as P1/P2/P3/P4/P5 but the catalog includes P6. Add P6 to the checklist so agents can report P6 pattern matches correctly.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .opencode/skills/dbt-troubleshoot/SKILL.md, line 258:
<comment>Step 5d checklist lists patterns as P1/P2/P3/P4/P5 but the catalog includes P6. Add P6 to the checklist so agents can report P6 pattern matches correctly.</comment>
<file context>
@@ -155,6 +182,119 @@ altimate-dbt execute --query "SELECT * FROM {{ ref('<name>') }}" --limit 10
+- [imperative #1 from prompt] → [file you edited / config you set]
+- [imperative #2 from prompt] → [file you edited / config you set]
+- [imperative #N from prompt] → [file you edited / config you set]
+- Pattern matched from catalog: [P1/P2/P3/P4/P5 or "none"]
+- If pattern matched, all recipe steps applied: [yes/no]
+- Step 5b row-count probe on each named model: [yes/no — list models]
</file context>
| - Pattern matched from catalog: [P1/P2/P3/P4/P5 or "none"] | |
| - Pattern matched from catalog: [P1/P2/P3/P4/P5/P6 or "none"] |
| 7. **Audit refs and sources before declaring done.** When the task involves moving, removing, or renaming models, run the ref/source audit (Step 5c). Every `source('x', 'y')` call must have a matching entry in a `sources.yml`; every model you created must be registered in a `schema.yml`; every `ref()` to a removed model must be updated. | ||
| 8. **Never blame the data or the test.** When equality tests fail, the grader's expectation is the spec. Don't write "the seeds are inconsistent" or "this is a dataset mismatch" or "the test is wrong". Re-probe the data: cardinality, join-key overlap, NULLs. The bug is in your SQL. Always assume your model is the variable; the benchmark is frozen. | ||
| 9. **Turn 1 is TodoWrite, every time.** Before any read/glob/bash, your first tool call must be `TodoWrite` with one item per imperative sentence in the prompt (see the "Plan first" rule). Then add verification todos for Step 5b (row counts on touched models) and, if relevant, Step 5c (ref/source audit). Late TodoWrite is decorative. | ||
| 10. **Match the prompt to a pattern before acting.** Scan the pattern catalog (references/pattern-catalog.md). If the prompt matches P1-P5, follow the recipe verbatim. The recipe is mandatory in full — do not cherry-pick steps. Catalog patterns encode constraints the agent has missed before; following them changes pass rate, not following them doesn't. |
There was a problem hiding this comment.
P2: Iron Rule 10 references P1-P5 but the Pattern Catalog includes P6. Update the rule to say P1-P6 so agents do not skip the P6 recipe.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .opencode/skills/dbt-troubleshoot/SKILL.md, line 40:
<comment>Iron Rule 10 references P1-P5 but the Pattern Catalog includes P6. Update the rule to say P1-P6 so agents do not skip the P6 recipe.</comment>
<file context>
@@ -27,6 +30,30 @@ description: Debug dbt errors — compilation failures, runtime database errors,
+7. **Audit refs and sources before declaring done.** When the task involves moving, removing, or renaming models, run the ref/source audit (Step 5c). Every `source('x', 'y')` call must have a matching entry in a `sources.yml`; every model you created must be registered in a `schema.yml`; every `ref()` to a removed model must be updated.
+8. **Never blame the data or the test.** When equality tests fail, the grader's expectation is the spec. Don't write "the seeds are inconsistent" or "this is a dataset mismatch" or "the test is wrong". Re-probe the data: cardinality, join-key overlap, NULLs. The bug is in your SQL. Always assume your model is the variable; the benchmark is frozen.
+9. **Turn 1 is TodoWrite, every time.** Before any read/glob/bash, your first tool call must be `TodoWrite` with one item per imperative sentence in the prompt (see the "Plan first" rule). Then add verification todos for Step 5b (row counts on touched models) and, if relevant, Step 5c (ref/source audit). Late TodoWrite is decorative.
+10. **Match the prompt to a pattern before acting.** Scan the pattern catalog (references/pattern-catalog.md). If the prompt matches P1-P5, follow the recipe verbatim. The recipe is mandatory in full — do not cherry-pick steps. Catalog patterns encode constraints the agent has missed before; following them changes pass rate, not following them doesn't.
+11. **Echo the pre-finish checklist before declaring done.** Step 5d. The checklist forces a reread of the prompt against your edits. The most common cause of a "green build / failed test" outcome is an unaddressed imperative; the checklist surfaces it.
+
</file context>
| 10. **Match the prompt to a pattern before acting.** Scan the pattern catalog (references/pattern-catalog.md). If the prompt matches P1-P5, follow the recipe verbatim. The recipe is mandatory in full — do not cherry-pick steps. Catalog patterns encode constraints the agent has missed before; following them changes pass rate, not following them doesn't. | |
| 10. **Match the prompt to a pattern before acting.** Scan the pattern catalog (references/pattern-catalog.md). If the prompt matches P1-P6, follow the recipe verbatim. The recipe is mandatory in full — do not cherry-pick steps. Catalog patterns encode constraints the agent has missed before; following them changes pass rate, not following them doesn't. |
| include: "file", | ||
| dot: false, | ||
| symlink: false, | ||
| }).catch(() => [] as string[]) |
There was a problem hiding this comment.
P2: Glob.scan errors are swallowed by the .catch(() => [] as string[]) here, which means anyMatchInWorktree will never throw. The catch (err) { autoLoadLog.warn("applyPaths glob scan failed", ...) } in collectAutoLoadedSkills is therefore dead code — glob failures silently result in skills not being auto-loaded with no warning logged. Either remove the .catch() so the error propagates to the caller's warning log, or log the error here directly.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/opencode/src/session/system.ts, line 172:
<comment>`Glob.scan` errors are swallowed by the `.catch(() => [] as string[])` here, which means `anyMatchInWorktree` will never throw. The `catch (err) { autoLoadLog.warn("applyPaths glob scan failed", ...) }` in `collectAutoLoadedSkills` is therefore dead code — glob failures silently result in skills not being auto-loaded with no warning logged. Either remove the `.catch()` so the error propagates to the caller's warning log, or log the error here directly.</comment>
<file context>
@@ -78,14 +82,97 @@ export namespace SystemPrompt {
+ include: "file",
+ dot: false,
+ symlink: false,
+ }).catch(() => [] as string[])
+ if (matches.length > 0) return true
+ }
</file context>
| }).catch(() => [] as string[]) | |
| }).catch((err) => { | |
| autoLoadLog.warn("Glob.scan failed for pattern", { pattern: g, err }) | |
| return [] as string[] | |
| }) |
| - [imperative #2 from prompt] → ... | ||
| - [imperative #N from prompt] → ... | ||
| - Every named column in the spec is in the final SELECT: [yes/no — list any missing] | ||
| - Step 5b row-count probe on each created/modified model: [yes/no] |
There was a problem hiding this comment.
P3: The checklist references "Step 5b" but the Core Workflow only has Steps 1–4. Either rename to the correct section or remove the step number to avoid confusion for agents following the workflow.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .opencode/skills/dbt-develop/SKILL.md, line 50:
<comment>The checklist references "Step 5b" but the Core Workflow only has Steps 1–4. Either rename to the correct section or remove the step number to avoid confusion for agents following the workflow.</comment>
<file context>
@@ -25,6 +28,32 @@ description: Create and modify dbt models — staging, intermediate, marts, incr
+- [imperative #2 from prompt] → ...
+- [imperative #N from prompt] → ...
+- Every named column in the spec is in the final SELECT: [yes/no — list any missing]
+- Step 5b row-count probe on each created/modified model: [yes/no]
+- New models registered in nearest schema.yml: [yes/no]
+- Full `altimate-dbt build` reports ERROR=0: [yes/no]
</file context>
| - Step 5b row-count probe on each created/modified model: [yes/no] | |
| - Row-count probe on each created/modified model (see Iron Rule 10): [yes/no] |
Audited the v9 skill content and stripped phrasings that were shaped by ADE-Bench observations but would harm real-world use: 1. "Never blame the data or the test" Iron Rules (troubleshoot #8, develop #13) softened from "the grader's data is the spec" to "investigate before concluding the data is wrong." Real data CAN be corrupt; the rule now requires evidence (upstream bug, constraint violation, documented anomaly) before that conclusion, rather than treating all data as inerrant. 2. "Turn 1 is TodoWrite, every time" -> "For non-trivial tasks, plan with TodoWrite before acting." Skip TodoWrite for one-shot fixes (single edit, single grep) -- no point listing one item in a todo for a literal one-step task. 3. Removed specific frozen row counts ("6-row discrepancy on an 11000-row model") -- those numbers came from one specific benchmark trial. Replaced with generic "small absolute discrepancies on large models are still real bugs." 4. Removed verbatim ADE-Bench prompt phrase "even if the prompt only says 'the project is broken'" from P1 trigger. Replaced with generic "even when the prompt only describes a different symptom or is generic." 5. P4-extra "(and the grader)" / "the grader's expected column set" -> neutral framings about downstream consumers, tests, and BI dashboards. 6. anthropic.txt: same softening on "Never blame the data or the test" + "Plan first - TodoWrite" sections. Kept intentionally (per user direction): - "Decide and act - never pause to ask the user" (Iron Rule 5 troubleshoot, Iron Rule 9 develop, autonomous-operation in anthropic.txt). This is the autonomous-agent contract; the permission system handles destructive-op confirmation separately. All Pattern Catalog recipes (P1-P6) and the Step 5d hard-stop checklist remain -- those are general dbt engineering practice.
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
1 similar comment
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
Summary
Adds a Pattern Catalog for
dbt-troubleshootanddbt-developskills, plus theapplyPaths/alwaysApplymechanism that auto-loads skills when relevant files are present in the worktree.Benchmark impact (ADE-Bench upstream snowflake, single attempt, Sonnet 4-6):
Full upstream 69-task suite: 48 / 69 = 70%.
Notion writeup: https://www.notion.so/35f41b89292681c99edcf3c03f8fcdaa
Pattern Catalog — six concrete recipes
Each recipe encodes general dbt-engineering practice (not test-name keyed), targeting failure shapes the agent has historically applied incompletely.
dbt_utils.surrogate_keydeprecationdbt buildwarning mentions surrogate_keysurrogate_key_treat_nulls_as_empty_strings: truetodbt_project.ymlwhen any renamed call references a nullable column. Warning-triggered, not just prompt-triggered.daily_/monthly_/mom_/wow_/agg_namessource('pkg','tbl')insources.yml; remove orphan*__tmpfiles.SELECT base.*, detail.* EXCLUDE (join_keys)— don't hand-pick.dbt_packages/._28d/_7d/_Nd; "rolling N day"Step 5d — Pre-finish Hard-Stop Checklist
The agent must explicitly echo, before declaring done:
Forces a re-read of the prompt against what was actually changed.
anthropic.txtadditionsMechanism —
applyPaths/alwaysApplyauto-loadMirrors Cursor's "Always Apply" / "Auto Attached" rule modes. Skill bodies whose frontmatter matches files in the worktree are inlined into the system prompt at session start — removing the need for the agent to invoke the Skill tool first.
When a session starts in a directory containing
dbt_project.yml, bothdbt-troubleshootanddbt-developskills auto-load into context. This was the main missing piece between "we have a great skill" and "the agent actually follows it."Schema lives in
packages/opencode/src/skill/skill.ts; the loader inpackages/opencode/src/session/system.ts(collectAutoLoadedSkills,anyMatchInWorktree).Files
.opencode/skills/dbt-troubleshoot/SKILL.md.opencode/skills/dbt-troubleshoot/references/pattern-catalog.md.opencode/skills/dbt-develop/SKILL.md.opencode/skills/dbt-develop/references/pattern-catalog.mdpackages/opencode/src/session/prompt/anthropic.txtpackages/opencode/src/session/system.tspackages/opencode/src/skill/skill.tsVerified
dbt_project.ymlpresent (verified via live container)Test plan
bun run build:localproduces a working binarydbt_project.ymlworktrees (check inline<auto_loaded_skill>block in agent stream)dbt_project.yml)applyPathsfrontmatter validates without errorSummary by cubic
Adds a Pattern Catalog to
dbt-troubleshootanddbt-develop, and auto-loads these skills viaapplyPathsso they attach whendbt_project.ymlis present. This improves ADE-Bench to 29/43 (67%) on the common set and 48/69 (70%) on the full suite.New Features
dbt_packages/(override at project level), batch similar file creations, write full column lists up front, and use date spines for per-period outputs.applyPaths/alwaysApplyinSKILL.mdinlines skills at session start; auto-loaded content is placed first in the system prompt.dbt-troubleshootanddbt-developnow auto-load whendbt_project.ymlis present.packages/opencode/src/session/prompt/anthropic.txt: plan-first TodoWrite for non-trivial tasks, investigate before concluding the data/tests are wrong, an anti-loop circuit-breaker, and autonomous operation (don’t pause to ask the user mid-task).Migration
dbt_project.ymlexists.applyPathsoralwaysApplyin skill frontmatter.Written for commit 0cc4a1f. Summary will update on new commits.
Summary by CodeRabbit
New Features
Documentation