Install this skill from the multi-skill repository:
npx skills add dirkkok101/skills --skill "diagnose"
# Description
Systematic root cause analysis for bugs and unexpected behavior. Investigates, isolates, and either fixes simple issues directly or escalates to appropriate workflow.
# SKILL.md
name: diagnose
description: Systematic root cause analysis for bugs and unexpected behavior. Investigates, isolates, and either fixes simple issues directly or escalates to appropriate workflow.
argument-hint: "[symptom or issue description]"
Diagnose: Symptom → Root Cause → Resolution
Philosophy: Understand WHAT is happening before deciding what to do. Evidence over assumption. Reproduce before theorizing. The right response might be a 5-line fix or a full redesign—diagnosis tells you which.
Core Principles
- Symptom over assumption - Verify actual behavior before theorizing
- Evidence-driven - Logs, commits, tests, reproduction steps—not guesses
- Reproduce first - If you can't reproduce it, you can't diagnose it
- Isolation before explanation - Narrow scope before root cause analysis
- Triage-oriented - Output tells you exactly what to do next
- Minimal intervention - Fix at the right level, don't over-engineer
Trigger Conditions
Run this skill when:
- Something is broken or behaving unexpectedly
- User reports a bug or error
- Tests are failing unexpectedly
- User says "this isn't working", "there's a bug", "why is this happening"
- Behavior doesn't match expectations
Do NOT use this skill for:
- New feature development → Use /brainstorm
- Performance optimization → Different investigation pattern
- Code quality improvements → Use /review
- "I want to build X" → Use /brainstorm
Critical Sequence
Phase 0: Context Gathering
Step 0.1 - Capture the Symptom:
Ask if not provided:
- "What's happening?" (actual behavior)
- "What did you expect?" (expected behavior)
- "When did this start?" (helps narrow commits)
- "Does it happen every time?" (reproducibility)
Document in working notes:
## Symptom Report
**Actual behavior:** {what's happening}
**Expected behavior:** {what should happen}
**First noticed:** {when}
**Reproducibility:** Always / Sometimes / Once
**Error messages:** {if any}
Step 0.2 - Quick Context Scan:
# Recent commits that might be relevant
git log --oneline -15
# Check for uncommitted changes
git status
# If user mentioned specific area, check recent changes there
git log --oneline -10 -- {path/to/affected/area}
Step 0.3 - Check for Known Issues:
# Check if there are existing beads for this
br search "{symptom keywords}"
# Check learnings for similar past issues
grep -r "{keywords}" docs/learnings/ 2>/dev/null || echo "No matching learnings (or no docs/learnings/ folder)"
Verify:
[ ] Symptom clearly documented
[ ] Recent commits reviewed
[ ] No existing work addressing this
[ ] Checked learnings for similar past issues
Phase 1: Reproduce & Verify
This is the most critical phase. Never skip it.
Step 1.1 - Attempt Reproduction:
Document exact steps:
## Reproduction Steps
### Environment
- Branch: {branch name}
- Commit: {short hash}
- OS/Platform: {if relevant}
- Configuration: {any relevant settings}
### Steps to Reproduce
1. {exact step}
2. {exact step}
3. {exact step}
### Result
- Expected: {what should happen}
- Actual: {what happens}
- Consistent: Yes / No (if no, frequency: X/10 attempts)
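For intermittent issues, the X/10 frequency can be measured rather than estimated. A minimal sketch, assuming the reproduction can be expressed as a single command that exits non-zero when the bug occurs; `repro_rate` is a hypothetical helper name, not part of this skill.

```bash
# Hypothetical helper: run a reproduction command N times and report
# how often it fails. Assumes a non-zero exit status means "bug reproduced".
repro_rate() {
  local attempts=$1 failures=0 i=0
  shift
  while [ "$i" -lt "$attempts" ]; do
    # Discard output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures/$attempts"
}
```

Usage: `repro_rate 10 {project_test_command} {affected_test_path}` prints the failure count over the attempts, which can be copied into the "Consistent" line.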
Step 1.2 - Verify It's Not Already Fixed:
# Check if main/master has this issue
git stash # if needed
git checkout main # or master, whichever is the default branch
# Run reproduction steps
git checkout - # back to original branch
git stash pop # if needed
Step 1.3 - Handle Non-Reproducible Issues:
If you cannot reproduce after 3 genuine attempts:
## Non-Reproducible Issue
**Attempts made:** {describe what you tried}
**Possible explanations:**
- Environment-specific (user's machine differs)
- Timing/race condition
- Data-dependent (specific input required)
- Already fixed in current code
**Recommended action:**
- [ ] Ask user for more details / screen recording
- [ ] Add logging to capture state when it occurs
- [ ] Review code for potential race conditions
- [ ] Check if user is on latest code
Present to user and ask for guidance before proceeding.
Phase 2: Evidence Collection
Launch parallel investigation tracks:
| Track | Focus | Method |
|---|---|---|
| Code Path | Trace execution flow | Read affected code, understand flow |
| History | What changed recently | git log, git blame on affected files |
| Tests | What's passing/failing | Run relevant test suite |
| Dependencies | Related components | Check what the affected code depends on |
Step 2.1 - Code Path Analysis:
Use the Explore agent:
"Trace the execution path for {symptom}.
Start from {entry point} and follow through to where {unexpected behavior} occurs.
Document the flow and identify where behavior diverges from expected."
Step 2.2 - Git History Analysis:
# Blame the specific file(s) exhibiting the bug
git blame {affected_file} | head -50
# Find commits that touched this area recently
git log --oneline -10 -- {affected_path}
# If you suspect a specific commit
git show {commit_hash} --stat
Step 2.3 - Test Analysis:
# Run tests for affected area
{project_test_command} {affected_test_path}
# Check test coverage - is the buggy path tested?
# Look for missing test cases
Step 2.4 - Document Evidence:
## Evidence Collected
### Code Path
- Entry point: {where execution starts}
- Key functions: {list}
- Failure point: {where it goes wrong}
- Flow diagram (if helpful):
```
A → B → C → [FAILURE HERE] → D
```
### Git History
| Commit | Date | Author | Change Summary |
|--------|------|--------|----------------|
| {hash} | {date} | {who} | {what changed} |
**Suspect commits:** {commits that might have introduced issue}
### Test Status
- Relevant tests: {list}
- Passing: {count}
- Failing: {count}
- Missing coverage: {areas not tested}
### Dependencies
- Upstream: {what this code depends on}
- Downstream: {what depends on this code}
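The Git History table in the template above can be filled from `git log` directly. A minimal sketch, assuming it runs inside the repository; `history_table` is a hypothetical helper name, and a commit subject containing `|` would break the table layout.

```bash
# Hypothetical helper: print recent commits touching a path as a
# markdown table matching the "Git History" section of the template.
history_table() {
  local path=$1 count=${2:-10}
  echo '| Commit | Date | Author | Change Summary |'
  echo '|--------|------|--------|----------------|'
  # %h = short hash, %ad = author date, %an = author name, %s = subject
  git log -n "$count" --date=short --format='| %h | %ad | %an | %s |' -- "$path"
}
```

Usage: `history_table {affected_path} 10 >> report.md`, then mark the suspect commits by hand.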
Phase 3: Isolation
Goal: Narrow down to the smallest reproducing case and exact fault location.
Step 3.1 - Scope Reduction:
Start broad, narrow systematically:
## Isolation Progress
### Initial Scope
- Suspected area: {broad area}
- Files involved: {count}
### Narrowing Steps
1. {what you ruled out} → Reduced scope to {remaining}
2. {what you ruled out} → Reduced scope to {remaining}
3. ...
### Isolated Fault Location
- File: {exact file}
- Function/Method: {name}
- Lines: {approximate range}
Step 3.2 - Binary Search Debugging:
If the fault location isn't obvious:
- Add logging/breakpoint at midpoint of suspected code
- Does issue occur before or after this point?
- Repeat, halving the search space each time
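The halving loop can be sketched as follows, assuming the suspected code can be split into checkpoints numbered in execution order and that a check command passes for every checkpoint before the fault and fails from the fault onward. `first_bad` and the check command are hypothetical names for illustration.

```bash
# first_bad LOW HIGH CMD...: binary search for the smallest N in
# [LOW, HIGH] for which `CMD N` fails. Assumes HIGH is known to fail
# and that results are monotone (pass, pass, ..., fail, fail).
first_bad() {
  local low=$1 high=$2 mid
  shift 2
  while [ "$low" -lt "$high" ]; do
    mid=$(( (low + high) / 2 ))
    if "$@" "$mid" >/dev/null 2>&1; then
      low=$((mid + 1))   # midpoint passes: the fault is later
    else
      high=$mid          # midpoint fails: the fault is here or earlier
    fi
  done
  echo "$low"
}
```

Each iteration halves the range, so 100 checkpoints need about 7 checks. When the search space is commit history rather than code, `git bisect` applies the same idea to commits.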
Step 3.3 - Minimal Reproducing Case:
Can you reproduce with:
- Fewer steps?
- Simpler input?
- Mocked dependencies?
Document the minimal case—it often reveals the root cause.
Phase 4: Root Cause Analysis
Step 4.1 - The Diagnostic 5 Whys:
Unlike the 5 Whys in /brainstorm, this version asks "why is this happening?" rather than "why build this?":
## Root Cause Analysis (5 Whys)
**Symptom:** {the bug/unexpected behavior}
1. Why does {symptom} occur?
→ Because {immediate cause}
2. Why does {immediate cause} happen?
→ Because {deeper cause}
3. Why does {deeper cause} happen?
→ Because {even deeper}
4. Why does {even deeper} happen?
→ Because {root cause emerging}
5. Why does {root cause emerging} exist?
→ Because {ROOT CAUSE}
**Root Cause:** {1-2 sentence summary}
Step 4.2 - Classify the Root Cause:
| Category | Description | Example |
|---|---|---|
| Logic Error | Code does wrong thing | Off-by-one, wrong condition |
| State Error | Unexpected state | Null, stale data, race condition |
| Integration Error | Components miscommunicate | Wrong API usage, contract violation |
| Configuration Error | Settings wrong | Wrong env var, missing config |
| Data Error | Bad input/data | Corrupt data, edge case input |
| Design Flaw | Architecture problem | Missing abstraction, wrong pattern |
Step 4.3 - Identify Contributing Factors:
The root cause is necessary for the bug but often not sufficient on its own. What else contributed?
## Contributing Factors
| Factor | How It Contributed |
|--------|-------------------|
| {missing test} | Would have caught this |
| {unclear documentation} | Led to wrong assumption |
| {recent refactor} | Introduced the regression |
Step 4.4 - Check Learnings:
# Has this type of issue occurred before?
grep -r "{root cause keywords}" docs/learnings/ 2>/dev/null || echo "No matching learnings (or no docs/learnings/ folder)"
If similar learning exists, reference it. If not, this might become a new learning.
Phase 5: Triage Decision
Step 5.1 - Assess Scope:
## Scope Assessment
- [ ] **Isolated** - Single function/method, <20 lines affected
- [ ] **Localized** - Single file or tightly coupled set of files
- [ ] **Cross-cutting** - Multiple components/services affected
- [ ] **Systemic** - Architectural flaw, affects many areas
Step 5.2 - Assess Complexity:
## Fix Complexity
- [ ] **Simple** - Clear fix, one approach, minimal risk
- [ ] **Moderate** - Clear fix but touches multiple places
- [ ] **Complex** - Multiple valid approaches, needs design decisions
- [ ] **Uncertain** - Root cause unclear or fix approach unknown
Step 5.3 - Apply Triage Matrix:
| Scope | Complexity | → Action |
|---|---|---|
| Isolated | Simple | Fix-in-Place |
| Isolated | Moderate | Fix-in-Place (with care) |
| Localized | Simple | Fix-in-Place or Targeted Beads |
| Localized | Moderate | Targeted Beads |
| Localized | Complex | Design Required |
| Cross-cutting | Any | Design Required |
| Systemic | Any | Design Required |
| Any | Uncertain | More Investigation or Design Required |
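The matrix above can be read as a lookup. A minimal sketch, with two assumptions that the table itself does not settle: "Uncertain" complexity is checked before scope, and Isolated + Complex (which has no row) is reported as unmatched rather than guessed. `triage` is a hypothetical function name.

```bash
# Sketch of the triage matrix as a lookup. Arguments are the lowercase
# scope and complexity labels from Steps 5.1 and 5.2.
triage() {
  local scope=$1 complexity=$2
  # Assumption: the "Any | Uncertain" row takes precedence over scope rows.
  if [ "$complexity" = "uncertain" ]; then
    echo "More Investigation or Design Required"
    return 0
  fi
  case "$scope:$complexity" in
    isolated:simple)            echo "Fix-in-Place" ;;
    isolated:moderate)          echo "Fix-in-Place (with care)" ;;
    localized:simple)           echo "Fix-in-Place or Targeted Beads" ;;
    localized:moderate)         echo "Targeted Beads" ;;
    localized:complex)          echo "Design Required" ;;
    cross-cutting:*|systemic:*) echo "Design Required" ;;
    # No row in the matrix covers this combination (e.g. isolated:complex).
    *) echo "No matching row for $scope/$complexity" >&2; return 1 ;;
  esac
}
```

Usage: `triage localized moderate` prints `Targeted Beads`.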
Step 5.4 - Document Triage Decision:
## Triage Decision
**Scope:** {isolated/localized/cross-cutting/systemic}
**Complexity:** {simple/moderate/complex/uncertain}
**Decision:** {Fix-in-Place / Targeted Beads / Design Required / More Investigation}
**Rationale:**
{Why this is the right response level}
Phase 6a: Fix-in-Place (Simple Bugs)
Use this path when: Isolated scope + Simple/Moderate complexity, or Localized scope + Simple complexity (per the triage matrix)
Step 6a.1 - Design the Fix:
## Proposed Fix
### Summary
{1-2 sentence description of the fix}
### Changes Required
| File | Change |
|------|--------|
| {file} | {what changes} |
### Why This Fixes It
{Connect fix to root cause}
### Risk Assessment
- Regression risk: Low / Medium / High
- Side effects: {any potential}
- Reversibility: Easy (can revert commit)
Step 6a.2 - Present to User:
## Fix Proposal
**Root Cause:** {summary}
**Proposed Fix:** {summary}
### Code Change
{Show the specific change with context}
**Shall I apply this fix?**
- "yes" / "apply" → Implement the fix
- "modify" → Adjust approach based on feedback
- "escalate" → This needs more design work
Step 6a.3 - Apply Fix (with approval):
- Make the code change
- Run relevant tests
- Verify the fix resolves the symptom
# Run tests to verify
{test_command}
# Verify symptom is resolved
# (reproduction steps should no longer trigger the bug)
Step 6a.4 - Post-Fix:
## Fix Applied
**Files changed:** {list}
**Tests passing:** Yes / No
**Symptom resolved:** Yes / No
### Commit Message
fix: {brief description}
Root cause: {1 sentence}
Fix: {1 sentence}
---
**Learning Opportunity?**
If this bug reveals something worth remembering:
- Pattern that prevents this class of bug
- Gotcha others might hit
- Missing test coverage
→ Offer: "This might be worth capturing. Run `/compound` to document the learning?"
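The commit message template above can be generated consistently. A minimal sketch, assuming the usual git convention of a blank line between subject and body; `fix_commit_message` is a hypothetical helper name.

```bash
# Hypothetical helper: format the fix commit message.
# $1 = brief description, $2 = root cause sentence, $3 = fix sentence
fix_commit_message() {
  printf 'fix: %s\n\nRoot cause: %s\nFix: %s\n' "$1" "$2" "$3"
}
```

Usage: `fix_commit_message "{brief description}" "{root cause}" "{fix}" | git commit -F -` reads the message from standard input.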
Phase 6b: Targeted Beads (Medium Issues)
Use this path when: Localized scope + Moderate complexity OR clear multi-file fix
Step 6b.1 - Document Diagnostic Context:
## Diagnostic Context for Beads
### Root Cause
{Summary from Phase 4}
### Fix Approach
{High-level approach}
### Affected Files
| File | Required Change |
|------|-----------------|
| {file} | {change needed} |
### Verification
- Tests to run: {list}
- Manual verification: {steps}
Step 6b.2 - Create Beads:
Create focused beads for the fix:
## Bead: Fix {specific aspect}
**Objective:** {what this bead accomplishes}
**Context to load:**
- {file to read for context}
- This diagnostic report
**Success criteria:**
- {specific behavior fixed}
- Tests pass: {list}
**Approach:**
{Brief description of fix approach}
Step 6b.3 - Handoff:
## Ready for Execution
Created {N} beads to fix this issue.
**Next step:** Run `/execute` to implement the fix.
**Verification after execution:**
1. {reproduction steps should no longer fail}
2. {tests that should pass}
Phase 6c: Design Required (Complex Issues)
Use this path when: Cross-cutting/Systemic OR Complex/Uncertain
Step 6c.1 - Prepare Diagnostic Handoff:
The diagnosis becomes input to /brainstorm. Create a context document:
## Diagnostic Context for Design
### Problem Discovered
**Original symptom:** {what user reported}
**Root cause:** {what we found}
**Why design is needed:** {scope/complexity justification}
### Evidence Summary
**Affected areas:**
| Component | How Affected |
|-----------|--------------|
| {name} | {impact} |
**Key findings:**
- {finding 1}
- {finding 2}
### Constraints Discovered
- {constraint from investigation}
- {constraint from investigation}
### Questions for Design Phase
- {question that emerged}
- {approach decision needed}
### Relevant Learnings
| Learning | Relevance |
|----------|-----------|
| {from docs/learnings/} | {how it applies} |
Step 6c.2 - Present Handoff:
## Diagnosis Complete - Design Required
**Root Cause:** {summary}
**Why this needs design:** {reasoning}
This issue is too complex for a direct fix because:
- {reason 1}
- {reason 2}
### Recommended Next Step
Run `/brainstorm` with this context:
"{Feature/fix description based on root cause}"
The diagnostic context above will inform the design phase.
---
**Proceed?**
- "brainstorm" → Start design phase with this context
- "try fix anyway" → Attempt targeted fix (higher risk)
- "park" → Save findings for later
Phase 7: Self-Review
Before presenting findings to user, verify quality.
Diagnostic Self-Review Checklist:
## Self-Review
### Evidence Quality
- [ ] Symptom is clearly documented with actual vs expected
- [ ] Reproduction steps are complete and verified
- [ ] Git history reviewed for relevant changes
- [ ] Test status documented
### Isolation Quality
- [ ] Fault location narrowed to specific area
- [ ] Minimal reproducing case identified (if possible)
- [ ] Ruled out red herrings
### Root Cause Quality
- [ ] 5 Whys completed to genuine root cause
- [ ] Root cause explains symptom (not just correlation)
- [ ] Contributing factors identified
- [ ] Checked learnings for similar past issues
### Triage Quality
- [ ] Scope assessment is accurate
- [ ] Complexity assessment is honest
- [ ] Triage decision follows matrix
- [ ] Rationale is documented
### Fix/Handoff Quality
- [ ] (If fix) Change is minimal and targeted
- [ ] (If fix) Risk assessment completed
- [ ] (If escalation) Diagnostic context is complete
- [ ] (If escalation) Questions for next phase are clear
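If the checklist is pasted into the report file, open items can be counted before presenting findings. A minimal sketch, assuming the `- [ ]` / `- [x]` checkbox syntax used above; `unchecked_items` is a hypothetical helper name.

```bash
# Hypothetical helper: count unchecked "- [ ]" items in a markdown file.
unchecked_items() {
  # grep -c prints 0 but exits 1 when nothing matches; `|| true`
  # keeps the function from failing in that case.
  grep -c '^- \[ \]' "$1" || true
}
```

Usage: `unchecked_items .diagnosis/{issue-slug}/report.md` should print `0` before the self-review is considered complete.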
Working Document Structure
During diagnosis (temporary):
.diagnosis/{issue-slug}/
├── notes.md # Working notes, scratch
├── evidence/ # Screenshots, logs
└── report.md # Structured findings
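The layout above can be created in one step. A minimal sketch; `init_diagnosis` is a hypothetical helper name, and the optional second argument (the directory that will contain `.diagnosis/`) is an assumption added so the helper can run outside the repository root.

```bash
# Hypothetical helper: create the temporary working layout for one issue
# and print the directory that was created.
init_diagnosis() {
  local slug=$1 root=${2:-.}
  local dir="$root/.diagnosis/$slug"
  mkdir -p "$dir/evidence"                  # screenshots, logs
  touch "$dir/notes.md" "$dir/report.md"    # scratch notes, structured findings
  echo "$dir"
}
```

Usage: `init_diagnosis {issue-slug}` from the repository root.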
After resolution:
- Delete .diagnosis/ folder
- If valuable learning: Run /compound to capture permanently
- If led to design: docs/designs/{feature}/ has permanent record
Note: Add .diagnosis/ to .gitignore if not already there.
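The ignore entry can be added idempotently, so repeated diagnoses do not duplicate it. A minimal sketch, assuming the existing `.gitignore` ends with a newline; `ignore_diagnosis` is a hypothetical helper name.

```bash
# Hypothetical helper: append ".diagnosis/" to a gitignore file only if
# that exact line is not already present.
ignore_diagnosis() {
  local file=${1:-.gitignore}
  touch "$file"   # create the file if it does not exist yet
  # -x matches whole lines, -F treats the pattern as a fixed string
  grep -qxF '.diagnosis/' "$file" || echo '.diagnosis/' >> "$file"
}
```

Usage: `ignore_diagnosis` from the repository root.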
Presentation Templates
Simple Fix Complete
## Bug Fixed
**Symptom:** {original issue}
**Root Cause:** {1 sentence}
**Fix:** {1 sentence}
### Changes Made
- {file}: {change summary}
### Verified
- [x] Reproduction steps no longer trigger bug
- [x] Tests pass
### Commit
`{commit hash}` - {commit message}
---
**Learning opportunity?** If this bug reveals a pattern worth remembering, run `/compound`.
Escalation to Beads
## Diagnosis Complete
**Symptom:** {original issue}
**Root Cause:** {summary}
**Scope:** Localized ({N} files affected)
### Fix Approach
{High-level description}
### Beads Created
| Bead | Purpose |
|------|---------|
| {id} | {objective} |
---
**Next:** Run `/execute` to implement the fix.
Escalation to Brainstorm
## Diagnosis Complete - Design Required
**Symptom:** {original issue}
**Root Cause:** {summary}
### Why Design Is Needed
{This is too complex for a direct fix because...}
### Key Findings
- {finding}
- {finding}
### Questions for Design
- {question}
- {question}
---
**Next:** Run `/brainstorm {suggested feature description}`
Diagnostic context will inform the design phase.
Quality Standards
Evidence
- Symptom documented with actual vs expected
- Reproduction verified (or non-reproducibility documented)
- Git history and test status checked
- Learnings consulted
Isolation
- Fault location narrowed systematically
- Binary search or similar technique applied
- Minimal reproducing case sought
Root Cause
- 5 Whys completed genuinely
- Root cause explains symptom
- Contributing factors identified
- Appropriate category assigned
Triage
- Scope honestly assessed
- Complexity honestly assessed
- Decision follows matrix
- Rationale documented
Resolution
- Fix is minimal (no scope creep)
- Risk acknowledged
- Verification completed
- Learning opportunity offered
Anti-Patterns
❌ Fixing symptoms, not causes
"The null check here prevents the crash"
→ But WHY is it null? That's the real bug.
✅ Fixing root causes
"The object is null because initialization is skipped when X.
Fixed the initialization logic."
❌ Guessing without evidence
"I think the bug is probably in the authentication code"
→ Based on what evidence?
✅ Evidence-driven investigation
"Git blame shows auth code changed 3 days ago.
The symptom started appearing after that commit.
Reviewing that change..."
❌ Over-engineering the fix
"While fixing this, I also refactored the entire module
and added comprehensive error handling everywhere"
✅ Minimal targeted fix
"Fixed the specific null check that caused the crash.
The broader refactor could be valuable but should be
a separate brainstorm/design effort."
❌ Skipping reproduction
"The user says it crashes, let me look at the code..."
→ How will you know when it's fixed?
✅ Reproduce first
"Reproduced the crash with these steps: ...
Now I can verify the fix actually works."
❌ Premature escalation
"This seems complicated, let's do a full design"
→ Have you actually isolated the issue?
✅ Appropriate triage
"Isolated to single function, clear fix, low risk.
This is a fix-in-place, no need for design overhead."
Exit Signals
| Signal | Meaning | Action |
|---|---|---|
| Fix applied | Simple bug resolved | Offer /compound for learning |
| Beads created | Medium fix ready | Proceed to /execute |
| Design required | Complex issue | Proceed to /brainstorm |
| Cannot reproduce | Insufficient info | Ask user for more details |
| Not a bug | Working as designed | Explain behavior to user |
| Park | Save for later | Document findings, deprioritize |
Integration Points
→ /compound
After any fix, offer learning capture:
- Gotchas discovered
- Missing test coverage identified
- Patterns that prevent this bug class
→ /execute
For targeted beads:
- Pass diagnostic context in bead descriptions
- Include verification criteria
→ /brainstorm
For design-required issues:
- Root cause becomes part of problem statement
- Evidence informs documentation foundation
- Questions identified feed into design exploration
← Learnings
Always check docs/learnings/ for similar past issues:
- Avoids re-diagnosing known problems
- Applies known solutions
- Identifies recurring patterns
Skill version: 1.0
Approach: Evidence-driven root cause analysis with adaptive triage