diagnose

by @dirkkok101 in Tools

# Install this skill:

npx skills add dirkkok101/skills --skill "diagnose"

Install specific skill from multi-skill repository

# Description

Systematic root cause analysis for bugs and unexpected behavior. Investigates, isolates, and either fixes simple issues directly or escalates to appropriate workflow.

# SKILL.md

name: diagnose
description: Systematic root cause analysis for bugs and unexpected behavior. Investigates, isolates, and either fixes simple issues directly or escalates to appropriate workflow.
argument-hint: "[symptom or issue description]"

Diagnose: Symptom → Root Cause → Resolution

Philosophy: Understand WHAT is happening before deciding what to do. Evidence over assumption. Reproduce before theorizing. The right response might be a 5-line fix or a full redesign—diagnosis tells you which.

Core Principles

- Symptom over assumption - Verify actual behavior before theorizing
- Evidence-driven - Rely on logs, commits, tests, and reproduction steps, not guesses
- Reproduce first - If you can't reproduce it, you can't diagnose it
- Isolation before explanation - Narrow scope before root cause analysis
- Triage-oriented - Output tells you exactly what to do next
- Minimal intervention - Fix at the right level, don't over-engineer

Trigger Conditions

Run this skill when:
- Something is broken or behaving unexpectedly
- User reports a bug or error
- Tests are failing unexpectedly
- User says "this isn't working", "there's a bug", "why is this happening"
- Behavior doesn't match expectations

Do NOT use this skill for:
- New feature development → Use /brainstorm
- Performance optimization → Different investigation pattern
- Code quality improvements → Use /review
- "I want to build X" → Use /brainstorm

Critical Sequence

Phase 0: Context Gathering

Step 0.1 - Capture the Symptom:

Ask if not provided:
- "What's happening?" (actual behavior)
- "What did you expect?" (expected behavior)
- "When did this start?" (helps narrow commits)
- "Does it happen every time?" (reproducibility)

Document in working notes:

## Symptom Report

**Actual behavior:** {what's happening}
**Expected behavior:** {what should happen}
**First noticed:** {when}
**Reproducibility:** Always / Sometimes / Once
**Error messages:** {if any}

Step 0.2 - Quick Context Scan:

# Recent commits that might be relevant
git log --oneline -15

# Check for uncommitted changes
git status

# If user mentioned specific area, check recent changes there
git log --oneline -10 -- {path/to/affected/area}

Step 0.3 - Check for Known Issues:

# Check if there are existing beads for this
br search "{symptom keywords}"

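# Check learnings for similar past issues
# (grep exits non-zero both when the folder is missing and when nothing matches)
grep -r "{keywords}" docs/learnings/ 2>/dev/null || echo "No matching learnings (or no learnings folder)"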
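
Where grep is unavailable, or the matches need to go straight into working notes, the learnings check can be sketched in Python. This is a minimal sketch, not part of the skill itself: the function name `search_learnings` is hypothetical, and unlike plain `grep -r` it matches case-insensitively.

```python
from pathlib import Path


def search_learnings(keywords, root="docs/learnings"):
    """Return (path, line_number, line) for every line mentioning any keyword.

    A missing folder yields an empty list rather than an error, so the
    caller can treat "no folder" and "no matches" the same way.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    needles = [keyword.lower() for keyword in keywords]
    hits = []
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            # Assumption: case-insensitive substring match (grep -r is case-sensitive)
            if any(needle in line.lower() for needle in needles):
                hits.append((str(path), number, line.strip()))
    return hits
```

An empty result means either no learnings folder or no similar past issue, so diagnosis proceeds from scratch.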

Verify:

[ ] Symptom clearly documented
[ ] Recent commits reviewed
[ ] No existing work addressing this
[ ] Checked learnings for similar past issues

Phase 1: Reproduce & Verify

This is the most critical phase. Never skip it.

Step 1.1 - Attempt Reproduction:

Document exact steps:

## Reproduction Steps

### Environment
- Branch: {branch name}
- Commit: {short hash}
- OS/Platform: {if relevant}
- Configuration: {any relevant settings}

### Steps to Reproduce
1. {exact step}
2. {exact step}
3. {exact step}

### Result
- Expected: {what should happen}
- Actual: {what happens}
- Consistent: Yes / No (if no, frequency: X/10 attempts)
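
For intermittent issues, the "Consistent" line can be filled in by running the reproduction several times and counting failures. A minimal sketch, assuming a hypothetical `reproduce` callable that returns True whenever the symptom appears on that attempt:

```python
def measure_reproducibility(reproduce, attempts=10):
    """Run `reproduce` repeatedly and summarize how often the bug appears.

    `reproduce` is a hypothetical callable supplied by the investigator;
    it should return True when the symptom was observed on that attempt.
    """
    failures = sum(1 for _ in range(attempts) if reproduce())
    if failures == attempts:
        label = "Always"
    elif failures == 0:
        label = "Not reproduced"
    else:
        label = "Sometimes"
    return {"label": label, "frequency": f"{failures}/{attempts}"}
```

A "Sometimes" result with a low frequency is a hint to look at timing, data, or environment differences before theorizing further.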

Step 1.2 - Verify It's Not Already Fixed:

# Check if main/master has this issue
git stash  # only if you have uncommitted changes
git checkout main  # or master, depending on the repository
# Run reproduction steps
git checkout -  # back to original branch
git stash pop  # only if you stashed above

Step 1.3 - Handle Non-Reproducible Issues:

If you cannot reproduce after 3 genuine attempts:

## Non-Reproducible Issue

**Attempts made:** {describe what you tried}
**Possible explanations:**
- Environment-specific (user's machine differs)
- Timing/race condition
- Data-dependent (specific input required)
- Already fixed in current code

**Recommended action:**
- [ ] Ask user for more details / screen recording
- [ ] Add logging to capture state when it occurs
- [ ] Review code for potential race conditions
- [ ] Check if user is on latest code

Present to user and ask for guidance before proceeding.

Phase 2: Evidence Collection

Launch parallel investigation tracks:

| Track | Focus | Method |
|-------|-------|--------|
| Code Path | Trace execution flow | Read affected code, understand flow |
| History | What changed recently | `git log`, `git blame` on affected files |
| Tests | What's passing/failing | Run relevant test suite |
| Dependencies | Related components | Check what the affected code depends on |

Step 2.1 - Code Path Analysis:

Use Explore agent:

"Trace the execution path for {symptom}.
Start from {entry point} and follow through to where {unexpected behavior} occurs.
Document the flow and identify where behavior diverges from expected."

Step 2.2 - Git History Analysis:

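# Blame the file(s) exhibiting the bug (head limits output to the first 50 lines)
git blame {affected_file} | head -50

# Once a suspect line range is known, blame just that range
git blame -L {start_line},{end_line} {affected_file}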

# Find commits that touched this area recently
git log --oneline -10 -- {affected_path}

# If you suspect a specific commit
git show {commit_hash} --stat

Step 2.3 - Test Analysis:

# Run tests for affected area
{project_test_command} {affected_test_path}

# Check test coverage - is the buggy path tested?
# Look for missing test cases

Step 2.4 - Document Evidence:

## Evidence Collected

### Code Path
- Entry point: {where execution starts}
- Key functions: {list}
- Failure point: {where it goes wrong}
- Flow diagram (if helpful):
  ```
  A → B → C → [FAILURE HERE] → D
  ```

### Git History
| Commit | Date | Author | Change Summary |
|--------|------|--------|----------------|
| {hash} | {date} | {who} | {what changed} |

**Suspect commits:** {commits that might have introduced issue}

### Test Status
- Relevant tests: {list}
- Passing: {count}
- Failing: {count}
- Missing coverage: {areas not tested}

### Dependencies
- Upstream: {what this code depends on}
- Downstream: {what depends on this code}
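
The Git History table in the evidence template can be filled from `git log` output instead of by hand. A minimal sketch, assuming the log was produced with tab-separated fields, for example `git log -10 --date=short --pretty=format:'%h%x09%ad%x09%an%x09%s' -- {affected_path}`; the function name `history_rows` is hypothetical.

```python
def history_rows(log_output):
    """Turn tab-separated `git log` output into Markdown table rows.

    Expects one commit per line in the order: hash, date, author, subject.
    A line with fewer than four fields raises ValueError.
    """
    rows = []
    for line in log_output.splitlines():
        if not line.strip():
            continue  # skip blank lines between entries
        commit, date, author, summary = line.split("\t", 3)
        summary = summary.replace("|", "\\|")  # keep pipes from breaking the table
        rows.append(f"| {commit} | {date} | {author} | {summary} |")
    return rows
```

The rows drop in under the `| Commit | Date | Author | Change Summary |` header; suspect commits still need to be picked out by reading the changes.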

Phase 3: Isolation

Goal: Narrow down to the smallest reproducing case and exact fault location.

Step 3.1 - Scope Reduction:

Start broad, narrow systematically:

## Isolation Progress

### Initial Scope
- Suspected area: {broad area}
- Files involved: {count}

### Narrowing Steps
1. {what you ruled out} → Reduced scope to {remaining}
2. {what you ruled out} → Reduced scope to {remaining}
3. ...

### Isolated Fault Location
- File: {exact file}
- Function/Method: {name}
- Lines: {approximate range}

Step 3.2 - Binary Search Debugging:

If the fault location isn't obvious:

1. Add logging/breakpoint at midpoint of suspected code
2. Does issue occur before or after this point?
3. Repeat, halving the search space each time
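
The halving procedure can be sketched as a plain binary search over ordered checkpoints (log points, pipeline stages, or commits). This is an illustration under one assumption: once the state goes bad it stays bad for every later checkpoint. The names `first_failing_index` and `is_bad` are hypothetical; `git bisect` applies the same idea to commit history.

```python
def first_failing_index(checkpoints, is_bad):
    """Return the index of the first checkpoint where `is_bad` is True.

    Assumes checkpoints are ordered and the fault is monotonic: everything
    before the fault is good, everything from the fault onward is bad.
    Returns None when no checkpoint is bad.
    """
    low, high = 0, len(checkpoints)
    while low < high:
        mid = (low + high) // 2
        if is_bad(checkpoints[mid]):
            high = mid  # fault is at mid or earlier
        else:
            low = mid + 1  # fault is after mid
    return low if low < len(checkpoints) else None
```

If the state can flip back to good later in the flow, the monotonic assumption fails and the result is only a starting point for manual inspection.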

Step 3.3 - Minimal Reproducing Case:

Can you reproduce with:
- Fewer steps?
- Simpler input?
- Mocked dependencies?

Document the minimal case—it often reveals the root cause.
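
One way to search for "fewer steps" or "simpler input" mechanically is a greedy reduction: drop one element at a time and keep the smaller input whenever the bug still reproduces. A minimal sketch, simpler than full delta debugging; `still_fails` is a hypothetical predicate that reruns the reproduction on a candidate input.

```python
def shrink_input(items, still_fails):
    """Greedily drop items while the failure still reproduces.

    `still_fails` is a hypothetical predicate: it reruns the reproduction
    on the candidate list and returns True if the bug still appears.
    The result is minimal only in the sense that removing any single
    remaining item makes the failure disappear.
    """
    current = list(items)
    changed = True
    while changed:
        changed = False
        for index in range(len(current)):
            candidate = current[:index] + current[index + 1:]
            if still_fails(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the smaller input
    return current
```

Each call to `still_fails` is a full reproduction run, so this suits fast, deterministic reproductions; for flaky bugs the predicate would need to retry before answering.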

Phase 4: Root Cause Analysis

Step 4.1 - The Diagnostic 5 Whys:

Different from brainstorm's 5 Whys—this asks "why is this happening" not "why build this":

## Root Cause Analysis (5 Whys)

**Symptom:** {the bug/unexpected behavior}

1. Why does {symptom} occur?
   → Because {immediate cause}

2. Why does {immediate cause} happen?
   → Because {deeper cause}

3. Why does {deeper cause} happen?
   → Because {even deeper}

4. Why does {even deeper} happen?
   → Because {root cause emerging}

5. Why does {root cause emerging} exist?
   → Because {ROOT CAUSE}

**Root Cause:** {1-2 sentence summary}

Step 4.2 - Classify the Root Cause:

| Category | Description | Example |
|----------|-------------|---------|
| Logic Error | Code does wrong thing | Off-by-one, wrong condition |
| State Error | Unexpected state | Null, stale data, race condition |
| Integration Error | Components miscommunicate | Wrong API usage, contract violation |
| Configuration Error | Settings wrong | Wrong env var, missing config |
| Data Error | Bad input/data | Corrupt data, edge case input |
| Design Flaw | Architecture problem | Missing abstraction, wrong pattern |

Step 4.3 - Identify Contributing Factors:

Root cause is necessary but often not sufficient. What else contributed?

## Contributing Factors

| Factor | How It Contributed |
|--------|-------------------|
| {missing test} | Would have caught this |
| {unclear documentation} | Led to wrong assumption |
| {recent refactor} | Introduced the regression |

Step 4.4 - Check Learnings:

# Has this type of issue occurred before?
grep -r "{root cause keywords}" docs/learnings/ 2>/dev/null || echo "No matching learnings (or no learnings folder)"

If similar learning exists, reference it. If not, this might become a new learning.

Phase 5: Triage Decision

Step 5.1 - Assess Scope:

## Scope Assessment

- [ ] **Isolated** - Single function/method, <20 lines affected
- [ ] **Localized** - Single file or tightly coupled set of files
- [ ] **Cross-cutting** - Multiple components/services affected
- [ ] **Systemic** - Architectural flaw, affects many areas

Step 5.2 - Assess Complexity:

## Fix Complexity

- [ ] **Simple** - Clear fix, one approach, minimal risk
- [ ] **Moderate** - Clear fix but touches multiple places
- [ ] **Complex** - Multiple valid approaches, needs design decisions
- [ ] **Uncertain** - Root cause unclear or fix approach unknown

Step 5.3 - Apply Triage Matrix:

| Scope | Complexity | Action |
|-------|------------|--------|
| Isolated | Simple | Fix-in-Place |
| Isolated | Moderate | Fix-in-Place (with care) |
| Localized | Simple | Fix-in-Place or Targeted Beads |
| Localized | Moderate | Targeted Beads |
| Localized | Complex | Design Required |
| Cross-cutting | Any | Design Required |
| Systemic | Any | Design Required |
| Any | Uncertain | More Investigation or Design Required |
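
The matrix reads naturally as a lookup. The sketch below encodes it under two labeled assumptions that the matrix itself does not settle: an Uncertain complexity is checked before scope, and Isolated + Complex (which has no row) maps to Design Required, following the Phase 6c rule that Complex issues need design. The function name `triage` is hypothetical.

```python
SCOPES = {"isolated", "localized", "cross-cutting", "systemic"}
COMPLEXITIES = {"simple", "moderate", "complex", "uncertain"}

# Rows taken directly from the triage matrix.
MATRIX = {
    ("isolated", "simple"): "Fix-in-Place",
    ("isolated", "moderate"): "Fix-in-Place (with care)",
    ("localized", "simple"): "Fix-in-Place or Targeted Beads",
    ("localized", "moderate"): "Targeted Beads",
    ("localized", "complex"): "Design Required",
}

# Assumption: not a row in the matrix; inferred from the Phase 6c rule.
ASSUMED = {("isolated", "complex"): "Design Required"}


def triage(scope, complexity):
    """Map a scope/complexity assessment to the matrix action."""
    scope, complexity = scope.strip().lower(), complexity.strip().lower()
    if scope not in SCOPES or complexity not in COMPLEXITIES:
        raise ValueError(f"unknown assessment: {scope!r}, {complexity!r}")
    # Assumption: the "Any / Uncertain" row takes precedence over scope rows.
    if complexity == "uncertain":
        return "More Investigation or Design Required"
    if scope in ("cross-cutting", "systemic"):
        return "Design Required"
    key = (scope, complexity)
    return MATRIX.get(key) or ASSUMED[key]
```

For example, `triage("Localized", "Moderate")` returns `"Targeted Beads"`. The lookup only applies the decision; the scope and complexity assessments still need honest judgment.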

Step 5.4 - Document Triage Decision:

## Triage Decision

**Scope:** {isolated/localized/cross-cutting/systemic}
**Complexity:** {simple/moderate/complex/uncertain}
**Decision:** {Fix-in-Place / Targeted Beads / Design Required}

**Rationale:**
{Why this is the right response level}

Phase 6a: Fix-in-Place (Simple Bugs)

Use this path when: Isolated scope + Simple/Moderate complexity, or Localized scope + Simple complexity

Step 6a.1 - Design the Fix:

## Proposed Fix

### Summary
{1-2 sentence description of the fix}

### Changes Required
| File | Change |
|------|--------|
| {file} | {what changes} |

### Why This Fixes It
{Connect fix to root cause}

### Risk Assessment
- Regression risk: Low / Medium / High
- Side effects: {any potential}
- Reversibility: Easy (can revert commit)

Step 6a.2 - Present to User:

## Fix Proposal

**Root Cause:** {summary}
**Proposed Fix:** {summary}

### Code Change
{Show the specific change with context}

**Shall I apply this fix?**
- "yes" / "apply" → Implement the fix
- "modify" → Adjust approach based on feedback
- "escalate" → This needs more design work

Step 6a.3 - Apply Fix (with approval):

1. Make the code change
2. Run relevant tests
3. Verify the fix resolves the symptom

# Run tests to verify
{test_command}

# Verify symptom is resolved
# (reproduction steps should no longer trigger the bug)

Step 6a.4 - Post-Fix:

## Fix Applied

**Files changed:** {list}
**Tests passing:** Yes / No
**Symptom resolved:** Yes / No

### Commit Message

fix: {brief description}

Root cause: {1 sentence}
Fix: {1 sentence}

---

**Learning Opportunity?**

If this bug reveals something worth remembering:
- Pattern that prevents this class of bug
- Gotcha others might hit
- Missing test coverage

→ Offer: "This might be worth capturing. Run `/compound` to document the learning?"

Phase 6b: Targeted Beads (Medium Issues)

Use this path when: Localized scope + Moderate complexity OR clear multi-file fix

Step 6b.1 - Document Diagnostic Context:

## Diagnostic Context for Beads

### Root Cause
{Summary from Phase 4}

### Fix Approach
{High-level approach}

### Affected Files
| File | Required Change |
|------|-----------------|
| {file} | {change needed} |

### Verification
- Tests to run: {list}
- Manual verification: {steps}

Step 6b.2 - Create Beads:

Create focused beads for the fix:

## Bead: Fix {specific aspect}

**Objective:** {what this bead accomplishes}

**Context to load:**
- {file to read for context}
- This diagnostic report

**Success criteria:**
- {specific behavior fixed}
- Tests pass: {list}

**Approach:**
{Brief description of fix approach}

Step 6b.3 - Handoff:

## Ready for Execution

Created {N} beads to fix this issue.

**Next step:** Run `/execute` to implement the fix.

**Verification after execution:**
1. {reproduction steps should no longer fail}
2. {tests that should pass}

Phase 6c: Design Required (Complex Issues)

Use this path when: Cross-cutting/Systemic OR Complex/Uncertain

Step 6c.1 - Prepare Diagnostic Handoff:

The diagnosis becomes input to /brainstorm. Create a context document:

## Diagnostic Context for Design

### Problem Discovered

**Original symptom:** {what user reported}
**Root cause:** {what we found}
**Why design is needed:** {scope/complexity justification}

### Evidence Summary

**Affected areas:**
| Component | How Affected |
|-----------|--------------|
| {name} | {impact} |

**Key findings:**
- {finding 1}
- {finding 2}

### Constraints Discovered

- {constraint from investigation}
- {constraint from investigation}

### Questions for Design Phase

- {question that emerged}
- {approach decision needed}

### Relevant Learnings

| Learning | Relevance |
|----------|-----------|
| {from docs/learnings/} | {how it applies} |

Step 6c.2 - Present Handoff:

## Diagnosis Complete - Design Required

**Root Cause:** {summary}
**Why this needs design:** {reasoning}

This issue is too complex for a direct fix because:
- {reason 1}
- {reason 2}

### Recommended Next Step

Run `/brainstorm` with this context:

"{Feature/fix description based on root cause}"

The diagnostic context above will inform the design phase.

---

**Proceed?**
- "brainstorm" → Start design phase with this context
- "try fix anyway" → Attempt targeted fix (higher risk)
- "park" → Save findings for later

Phase 7: Self-Review

Before presenting findings to user, verify quality.

Diagnostic Self-Review Checklist:

## Self-Review

### Evidence Quality
- [ ] Symptom is clearly documented with actual vs expected
- [ ] Reproduction steps are complete and verified
- [ ] Git history reviewed for relevant changes
- [ ] Test status documented

### Isolation Quality
- [ ] Fault location narrowed to specific area
- [ ] Minimal reproducing case identified (if possible)
- [ ] Ruled out red herrings

### Root Cause Quality
- [ ] 5 Whys completed to genuine root cause
- [ ] Root cause explains symptom (not just correlation)
- [ ] Contributing factors identified
- [ ] Checked learnings for similar past issues

### Triage Quality
- [ ] Scope assessment is accurate
- [ ] Complexity assessment is honest
- [ ] Triage decision follows matrix
- [ ] Rationale is documented

### Fix/Handoff Quality
- [ ] (If fix) Change is minimal and targeted
- [ ] (If fix) Risk assessment completed
- [ ] (If escalation) Diagnostic context is complete
- [ ] (If escalation) Questions for next phase are clear
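
If the checklist is kept in the diagnosis report, the open items can be listed mechanically before presenting findings. A minimal sketch, assuming the report uses the `- [ ]` / `- [x]` checkbox syntax shown above; the function name `unchecked_items` is hypothetical.

```python
import re

# Matches Markdown task-list lines such as "- [ ] item" or "- [x] item".
CHECKBOX = re.compile(r"^\s*-\s*\[( |x|X)\]\s*(.+)$")


def unchecked_items(markdown):
    """Return the text of every checklist item that is still unchecked."""
    pending = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            pending.append(match.group(2).strip())
    return pending
```

An empty list means every box is ticked; it does not verify that the ticks are deserved.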

Working Document Structure

During diagnosis (temporary):

.diagnosis/{issue-slug}/
├── notes.md          # Working notes, scratch
├── evidence/         # Screenshots, logs
└── report.md         # Structured findings

After resolution:
- Delete .diagnosis/ folder
- If valuable learning: Run /compound to capture permanently
- If led to design: docs/designs/{feature}/ has permanent record

Note: Add .diagnosis/ to .gitignore if not already there.
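
The working folder and the .gitignore entry can be set up in one step. A minimal sketch, assuming the layout shown above; the function name `scaffold_diagnosis` and its parameters are hypothetical.

```python
from pathlib import Path


def scaffold_diagnosis(repo_root, issue_slug):
    """Create .diagnosis/{issue_slug}/ and make sure git ignores it.

    Safe to call repeatedly: existing files are left alone and the
    .gitignore entry is added at most once.
    """
    root = Path(repo_root)
    workdir = root / ".diagnosis" / issue_slug
    (workdir / "evidence").mkdir(parents=True, exist_ok=True)
    for name in ("notes.md", "report.md"):
        (workdir / name).touch(exist_ok=True)

    gitignore = root / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    entries = [line.strip() for line in existing.splitlines()]
    if ".diagnosis/" not in entries:
        # Start on a fresh line if the file does not end with a newline.
        prefix = "" if existing == "" or existing.endswith("\n") else "\n"
        with gitignore.open("a", encoding="utf-8") as handle:
            handle.write(f"{prefix}.diagnosis/\n")
    return workdir
```

Calling `scaffold_diagnosis(".", "{issue-slug}")` from the repository root creates the structure; cleanup after resolution remains a manual step.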

Presentation Templates

Simple Fix Complete

## Bug Fixed

**Symptom:** {original issue}
**Root Cause:** {1 sentence}
**Fix:** {1 sentence}

### Changes Made
- {file}: {change summary}

### Verified
- [x] Reproduction steps no longer trigger bug
- [x] Tests pass

### Commit
`{commit hash}` - {commit message}

---

**Learning opportunity?** If this bug reveals a pattern worth remembering, run `/compound`.

Escalation to Beads

## Diagnosis Complete

**Symptom:** {original issue}
**Root Cause:** {summary}
**Scope:** Localized ({N} files affected)

### Fix Approach
{High-level description}

### Beads Created
| Bead | Purpose |
|------|---------|
| {id} | {objective} |

---

**Next:** Run `/execute` to implement the fix.

Escalation to Brainstorm

## Diagnosis Complete - Design Required

**Symptom:** {original issue}
**Root Cause:** {summary}

### Why Design Is Needed
{This is too complex for a direct fix because...}

### Key Findings
- {finding}
- {finding}

### Questions for Design
- {question}
- {question}

---

**Next:** Run `/brainstorm {suggested feature description}`

Diagnostic context will inform the design phase.

Quality Standards

Evidence

- Symptom documented with actual vs expected
- Reproduction verified (or non-reproducibility documented)
- Git history and test status checked
- Learnings consulted

Isolation

- Fault location narrowed systematically
- Binary search or similar technique applied
- Minimal reproducing case sought

Root Cause

- 5 Whys completed genuinely
- Root cause explains symptom
- Contributing factors identified
- Appropriate category assigned

Triage

- Scope honestly assessed
- Complexity honestly assessed
- Decision follows matrix
- Rationale documented

Resolution

- Fix is minimal (no scope creep)
- Risk acknowledged
- Verification completed
- Learning opportunity offered

Anti-Patterns

❌ Fixing symptoms, not causes

"The null check here prevents the crash"
→ But WHY is it null? That's the real bug.

✅ Fixing root causes

"The object is null because initialization is skipped when X.
Fixed the initialization logic."

❌ Guessing without evidence

"I think the bug is probably in the authentication code"
→ Based on what evidence?

✅ Evidence-driven investigation

"Git blame shows auth code changed 3 days ago.
The symptom started appearing after that commit.
Reviewing that change..."

❌ Over-engineering the fix

"While fixing this, I also refactored the entire module
and added comprehensive error handling everywhere"

✅ Minimal targeted fix

"Fixed the specific null check that caused the crash.
The broader refactor could be valuable but should be
a separate brainstorm/design effort."

❌ Skipping reproduction

"The user says it crashes, let me look at the code..."
→ How will you know when it's fixed?

✅ Reproduce first

"Reproduced the crash with these steps: ...
Now I can verify the fix actually works."

❌ Premature escalation

"This seems complicated, let's do a full design"
→ Have you actually isolated the issue?

✅ Appropriate triage

"Isolated to single function, clear fix, low risk.
This is a fix-in-place, no need for design overhead."

Exit Signals

| Signal | Meaning | Action |
|--------|---------|--------|
| Fix applied | Simple bug resolved | Offer `/compound` for learning |
| Beads created | Medium fix ready | Proceed to `/execute` |
| Design required | Complex issue | Proceed to `/brainstorm` |
| Cannot reproduce | Insufficient info | Ask user for more details |
| Not a bug | Working as designed | Explain behavior to user |
| Park | Save for later | Document findings, deprioritize |

Integration Points

→ /compound

After any fix, offer learning capture:
- Gotchas discovered
- Missing test coverage identified
- Patterns that prevent this bug class

→ /execute

For targeted beads:
- Pass diagnostic context in bead descriptions
- Include verification criteria

→ /brainstorm

For design-required issues:
- Root cause becomes part of problem statement
- Evidence informs documentation foundation
- Questions identified feed into design exploration

← Learnings

Always check docs/learnings/ for similar past issues:
- Avoids re-diagnosing known problems
- Applies known solutions
- Identifies recurring patterns

Skill version: 1.0
Approach: Evidence-driven root cause analysis with adaptive triage

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf
