# izyanrajwani / systematic-debugging

# Install this skill:
npx skills add izyanrajwani/agent-skills-library --skill "systematic-debugging"

Installs a specific skill from a multi-skill repository.

# Description

Root cause analysis for debugging. Use when bugs, test failures, or unexpected behavior have non-obvious causes, or after multiple fix attempts have failed.

# SKILL.md


---
name: systematic-debugging
description: Root cause analysis for debugging. Use when bugs, test failures, or unexpected behavior have non-obvious causes, or after multiple fix attempts have failed.
---

# Systematic Debugging

Core principle: Find root cause before attempting fixes. Symptom fixes are failure.

NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST

## Phase 1: Root Cause Investigation

BEFORE attempting ANY fix:

1. Read Error Messages Carefully
   - Read stack traces completely
   - Note line numbers, file paths, error codes
   - Don't skip warnings

2. Reproduce Consistently
   - What are the exact steps?
   - If not reproducible → gather more data, don't guess

3. Check Recent Changes
   - Git diff, recent commits
   - New dependencies, config changes
   - Environmental differences

4. Gather Evidence in Multi-Component Systems

WHEN system has multiple components (CI β†’ build β†’ signing, API β†’ service β†’ database):

Add diagnostic instrumentation before proposing fixes:
```
For EACH component boundary:
- Log what data enters/exits component
- Verify environment/config propagation
- Check state at each layer

Run once to gather evidence β†’ analyze β†’ identify failing component
```

Example:
```bash
# Layer 1: Workflow
echo "=== Secrets available: ==="
echo "IDENTITY: ${IDENTITY:+SET}${IDENTITY:-UNSET}"

# Layer 2: Build script
env | grep IDENTITY || echo "IDENTITY not in environment"

# Layer 3: Signing
security find-identity -v
```

5. Trace Data Flow

See references/root-cause-tracing.md for backward tracing technique.

Quick version: Where does bad value originate? Trace up call chain until you find the source. Fix at source.
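A minimal bash sketch of that backward trace; the function names and config path are hypothetical stand-ins for your own call chain:

```bash
#!/usr/bin/env bash
# Hypothetical sketch: instrument each layer of a call chain to find where a
# bad value first appears. All names below are placeholders.

read_config() {
  # Suspected source: returns empty if config.ini lacks the key
  grep '^price=' config.ini 2>/dev/null | cut -d= -f2
}

get_price() {
  local price
  price=$(read_config)
  echo "TRACE get_price: price='${price}'" >&2
  echo "${price}"
}

compute_total() {
  local price
  price=$(get_price)
  echo "TRACE compute_total: price='${price}'" >&2
  echo $(( ${price:-0} * $1 ))
}

compute_total 3   # the first TRACE line showing an empty value marks the source
```

The first layer whose TRACE output already shows the bad value is where the fix belongs.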

## Phase 2: Pattern Analysis

  1. Find Working Examples - Similar working code in codebase
  2. Compare Against References - Read reference implementations COMPLETELY, don't skim
  3. Identify Differences - List every difference, don't assume "that can't matter"
  4. Understand Dependencies - Components, config, environment, assumptions
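A small sketch of how steps 1 and 3 might look in practice; the script paths are placeholders for a working reference and the broken code in your own repository:

```bash
# Find similar working code, then list every difference against it.
grep -rn "codesign" scripts/                              # locate working examples
diff -u scripts/sign-release.sh scripts/sign-nightly.sh   # every difference, not just the "obvious" ones
```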

## Phase 3: Hypothesis and Testing

  1. Form Single Hypothesis - "I think X is root cause because Y" - be specific
  2. Test Minimally - SMALLEST possible change, one variable at a time
  3. Verify - Worked β†’ Phase 4. Didn't work β†’ form NEW hypothesis, don't stack fixes
  4. When You Don't Know - Say so. Don't pretend.
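A minimal sketch of steps 1 and 2, continuing the signing example above; the exported variable and build script path are assumptions:

```bash
# Hypothesis: signing fails because IDENTITY is never exported to the build script.
# Smallest possible test: export that one variable, change nothing else, re-run.
export IDENTITY                  # single change under test
./scripts/build.sh               # placeholder path; same failing command as before
```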

## Phase 4: Implementation

1. Create Failing Test Case (see the sketch after this list)
   - Use the test-driven-development skill
   - MUST have before fixing

2. Implement Single Fix
   - ONE change at a time
   - No "while I'm here" improvements

3. Verify Fix
   - Test passes? Other tests still pass? Issue resolved?

4. If Fix Doesn't Work
   - Count attempts
   - If < 3: Return to Phase 1 with new information
   - If ≥ 3: Escalate (below)

## Escalation: 3+ Failed Fixes

Pattern indicating architectural problem:
- Each fix reveals new problems elsewhere
- Fixes require massive refactoring
- Shared state/coupling keeps surfacing

Action: STOP. Question fundamentals:
- Is this pattern fundamentally sound?
- Are we continuing through inertia?
- Refactor architecture vs. continue fixing symptoms?

Discuss with your human partner before more fix attempts. This is an architecture problem, not a failed hypothesis.

Red Flags β†’ STOP and Return to Phase 1

If you catch yourself thinking:
- "Quick fix for now, investigate later"
- "Just try changing X"
- "I'll skip the test"
- "It's probably X"
- "Pattern says X but I'll adapt it differently"
- Proposing solutions before tracing data flow
- "One more fix" after 2+ failures

## Human Signals You're Off Track

  • "Is that not happening?" β†’ You assumed without verifying
  • "Will it show us...?" β†’ You should have added evidence gathering
  • "Stop guessing" β†’ You're proposing fixes without understanding
  • "Ultrathink this" β†’ Question fundamentals
  • Frustrated "We're stuck?" β†’ Your approach isn't working

Response: Return to Phase 1.

## Supporting Techniques

Reference files in references/:
- root-cause-tracing.md - Trace bugs backward through call stack
- defense-in-depth.md - Add validation at multiple layers after finding root cause
- condition-based-waiting.md - Replace arbitrary timeouts with condition polling
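Two minimal sketches of those ideas, reusing the CI → build → signing example; all paths and values are placeholders. Defense in depth, applied after the root cause is fixed, asserts the same invariant at every layer so a regression fails loudly at the layer that breaks:

```bash
# Layer 1: workflow step
test -n "${IDENTITY:-}" || { echo "workflow: IDENTITY secret missing" >&2; exit 1; }

# Layer 2: build script
test -n "${IDENTITY:-}" || { echo "build script: IDENTITY not exported" >&2; exit 1; }

# Layer 3: signing step -- fail here rather than produce an unsigned artifact
test -n "${IDENTITY:-}" || { echo "signing: IDENTITY unavailable" >&2; exit 1; }
```

Condition-based waiting polls for the actual readiness condition instead of sleeping an arbitrary number of seconds; the health URL and 30-second deadline are assumptions:

```bash
deadline=$(( $(date +%s) + 30 ))
until curl -fsS http://localhost:8080/health > /dev/null 2>&1; do
  if [ "$(date +%s)" -ge "${deadline}" ]; then
    echo "Timed out waiting for the service" >&2
    exit 1
  fi
  sleep 1
done
echo "Service is ready"
```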

Related skills:
- test-driven-development - Creating failing test case (Phase 4)
- verification-before-completion - Verify fix before claiming success

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.