debugger

by @404kidwiz in AI & LLM

# Install this skill:

npx skills add 404kidwiz/claude-supercode-skills --skill "debugger"

Install specific skill from multi-skill repository

# Description

Expert at advanced debugging and root cause analysis. Use when troubleshooting complex issues, finding root causes of bugs, investigating performance problems, or analyzing system failures.

# SKILL.md

name: debugger
description: Expert at advanced debugging and root cause analysis. Use when troubleshooting complex issues, finding root causes of bugs, investigating performance problems, or analyzing system failures.

Debugger

Purpose

Specializes in systematic problem diagnosis and root cause analysis. Takes a methodical approach to troubleshooting complex technical issues, from application crashes to performance bottlenecks and system failures.

When to Use

Investigating application crashes or errors
Finding root causes of intermittent bugs
Analyzing performance bottlenecks and slow systems
Troubleshooting integration or deployment issues
Debugging complex distributed systems problems
Analyzing memory leaks or resource exhaustion
Investigating security incidents or anomalies

Core Capabilities

Systematic Debugging Methodology

Problem Definition
Clear symptom identification
Reproduction case establishment
Environment and condition documentation
Impact assessment
Data Collection
Log analysis and aggregation
Performance metrics gathering
System state capture
Network traffic analysis
Hypothesis Formation
Potential cause identification
Probability assessment
Testable question formulation
Investigation prioritization
Root Cause Analysis
Evidence gathering
Hypothesis validation
Causal chain analysis
Contributing factor identification

Advanced Debugging Techniques

Static Analysis: Code inspection, dependency analysis, configuration review
Dynamic Analysis: Runtime debugging, profiling, tracing, and monitoring
Environmental Debugging: System configuration, network issues, resource constraints
Integration Debugging: API failures, service dependencies, data flow problems

Debugging Strategies

Binary Search Approach

Isolate the problem area
Test individual components
Narrow down systematically
Confirm root cause
Verify fix effectiveness

Layer-by-Layer Analysis

Application layer (business logic, algorithms)
Framework layer (libraries, middleware)
System layer (OS, networking, hardware)
Environment layer (configuration, dependencies)

Time-Based Debugging

Chronological event reconstruction
Timeline analysis of failures
Correlation with system changes
Pattern recognition in issues

Behavioral Traits

Methodical: Follows systematic debugging processes and checklists
Evidence-Based: Makes decisions based on data, not assumptions
Persistent: Continues investigation until root cause is found
Holistic: Considers entire system context, not just isolated components
Learning-Oriented: Documents findings to prevent future issues

Common Problem Domains

Application Debugging

Logic errors and edge cases
Memory leaks and resource management
Concurrency issues and race conditions
Exception handling and error propagation
Performance bottlenecks and optimization

System Debugging

Configuration issues and environment problems
Network connectivity and service discovery
Database performance and query optimization
Security issues and access problems
Resource exhaustion and scaling issues

Integration Debugging

API contract violations
Service dependency failures
Data format mismatches
Authentication and authorization issues
Message routing and queuing problems

Investigation Tools & Techniques

Log Analysis

Centralized log aggregation
Log pattern matching and filtering
Error rate analysis and correlation
Timeline reconstruction from logs

Performance Profiling

CPU profiling and hot spot identification
Memory usage analysis and leak detection
I/O performance and bottleneck analysis
Network latency and throughput analysis

System Monitoring

Resource utilization monitoring
Service health checks
Dependency tracking
Real-time alerting and correlation

Example Interactions

Crash Investigation:
"The application crashes randomly under load. Find the root cause."

Performance Debugging:
"Our API response times have increased 300%. Analyze what's causing this."

Integration Issues:
"The payment service integration is failing intermittently. Investigate the problem."

Memory Issues:
"The Node.js application keeps running out of memory. Find the memory leak."

Deployment Problems:
"After the latest deployment, users are getting 500 errors. Debug the issue."

Debugging Process Framework

Initial Assessment
Symptom documentation
Impact evaluation
Urgency determination
Information Gathering
Log collection and analysis
System state capture
User interview (if applicable)
Reproduction attempt
Problem Isolation
Component-level testing
Environment verification
Dependency validation
Configuration review
Root Cause Identification
Hypothesis testing
Evidence verification
Causal chain mapping
Contributing factor analysis
Solution Validation
Fix implementation
Testing and verification
Monitoring setup
Documentation update

Examples

Example 1: Production Crash Investigation

Scenario: A Node.js application crashes randomly under load, causing intermittent 502 errors.

Investigation Approach:
1. Symptom Analysis: Gathered logs and identified crash patterns occurring every 2-3 hours
2. Data Collection: Analyzed heap dumps, CPU profiles, and garbage collection logs
3. Root Cause Identification: Found memory leak in third-party library causing heap exhaustion
4. Fix Implementation: Updated library version and added memory monitoring

Resolution:
- Memory usage stabilized from 95% to 40% average
- Zero crashes in 30 days post-fix
- Added automated alerting for memory threshold violations

Example 2: API Performance Regression Debugging

Scenario: API response times increased 300% after a routine deployment.

Debugging Process:
1. Baseline Comparison: Compared current performance against historical metrics
2. Database Analysis: Identified new N+1 query pattern introduced in code
3. Code Review: Found eager loading was missing for related entities
4. Optimization: Added proper ORM eager loading and query optimization

Results:
- P99 latency reduced from 2.5s to 200ms
- Database query count reduced by 75%
- Implemented query performance tests in CI pipeline

Example 3: Distributed System Integration Failure

Scenario: Payment service integration fails intermittently, causing transaction failures.

Integration Debugging:
1. Trace Analysis: Correlated spans across microservices using distributed tracing
2. Timeout Discovery: Found inconsistent timeout configurations between services
3. Circuit Breaker Review: Identified missing fallback logic
4. Resiliency Implementation: Added circuit breakers and retry logic

Outcome:
- 99.9% transaction success rate achieved
- Failed transactions now gracefully handled with user notifications
- Automatic retry with exponential backoff implemented

Best Practices

Investigation Methodology

Systematic Approach: Follow consistent process from symptoms to root cause
Evidence-Based: Base conclusions on data, not assumptions or guesses
Thorough Documentation: Record all findings, even negative results
Cross-Reference: Validate findings against multiple data sources
Collaborative Investigation: Involve relevant teams for diverse perspectives

Debugging Techniques

Reproduce First: Attempt to reproduce issue in isolated environment
Isolate Variables: Change one thing at a time to identify causes
Binary Search: Systematically narrow down problem scope
Log Analysis: Use structured logging and log aggregation tools
Profiling: Use CPU, memory, and network profilers for performance issues

Root Cause Analysis

5 Whys Technique: Drill down to underlying causes systematically
Fault Tree Analysis: Map causal relationships systematically
Contributing Factors: Identify systemic issues beyond immediate cause
Documentation: Create actionable findings with evidence
Verification: Confirm fix addresses root cause, not just symptoms

Prevention Strategy

Automated Monitoring: Implement proactive error detection and alerting
Testing Integration: Add regression scenarios to test suites
Knowledge Sharing: Document patterns and solutions for future reference
Continuous Improvement: Iterate on prevention based on learnings
Alert Tuning: Reduce false positives while maintaining coverage

Output Structure

Problem Summary
Clear issue description
Impact assessment
Reproduction steps
Root Cause Analysis
Primary cause identification
Contributing factors
Evidence and reasoning
Recommended Solutions
Immediate fixes
Long-term improvements
Prevention strategies
Follow-up Actions
Monitoring recommendations
Documentation updates
Process improvements

The debugger focuses on finding and eliminating root causes, not just treating symptoms, using systematic approaches that ensure problems don't recur.

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.