williamzujkowski

Incident Response Playbook Generator

# Install this skill:
npx skills add williamzujkowski/cognitive-toolworks --skill "Incident Response Playbook Generator"

Install specific skill from multi-skill repository

# Description

Generate incident response playbooks for security incidents, outages, and disaster recovery with NIST SP 800-61 compliance and escalation paths.

# SKILL.md


name: "Incident Response Playbook Generator"
slug: "resilience-incident-generator"
description: "Generate incident response playbooks for security incidents, outages, and disaster recovery with NIST SP 800-61 compliance and escalation paths."
capabilities:
- Security incident response playbook generation
- Production outage runbook creation
- Disaster recovery scenario planning
- Escalation matrix design
- Post-mortem template generation
- NIST SP 800-61 lifecycle compliance
- On-call rotation and paging integration
- Communication plan templates
inputs:
- incident_type: "security | outage | disaster-recovery | data-breach | ransomware | ddos | service-degradation (string)"
- severity_level: "P0 (critical) | P1 (high) | P2 (medium) | P3 (low) (string, default: P1)"
- service_context: "service name, architecture, dependencies (object, optional)"
- compliance_requirements: "NIST, SOC2, HIPAA, PCI-DSS (array, optional)"
- tier: "T1 (template) | T2 (detailed playbook) (string, default: T1)"
outputs:
- playbook: "NIST SP 800-61 structured playbook with phases"
- escalation_matrix: "contact list with escalation thresholds"
- runbook: "step-by-step remediation procedures"
- post_mortem_template: "structured incident report template"
- communication_plan: "stakeholder notification templates"
keywords:
- incident-response
- disaster-recovery
- playbook
- runbook
- nist-800-61
- security-incident
- outage
- escalation
- post-mortem
- on-call
version: "1.0.0"
owner: "cognitive-toolworks"
license: "MIT"
security: "Public; no secrets or PII; safe for open repositories"
links:
- https://csrc.nist.gov/publications/detail/sp/800-61/rev-2/final
- https://www.atlassian.com/incident-management/incident-response
- https://response.pagerduty.com/
- https://www.sans.org/white-papers/33901/
- https://cloud.google.com/architecture/incident-response
- https://incidentresponse.com/playbooks/


Purpose & When-To-Use

Trigger conditions:
- Security incident detected (malware, data breach, unauthorized access, ransomware, DDoS)
- Production outage or service degradation impacting customers
- Disaster recovery event (data center failure, regional outage, natural disaster)
- Post-incident review requiring playbook formalization
- Compliance requirement to document incident response procedures (SOC2, FedRAMP, HIPAA, PCI-DSS)
- New service launch requiring incident runbooks
- On-call rotation setup needing escalation paths

Not for:
- Real-time incident coordination (use incident management platforms)
- Automated incident detection (use monitoring/alerting systems)
- Forensic analysis execution (provides methodology only)
- Legal incident disclosure decisions (consult legal counsel)


Pre-Checks

Time normalization:
- Compute NOW_ET using NIST/time.gov semantics (America/New_York, ISO-8601): 2025-10-25T21:30:36-04:00
- Use NOW_ET for all citation access dates
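
The timestamp format above can be reproduced with the Python standard library. This is a minimal sketch: the function name `now_et` is hypothetical, and it reads the local system clock rather than querying time.gov, which is an assumption about how "NIST/time.gov semantics" is meant.

```python
from datetime import datetime
from typing import Optional
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def now_et(moment: Optional[datetime] = None) -> str:
    """Return an ISO-8601 timestamp in America/New_York, second precision.

    `moment` should be timezone-aware; when omitted, the system clock is used
    (assumption: the host clock is synchronized, e.g. via NTP).
    """
    if moment is None:
        moment = datetime.now(tz=EASTERN)
    return moment.astimezone(EASTERN).replace(microsecond=0).isoformat()
```

Passing an explicit `moment` keeps the helper testable; calling `now_et()` with no argument stamps the current time.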

Input validation:
- incident_type must be: security, outage, disaster-recovery, data-breach, ransomware, ddos, service-degradation
- severity_level must be: P0, P1, P2, or P3
- service_context (if provided) must include: service_name, team_owner, dependencies
- compliance_requirements must be valid framework identifiers
- tier must be: T1 or T2
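
The checks above could be expressed as a single validation pass. The sketch below is illustrative only: `validate_inputs` and the constant names are hypothetical, and the set of accepted framework identifiers is an assumption (the four listed under inputs plus FedRAMP, which the decision rules mention).

```python
VALID_INCIDENT_TYPES = {
    "security", "outage", "disaster-recovery", "data-breach",
    "ransomware", "ddos", "service-degradation",
}
VALID_SEVERITIES = {"P0", "P1", "P2", "P3"}
VALID_TIERS = {"T1", "T2"}
# Assumption: which identifiers count as "valid" is not spelled out in the text.
KNOWN_FRAMEWORKS = {"NIST", "SOC2", "HIPAA", "PCI-DSS", "FedRAMP"}
REQUIRED_CONTEXT_KEYS = {"service_name", "team_owner", "dependencies"}


def validate_inputs(request: dict) -> list:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    if request.get("incident_type") not in VALID_INCIDENT_TYPES:
        errors.append("incident_type is missing or not a supported value")
    # Defaults follow the inputs section: severity_level=P1, tier=T1.
    if request.get("severity_level", "P1") not in VALID_SEVERITIES:
        errors.append("severity_level must be P0, P1, P2, or P3")
    if request.get("tier", "T1") not in VALID_TIERS:
        errors.append("tier must be T1 or T2")
    context = request.get("service_context")
    if context is not None:
        missing = REQUIRED_CONTEXT_KEYS - set(context)
        if missing:
            errors.append(f"service_context missing keys: {sorted(missing)}")
    unknown = set(request.get("compliance_requirements", [])) - KNOWN_FRAMEWORKS
    if unknown:
        errors.append(f"unknown compliance frameworks: {sorted(unknown)}")
    return errors
```

Returning every error at once, rather than raising on the first, lets a caller report all input problems in one round trip.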

Source freshness:
- NIST SP 800-61 Rev 2 (accessed 2025-10-25T21:30:36-04:00): https://csrc.nist.gov/publications/detail/sp/800-61/rev-2/final - Computer Security Incident Handling Guide
- Atlassian Incident Management (accessed 2025-10-25T21:30:36-04:00): https://www.atlassian.com/incident-management/incident-response
- PagerDuty Incident Response (accessed 2025-10-25T21:30:36-04:00): https://response.pagerduty.com/
- SANS Incident Handler's Handbook (accessed 2025-10-25T21:30:36-04:00): https://www.sans.org/white-papers/33901/

Dependency validation:
- Security incidents leverage: security-assessment-framework (for threat context)
- No hard dependencies for T1 (template generation)


Procedure

T1: Playbook Template (≤2k tokens)

Fast path for 80% of standard playbook needs:

  1. Incident classification:
     - Map incident_type to NIST SP 800-61 category
     - Assign severity level based on severity_level input
     - Identify compliance requirements (if any)

  2. Generate playbook structure:
     - Phase 1: Preparation - Pre-incident setup (tools, contacts, access)
     - Phase 2: Detection & Analysis - Incident identification and scoping
     - Phase 3: Containment - Short-term and long-term containment steps
     - Phase 4: Eradication - Root cause removal
     - Phase 5: Recovery - Service restoration and validation
     - Phase 6: Post-Incident Activity - Lessons learned and documentation

  3. Create escalation matrix:
     - Define escalation thresholds by severity (P0 ≤15min, P1 ≤30min, P2 ≤2hr, P3 ≤1day)
     - Template contact roles: Incident Commander, Tech Lead, Communications Lead, Executive Sponsor
     - Include paging instructions (PagerDuty, Opsgenie, custom)

  4. Output deliverables:
     - Playbook markdown document (NIST SP 800-61 aligned)
     - Escalation matrix CSV/JSON
     - Post-mortem template with 5 Whys framework

Token budget: T1 ≤2k tokens (template only, no deep context)
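
As a rough illustration of the escalation matrix deliverable, the sketch below builds one row per template role and serializes the result to CSV or JSON. The function names are hypothetical, the field names follow the output contract, and applying the same threshold to every role is a simplifying assumption.

```python
import csv
import io
import json

# Thresholds and roles are taken from the T1 procedure text.
ESCALATION_THRESHOLDS = {"P0": "15min", "P1": "30min", "P2": "2hr", "P3": "1day"}
ROLES = ["Incident Commander", "Tech Lead", "Communications Lead", "Executive Sponsor"]
FIELDS = ["role", "contact_method", "escalation_threshold"]


def build_escalation_matrix(severity: str, contact_method: str = "PagerDuty") -> list:
    """Return one matrix row per template role for the given severity."""
    threshold = ESCALATION_THRESHOLDS[severity]
    return [
        {"role": role, "contact_method": contact_method, "escalation_threshold": threshold}
        for role in ROLES
    ]


def to_csv(rows: list) -> str:
    """Serialize matrix rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows: list) -> str:
    """Serialize matrix rows to indented JSON text."""
    return json.dumps(rows, indent=2)
```

In practice each role would likely carry its own contact method and threshold; the rows are plain dictionaries so they can be edited before serialization.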


T2: Detailed Playbook with Service Context (≤6k tokens)

Extended path for service-specific, compliance-driven playbooks:

  1. Enhanced incident analysis (extends T1):
     - Analyze service_context to identify critical dependencies
     - Map service architecture to failure modes (single points of failure, cascading failures)
     - Identify compliance-specific requirements (HIPAA breach notification timelines, PCI-DSS forensic preservation)

  2. Service-specific runbook generation:
     - Create detailed remediation steps for common failure scenarios
     - Include rollback procedures and health check validation
     - Add monitoring query examples (Prometheus, Datadog, CloudWatch)
     - Document safe restart procedures and dependency startup order

  3. Compliance integration:
     - NIST SP 800-61: Map playbook phases to incident handling lifecycle
     - SOC2 CC7.3: Document incident response communications
     - HIPAA: Add breach notification timelines (60-day requirement)
     - PCI-DSS 12.10: Include forensic evidence preservation steps
     - FedRAMP: Reference IR-4 and IR-6 controls from NIST SP 800-53

  4. Communication plan generation:
     - Internal stakeholder notification templates (engineering, support, executives)
     - External communication templates (customer status page, regulatory notifications)
     - Severity-based communication cadence (P0: every 30min, P1: hourly, P2: daily)

  5. Post-mortem template customization:
     - Include service-specific incident timeline
     - Root cause analysis framework (5 Whys, Fishbone diagram)
     - Action items with owners and due dates
     - Metrics: MTTD (Mean Time to Detect), MTTR (Mean Time to Resolve), customer impact

  6. Decision rules for escalation:
     - Auto-escalate if incident duration exceeds: P0=30min, P1=2hr, P2=8hr
     - Auto-escalate if customer impact exceeds: P0=any, P1=10%, P2=25%
     - Invoke disaster recovery if: data center failure, regional outage, ransomware with data encryption

Token budget: T2 ≤6k tokens (includes service context, compliance, and communication plans)
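
The T2 escalation decision rules can be sketched as two small predicates. All names here are hypothetical, "P0=any" is read as "any customer impact above zero", and the disaster recovery trigger labels are invented identifiers for the three conditions named in the text.

```python
from datetime import timedelta

# Limits copied from the T2 decision rules; P3 has no auto-escalation rule there.
DURATION_LIMITS = {
    "P0": timedelta(minutes=30),
    "P1": timedelta(hours=2),
    "P2": timedelta(hours=8),
}
IMPACT_LIMITS_PCT = {"P0": 0.0, "P1": 10.0, "P2": 25.0}

# Hypothetical labels for the three disaster recovery conditions.
DR_TRIGGERS = {"data-center-failure", "regional-outage", "ransomware-with-encryption"}


def should_auto_escalate(severity: str, elapsed: timedelta, customer_impact_pct: float) -> bool:
    """True if either the duration or the customer impact limit is exceeded."""
    if severity not in DURATION_LIMITS:
        return False  # No auto-escalation rule is defined for this severity.
    over_time = elapsed > DURATION_LIMITS[severity]
    over_impact = customer_impact_pct > IMPACT_LIMITS_PCT[severity]
    return over_time or over_impact


def should_invoke_dr(condition: str) -> bool:
    """True if the observed condition is one of the listed DR triggers."""
    return condition in DR_TRIGGERS
```

Both comparisons are strict, so a P1 incident at exactly 10% impact does not escalate; whether "exceeds" should include the boundary is a judgment call the text leaves open.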


Decision Rules

Incident type routing:
- security | data-breach | ransomware → Include forensic preservation steps, consider invoking security-assessment-framework
- outage | service-degradation → Focus on MTTR reduction, rollback procedures, health checks
- disaster-recovery → Invoke DR site failover procedures, RTO/RPO validation
- ddos → Include traffic analysis, rate limiting, upstream provider coordination
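
One way to encode the routing rules is a lookup table keyed by incident type. This is a sketch under assumptions: `INCIDENT_ROUTES` and `route_incident` are hypothetical names, and the focus-area strings paraphrase the rules above.

```python
# Shared focus areas for the grouped incident types in the routing rules.
_SECURITY = ("forensic preservation", "optional security-assessment-framework lookup")
_AVAILABILITY = ("MTTR reduction", "rollback procedures", "health checks")

INCIDENT_ROUTES = {
    "security": _SECURITY,
    "data-breach": _SECURITY,
    "ransomware": _SECURITY,
    "outage": _AVAILABILITY,
    "service-degradation": _AVAILABILITY,
    "disaster-recovery": ("DR site failover", "RTO/RPO validation"),
    "ddos": ("traffic analysis", "rate limiting", "upstream provider coordination"),
}


def route_incident(incident_type: str) -> tuple:
    """Return the focus areas for an incident type, or raise ValueError."""
    if incident_type not in INCIDENT_ROUTES:
        raise ValueError(f"unknown incident_type: {incident_type!r}")
    return INCIDENT_ROUTES[incident_type]
```

A table keeps the mapping auditable and makes it obvious when a new incident type has no route.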

Severity thresholds (auto-escalation triggers):
- P0 (critical): Customer-facing impact, data breach, ransomware → Escalate to VP/C-level within 15 minutes
- P1 (high): Partial service degradation, security incident contained → Escalate to Director within 30 minutes
- P2 (medium): Internal systems impacted, no customer impact → Escalate to Manager within 2 hours
- P3 (low): Minor issues, no service impact → Standard on-call escalation
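
The severity thresholds translate naturally into a deadline calculation. In this sketch the names are hypothetical, and starting the clock at detection time is an assumption, since the text does not say when the window begins.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class EscalationRule(NamedTuple):
    audience: str
    within: Optional[timedelta]  # None means no fixed deadline


SEVERITY_RULES = {
    "P0": EscalationRule("VP/C-level", timedelta(minutes=15)),
    "P1": EscalationRule("Director", timedelta(minutes=30)),
    "P2": EscalationRule("Manager", timedelta(hours=2)),
    "P3": EscalationRule("Standard on-call", None),
}


def escalation_deadline(severity: str, detected_at: datetime) -> Optional[datetime]:
    """Return the latest time by which escalation should happen, if any."""
    rule = SEVERITY_RULES[severity]
    if rule.within is None:
        return None  # P3 follows the normal on-call path with no fixed window.
    return detected_at + rule.within
```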

Compliance-driven requirements:
- HIPAA data breach → Invoke 60-day breach notification requirement, add HHS reporting steps
- PCI-DSS incident → Add forensic investigation and PCI QSA notification
- SOC2 incident → Document communications per CC7.3 requirement
- FedRAMP incident → Report to Agency within 1 hour for P0 incidents per IR-6(1)
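
For the HIPAA rule, a playbook step might compute the outer notification date. The sketch below assumes the 60-day window is counted in calendar days from the date the breach was discovered; the function names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: 60 calendar days, counted from the discovery date.
HIPAA_NOTIFICATION_WINDOW = timedelta(days=60)


def hipaa_notification_deadline(discovered_on: date) -> date:
    """Return the last calendar day of the 60-day notification window."""
    return discovered_on + HIPAA_NOTIFICATION_WINDOW


def days_remaining(discovered_on: date, today: date) -> int:
    """Return days left before the deadline (negative once it has passed)."""
    return (hipaa_notification_deadline(discovered_on) - today).days
```

For a breach discovered on 2025-10-25, `hipaa_notification_deadline` returns 2025-12-24.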

Abort conditions:
- If incident_type is unknown/invalid → Request clarification
- If service_context missing for T2 → Downgrade to T1 or request architecture details
- If compliance requirements conflict → Flag for manual review and legal consultation
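
The abort conditions can be sketched as a function that returns the next action. The action labels and the `compliance_conflict` flag are hypothetical; how a conflict between frameworks is detected is outside this sketch.

```python
from typing import Optional

VALID_INCIDENT_TYPES = frozenset({
    "security", "outage", "disaster-recovery", "data-breach",
    "ransomware", "ddos", "service-degradation",
})


def abort_action(
    incident_type: str,
    tier: str,
    service_context: Optional[dict],
    compliance_conflict: bool = False,
) -> str:
    """Map the abort conditions to a single next action."""
    if incident_type not in VALID_INCIDENT_TYPES:
        return "request-clarification"
    if compliance_conflict:
        return "manual-review"
    if tier == "T2" and not service_context:
        # The text also allows requesting architecture details instead.
        return "downgrade-to-T1"
    return "proceed"
```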


Output Contract

Required fields (all tiers):

playbook:
  incident_type: string
  severity: "P0" | "P1" | "P2" | "P3"
  nist_phases:
    - phase: "Preparation" | "Detection & Analysis" | "Containment" | "Eradication" | "Recovery" | "Post-Incident"
      steps: array[string]
      duration_estimate: string
      success_criteria: string
  escalation_matrix:
    - role: string
      contact_method: string
      escalation_threshold: string
  post_mortem_template:
    incident_summary: string
    timeline: array[{timestamp, event, actor}]
    root_cause: string
    impact: {customers_affected, duration, revenue_impact}
    action_items: array[{owner, description, due_date, priority}]

runbook: # T2 only
  service_name: string
  failure_modes: array[{scenario, symptoms, remediation_steps}]
  rollback_procedure: array[string]
  health_checks: array[{name, command, expected_result}]
  dependencies: array[{service, startup_order, health_endpoint}]

communication_plan: # T2 only
  internal_stakeholders: array[{role, notification_threshold, channel}]
  external_communication: array[{audience, template, approval_required}]
  status_page_updates: {cadence, template}

Format: JSON or YAML (consumer specifies)
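
To make the contract concrete, the required playbook fields could be modelled as dataclasses and serialized to JSON. This is a partial sketch: the class names are hypothetical, `post_mortem_template` and the T2-only sections are omitted for brevity, and YAML output is left out because it would need a third-party library.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Phase:
    phase: str
    steps: list = field(default_factory=list)
    duration_estimate: str = ""
    success_criteria: str = ""


@dataclass
class EscalationEntry:
    role: str
    contact_method: str
    escalation_threshold: str


@dataclass
class Playbook:
    incident_type: str
    severity: str
    nist_phases: list = field(default_factory=list)
    escalation_matrix: list = field(default_factory=list)


def playbook_to_json(playbook: Playbook) -> str:
    """Serialize under the top-level `playbook` key used by the contract."""
    return json.dumps({"playbook": asdict(playbook)}, indent=2)
```

A caller would build a `Playbook`, append `Phase` and `EscalationEntry` objects, and pass it to `playbook_to_json`; `asdict` recurses into the nested dataclasses.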

Guarantees:
- All playbooks follow NIST SP 800-61 Rev 2 incident handling lifecycle
- Escalation thresholds are severity-appropriate and time-bounded
- Post-mortem templates include 5 Whys or equivalent root cause analysis
- Compliance requirements mapped to specific playbook steps


Examples

Input:

{
  "incident_type": "data-breach",
  "severity_level": "P0",
  "compliance_requirements": ["HIPAA", "SOC2"],
  "tier": "T1"
}

Output (abbreviated):

playbook:
  incident_type: data-breach
  severity: P0
  nist_phases:
    - phase: Containment
      steps:
        - Isolate affected systems from network
        - Preserve forensic evidence (logs, memory dumps)
        - Revoke compromised credentials
      duration_estimate: 30-60 minutes
    - phase: Post-Incident
      steps:
        - HIPAA breach notification to HHS within 60 days
        - SOC2 CC7.3 communication documentation
  escalation_matrix:
    - {role: CISO, contact_method: PagerDuty, escalation_threshold: "15 min"}
    - {role: Legal, contact_method: Email, escalation_threshold: "30 min"}

Quality Gates

Token budgets:
- T1 ≤2k tokens: Template-based playbook generation, no deep service context
- T2 ≤6k tokens: Service-specific runbooks with compliance integration
- T3: Not implemented (incident response is sufficiently covered by T1/T2 tiers)

Safety:
- No embedded credentials or API keys in playbooks
- No PII in example scenarios
- Compliance requirements are technical controls only (not legal advice)

Auditability:
- All NIST SP 800-61 citations include access date = NOW_ET
- Compliance mappings traceable to source frameworks
- Escalation thresholds based on industry standards (PagerDuty, Atlassian)

Determinism:
- Same inputs → same playbook structure
- Escalation thresholds are severity-based and predictable
- NIST phases always in lifecycle order: Preparation → Detection → Containment → Eradication → Recovery → Post-Incident

Validation:
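- Playbook must include all six phases listed in the procedure (NIST SP 800-61 Rev 2 itself defines four phases and groups Containment, Eradication, and Recovery into one; the playbook splits that phase into three)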
- Escalation matrix must define contact methods and thresholds
- Post-mortem template must include timeline, root cause, and action items
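
These validation rules could be checked mechanically before a playbook is returned. The sketch below operates on the `playbook` object from the output contract, parsed into a dictionary; `validate_playbook` is a hypothetical name, and requiring the phases in exact lifecycle order combines this gate with the determinism rule above.

```python
REQUIRED_PHASES = [
    "Preparation", "Detection & Analysis", "Containment",
    "Eradication", "Recovery", "Post-Incident",
]
REQUIRED_POST_MORTEM_KEYS = ("timeline", "root_cause", "action_items")


def validate_playbook(playbook: dict) -> list:
    """Return a list of quality-gate violations (empty if the playbook passes)."""
    problems = []

    phases = [entry.get("phase") for entry in playbook.get("nist_phases", [])]
    if phases != REQUIRED_PHASES:
        problems.append("nist_phases must list all six phases in lifecycle order")

    matrix = playbook.get("escalation_matrix", [])
    incomplete = [
        entry for entry in matrix
        if not entry.get("contact_method") or not entry.get("escalation_threshold")
    ]
    if not matrix or incomplete:
        problems.append("escalation_matrix entries need contact_method and escalation_threshold")

    template = playbook.get("post_mortem_template", {})
    missing = [key for key in REQUIRED_POST_MORTEM_KEYS if key not in template]
    if missing:
        problems.append(f"post_mortem_template missing: {missing}")

    return problems
```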


Resources

Primary sources (NIST SP 800-61 compliance):
- NIST SP 800-61 Rev 2: Computer Security Incident Handling Guide (accessed 2025-10-25T21:30:36-04:00): https://csrc.nist.gov/publications/detail/sp/800-61/rev-2/final
- NIST SP 800-53 Rev 5: IR-4 (Incident Handling), IR-6 (Incident Reporting) (accessed 2025-10-25T21:30:36-04:00): https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final

Industry best practices:
- Atlassian Incident Management Handbook (accessed 2025-10-25T21:30:36-04:00): https://www.atlassian.com/incident-management/incident-response
- PagerDuty Incident Response Documentation (accessed 2025-10-25T21:30:36-04:00): https://response.pagerduty.com/
- Google SRE Book: Managing Incidents (accessed 2025-10-25T21:30:36-04:00): https://sre.google/sre-book/managing-incidents/
- SANS Incident Handler's Handbook (accessed 2025-10-25T21:30:36-04:00): https://www.sans.org/white-papers/33901/

Compliance frameworks:
- HIPAA Breach Notification Rule (accessed 2025-10-25T21:30:36-04:00): https://www.hhs.gov/hipaa/for-professionals/breach-notification/index.html
- PCI DSS v4.0 Requirement 12.10 (accessed 2025-10-25T21:30:36-04:00): https://www.pcisecuritystandards.org/document_library
- SOC2 Trust Services Criteria CC7.3 (accessed 2025-10-25T21:30:36-04:00): https://www.aicpa.org/interestareas/frc/assuranceadvisoryservices/aicpasoc2report.html

Templates and tools:
- See /skills/resilience-incident-generator/resources/ for:
  - playbook-template.md - NIST SP 800-61 aligned playbook structure
  - escalation-matrix.csv - Contact escalation template
  - post-mortem-template.md - 5 Whys root cause analysis template
  - runbook-template.md - Service-specific runbook structure
