# typed-holes-refactor
# Install this skill

Install this specific skill from the rand/cc-polymath multi-skill repository:

npx skills add rand/cc-polymath --skill "typed-holes-refactor"

# Description

Refactor codebases using Design by Typed Holes methodology - iterative, test-driven refactoring with formal hole resolution, constraint propagation, and continuous validation. Use when refactoring existing code, optimizing architecture, or consolidating technical debt through systematic hole-driven development.

# SKILL.md


---
name: typed-holes-refactor
description: Refactor codebases using Design by Typed Holes methodology - iterative, test-driven refactoring with formal hole resolution, constraint propagation, and continuous validation. Use when refactoring existing code, optimizing architecture, or consolidating technical debt through systematic hole-driven development.
---


Typed Holes Refactoring

Systematically refactor codebases using the Design by Typed Holes meta-framework: treat architectural unknowns as typed holes, resolve them iteratively with test-driven validation, and propagate constraints through dependency graphs.

Core Workflow

Phase 0: Hole Discovery & Setup

1. Create safe working branch:

git checkout -b refactor/typed-holes-v1
# CRITICAL: Never work in main, never touch .beads/ in main

2. Analyze current state and identify holes:

python scripts/discover_holes.py
# Creates REFACTOR_IR.md with hole catalog

The Refactor IR documents:
- Current State Holes: What's unknown about the current system?
- Refactor Holes: What needs resolution to reach the ideal state?
- Constraints: What must be preserved/improved/maintained?
- Dependencies: Which holes block which others?

3. Write baseline characterization tests:

Create tests/characterization/ to capture exact current behavior:

# tests/characterization/test_current_behavior.py
def test_api_contracts():
    """All public APIs must behave identically post-refactor"""
    for endpoint in discover_public_apis():
        old_result = run_current(endpoint, test_inputs)
        save_baseline(endpoint, old_result)

def test_performance_baselines():
    """Record current performance - don't regress"""
    baselines = measure_all_operations()
    save_json("baselines.json", baselines)

Run tests on main branch - they should all pass. These are your safety net.
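The helper functions used above (discover_public_apis, run_current, save_baseline, save_json, measure_all_operations) are project-specific and not provided by this skill. A minimal sketch of the baseline-persistence helpers, assuming JSON files under tests/characterization/baselines/, might look like this:

```python
# tests/characterization/baseline_utils.py
# Hedged sketch of the baseline helpers assumed above; paths and layout are illustrative.
import json
from pathlib import Path

BASELINE_DIR = Path("tests/characterization/baselines")

def save_json(name: str, data) -> None:
    """Write JSON-serializable data under the baseline directory."""
    BASELINE_DIR.mkdir(parents=True, exist_ok=True)
    (BASELINE_DIR / name).write_text(json.dumps(data, indent=2, sort_keys=True))

def save_baseline(key: str, result, reason: str | None = None) -> None:
    """Record observed behavior for one endpoint/operation, with an optional
    reason for when a baseline is intentionally updated later."""
    save_json(f"{key}.json", {"result": result, "reason": reason})

def load_baseline(key: str):
    """Load a previously captured baseline; fails loudly if it was never recorded."""
    return json.loads((BASELINE_DIR / f"{key}.json").read_text())["result"]
```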

Phase 1-N: Iterative Hole Resolution

For each hole (in dependency order):

1. Select next ready hole:

python scripts/next_hole.py
# Shows holes whose dependencies are resolved
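"Ready" here simply means every dependency is already resolved. A rough sketch of that selection logic (the hole dictionary shape is illustrative, not the script's actual data format):

```python
# Hedged sketch of dependency-based "ready" selection; the real next_hole.py may differ.
def ready_holes(holes: dict[str, dict]) -> list[str]:
    """Return IDs of open holes whose dependencies are all resolved.
    Example input: {"H1": {"status": "resolved", "deps": []},
                    "H2": {"status": "open", "deps": ["H1"]}}"""
    resolved = {hid for hid, meta in holes.items() if meta["status"] == "resolved"}
    return [
        hid for hid, meta in holes.items()
        if meta["status"] == "open" and set(meta.get("deps", [])) <= resolved
    ]
```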

2. Write validation tests FIRST (test-driven):

# tests/refactor/test_h{N}_resolution.py
def test_h{N}_resolved():
    """Define what 'resolved correctly' means"""
    # This should FAIL initially
    assert desired_state_achieved()

def test_h{N}_equivalence():
    """Ensure no behavioral regressions"""
    old_behavior = load_baseline()
    new_behavior = run_refactored()
    assert old_behavior == new_behavior

3. Implement resolution:

  • Refactor code to make tests pass
  • Keep characterization tests passing
  • Commit incrementally with clear messages

4. Validate resolution:

python scripts/validate_resolution.py H{N}
# Checks: tests pass, constraints satisfied, main untouched
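A hedged sketch of the kind of checks this step performs — resolution tests, characterization tests, and a guard against working on main. The actual validate_resolution.py may implement these differently:

```python
# Illustrative sketch only; not the actual contents of scripts/validate_resolution.py.
import subprocess

def validate_resolution(hole_id: str) -> bool:
    """Check a hole resolution: stay off main, pass the hole's resolution tests,
    and keep the characterization suite green."""
    branch = subprocess.run(
        ["git", "branch", "--show-current"], capture_output=True, text=True
    ).stdout.strip()
    if branch == "main":
        return False  # safety rule: never validate work done on main
    for target in (f"tests/refactor/test_{hole_id.lower()}_resolution.py",
                   "tests/characterization"):
        if subprocess.run(["pytest", target]).returncode != 0:
            return False
    return True
```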

5. Propagate constraints:

python scripts/propagate.py H{N}
# Updates dependent holes based on resolution

6. Document and commit:

git add .
git commit -m "Resolve H{N}: {description}

- Tests: tests/refactor/test_h{N}_*.py pass
- Constraints: {constraints satisfied}
- Propagates to: {dependent holes}"

Phase Final: Reporting

Generate comprehensive delta report:

python scripts/generate_report.py > REFACTOR_REPORT.md

Report includes:
- Hole resolution summary with validation evidence
- Metrics delta (LOC, complexity, coverage, performance)
- Behavioral analysis (intentional changes documented)
- Constraint validation (all satisfied)
- Risk assessment and migration guide
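As a small illustration of the metrics delta, the report can pair each metric's baseline and post-refactor value; the metric names and collection functions here are placeholders, not part of the skill's scripts:

```python
# Hedged sketch: assemble a before/after metrics delta for the report.
def metrics_delta(before: dict[str, float], after: dict[str, float]) -> dict[str, dict]:
    """Pair each metric's baseline and current value and compute the change,
    e.g. {"loc": {"before": 12000, "after": 9400, "delta": -2600}}."""
    return {
        name: {
            "before": before[name],
            "after": after.get(name, before[name]),
            "delta": after.get(name, before[name]) - before[name],
        }
        for name in before
    }
```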

Key Principles

1. Test-Driven Everything

  • Write validation criteria BEFORE implementing
  • Tests define "correct resolution"
  • Characterization tests are sacred - never let them fail

2. Hole-Driven Progress

  • Resolve holes in dependency order
  • Each resolution propagates constraints
  • Track everything formally in Refactor IR

3. Continuous Validation

Every commit must validate:
- ✅ Characterization tests pass (behavior preserved)
- ✅ Resolution tests pass (hole resolved correctly)
- ✅ Constraints satisfied
- ✅ Main branch untouched
- ✅ .beads/ intact in main

4. Safe by Construction

  • Work only in refactor branch
  • Main is read-only reference
  • Beads are untouchable historical artifacts

5. Formal Completeness

Design complete when:
- All holes resolved and validated
- All constraints satisfied
- All phase gates passed
- Metrics improved or maintained

Hole Quality Framework

SMART Criteria for Good Holes

Every hole must be:

  • Specific: Clear, bounded question with concrete answer
    • ✓ Good: "How should error handling work in the API layer?"
    • ✗ Bad: "How to improve the code?"

  • Measurable: Has testable validation criteria
    • ✓ Good: "Reduce duplication from 60% to <15%"
    • ✗ Bad: "Make code better"

  • Achievable: Can be resolved with available information
    • ✓ Good: "Extract parsing logic to separate module"
    • ✗ Bad: "Predict all future requirements"

  • Relevant: Blocks meaningful progress on refactoring
    • ✓ Good: "Define core interface (blocks 5 other holes)"
    • ✗ Bad: "Decide variable naming convention"

  • Typed: Clear type/structure for resolution
    • ✓ Good: interface Architecture = { layers: Layer[], rules: Rule[] }
    • ✗ Bad: "Some kind of structure?"
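To make "Typed" concrete, a hole can be represented as a small record with an explicit resolution type, testable validation criteria, dependencies, and a size estimate. This is a hedged illustration, not necessarily the schema that discover_holes.py or REFACTOR_IR.md actually uses:

```python
# Illustrative shape of a typed hole; not the literal REFACTOR_IR.md schema.
from dataclasses import dataclass, field

@dataclass
class Hole:
    id: str                       # e.g. "R2"
    question: str                 # the specific, bounded question
    resolution_type: str          # e.g. "ModuleMap = {module: [responsibilities]}"
    validation: list[str]         # testable criteria
    depends_on: list[str] = field(default_factory=list)
    size: str = "Small"           # Nano | Small | Medium | Large (see below)
    status: str = "open"          # open | in_progress | resolved

example = Hole(
    id="R2",
    question="How should modules be organized?",
    resolution_type="ModuleMap = {module: [responsibilities]}",
    validation=["no cross-layer imports", "dependency analysis passes"],
    depends_on=["R1"],
    size="Medium",
)
```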

Hole Estimation Framework

Size holes using these categories:

| Size   | Duration  | Characteristics            | Examples                                  |
|--------|-----------|----------------------------|-------------------------------------------|
| Nano   | 1-2 hours | Simple, mechanical changes | Rename files, update imports              |
| Small  | 4-8 hours | Single module refactor     | Extract class, consolidate functions      |
| Medium | 1-3 days  | Cross-module changes       | Define interfaces, reorganize packages    |
| Large  | 4-7 days  | Architecture changes       | Layer extraction, pattern implementation  |
| Epic   | >7 days   | SPLIT THIS HOLE            | Too large, break into smaller holes       |

Estimation Red Flags:
- More than 3 dependencies → Likely Medium+
- Unclear validation → Add time for discovery
- New patterns/tools → Add learning overhead

Hole Splitting Guidelines

Split a hole when:
1. Estimate exceeds 7 days
2. More than 5 dependencies
3. Validation criteria unclear
4. Multiple distinct concerns mixed

Splitting strategy:

Epic hole: "Refactor entire authentication system"
→ Split into:
  R10_auth_interface: Define new auth interface (Medium)
  R11_token_handling: Implement JWT tokens (Small)
  R12_session_management: Refactor sessions (Medium)
  R13_auth_middleware: Update middleware (Small)
  R14_auth_testing: Comprehensive test suite (Medium)

After splitting:
- Update dependencies in REFACTOR_IR.md
- Run python scripts/propagate.py to update graph
- Re-sync with beads: python scripts/holes_to_beads.py

Common Hole Types

Architecture Holes

"?R1_target_architecture": "What should the ideal structure be?"
"?R2_module_boundaries": "How should modules be organized?"
"?R3_abstraction_layers": "What layers/interfaces are needed?"

Validation: Architecture tests, dependency analysis, layer violation checks

Implementation Holes

"?R4_consolidation_targets": "What code should merge?"
"?R5_extraction_targets": "What code should split out?"
"?R6_elimination_targets": "What code should be removed?"

Validation: Duplication detection, equivalence tests, dead code analysis

Quality Holes

"?R7_test_strategy": "How to validate equivalence?"
"?R8_migration_path": "How to safely transition?"
"?R9_rollback_mechanism": "How to undo if needed?"

Validation: Test coverage metrics, migration dry-runs, rollback tests

See HOLE_TYPES.md for complete catalog.

Constraint Propagation Rules

Rule 1: Interface Resolution → Type Constraints

When: Interface hole resolved with concrete types
Then: Propagate type requirements to all consumers

Example:
  Resolve R6: NodeInterface = BaseNode with async run()
  Propagates to:
    → R4: Parallel execution must handle async
    → R5: Error recovery must handle async exceptions

Rule 2: Implementation → Performance Constraints

When: Implementation resolved with resource usage
Then: Propagate limits to dependent holes

Example:
  Resolve R4: Parallelization with max_concurrent=3
  Propagates to:
    → R8: Rate limit = provider_limit / 3
    → R7: Memory budget = 3 * single_operation_memory

Rule 3: Validation → Test Requirements

When: Validation resolved with test requirements
Then: Propagate data needs upstream

Example:
  Resolve R9: Testing needs 50 examples
  Propagates to:
    → R7: Metrics must support batch evaluation
    → R8: Test data collection strategy needed
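A minimal sketch of how a propagation step can be expressed in code: resolving a hole pushes derived constraints onto every dependent hole. The data structures are illustrative, not propagate.py's actual implementation.

```python
# Hedged sketch of constraint propagation; not the actual scripts/propagate.py.
def propagate(holes: dict[str, dict], resolved_id: str, derived: dict[str, str]) -> None:
    """Append constraints derived from a resolution to each hole that depends on it."""
    for hole_id, meta in holes.items():
        if resolved_id in meta.get("deps", []):
            constraint = derived.get(hole_id)
            if constraint:
                meta.setdefault("constraints", []).append(constraint)

# Usage following Rule 2 above (illustrative values):
# propagate(holes, "R4", {"R8": "rate limit = provider_limit / 3",
#                         "R7": "memory budget = 3 * single_operation_memory"})
```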

See CONSTRAINT_RULES.md for complete propagation rules.

Success Indicators

Weekly Progress

  • 2-4 holes resolved
  • All tests passing
  • Constraints satisfied
  • Measurable improvements

Red Flags (Stop & Reassess)

  • ❌ Characterization tests fail
  • ❌ Hole can't be resolved within constraints
  • ❌ Constraints contradict each other
  • ❌ No progress for 3+ days
  • ❌ Main branch accidentally modified

Validation Gates

| Gate | Criteria | Check |
|------|----------|-------|
| Gate 1: Discovery Complete | All holes cataloged, dependencies mapped | python scripts/check_discovery.py |
| Gate 2: Foundation Holes | Core interfaces resolved, tests pass | python scripts/check_foundation.py |
| Gate 3: Implementation | All refactor holes resolved, metrics improved | python scripts/check_implementation.py |
| Gate 4: Production Ready | Migration tested, rollback verified | python scripts/check_production.py |

Claude-Assisted Workflow

This skill is designed for effective Claude/LLM collaboration. Here's how to divide work:

Phase 0: Discovery

Claude's Role:
- Run discover_holes.py to analyze codebase
- Suggest holes based on code analysis
- Generate initial REFACTOR_IR.md structure
- Write characterization tests to capture current behavior
- Set up test infrastructure

Your Role:
- Confirm holes are well-scoped
- Prioritize which holes to tackle first
- Review and approve REFACTOR_IR.md
- Define critical constraints

Phase 1-N: Hole Resolution

Claude's Role:
- Write resolution tests (TDD) BEFORE implementation
- Implement hole resolution to make tests pass
- Run validation scripts: validate_resolution.py, check_foundation.py
- Update REFACTOR_IR.md with resolution details
- Propagate constraints: python scripts/propagate.py H{N}
- Generate commit messages documenting changes

Your Role:
- Make architecture decisions (which pattern, which approach)
- Assess risk and determine constraint priorities
- Review code changes for correctness
- Approve merge to main when complete

Phase Final: Reporting

Claude's Role:
- Generate comprehensive REFACTOR_REPORT.md
- Document all metrics deltas
- List all validation evidence
- Create migration guides
- Prepare PR description

Your Role:
- Final review of report accuracy
- Approve for production deployment
- Conduct post-refactor retrospective

Effective Prompting Patterns

Starting a session:

"I need to refactor [description]. Use typed-holes-refactor skill.
Start with discovery phase."

Resolving a hole:

"Resolve H3 (target_architecture). Write tests first, then implement.
Use [specific pattern/approach]."

Checking progress:

"Run check_completeness.py and show me the dashboard.
What's ready to work on next?"

Generating visualizations:

"Generate dependency graph showing bottlenecks and critical path.
Use visualize_graph.py with --analyze."

Claude's Limitations

Claude CANNOT:
- Make subjective architecture decisions (you must decide)
- Determine business-critical constraints (you must specify)
- Run tests that require external services (mock them, or run them yourself)
- Merge to main (you must approve and merge)

Claude CAN:
- Analyze code and suggest holes
- Write comprehensive test suites
- Implement resolutions within your constraints
- Generate reports and documentation
- Track progress across sessions (via beads + REFACTOR_IR.md)

Multi-Session Continuity

At session start:

"Continue typed-holes refactoring. Import beads state and
show current status from REFACTOR_IR.md."

Claude will:
- Read REFACTOR_IR.md to understand current state
- Check which holes are resolved
- Identify next ready holes
- Resume where previous session left off

You should:
- Keep REFACTOR_IR.md and .beads/ committed to git
- Export beads state at session end: bd export -o .beads/issues.jsonl
- Use /context before starting to ensure Claude has full context

Beads Integration

Why beads + typed holes?
- Beads tracks issues across sessions (prevents lost work)
- Holes track refactoring-specific state (dependencies, constraints)
- Together: Complete continuity for long-running refactors

Setup

# Install beads (once)
go install github.com/steveyegge/beads/cmd/bd@latest

# After running discover_holes.py
python scripts/holes_to_beads.py

# Check what's ready
bd ready --json

Workflow Integration

During hole resolution:

# Start work on a hole
bd update bd-5 --status in_progress --json

# Implement resolution
# ... write tests, implement code ...

# Validate resolution
python scripts/validate_resolution.py H3

# Close bead
bd close bd-5 --reason "Resolved H3: target_architecture" --json

# Export state
bd export -o .beads/issues.jsonl
git add .beads/issues.jsonl REFACTOR_IR.md
git commit -m "Resolve H3: Define target architecture"

Syncing holes ↔ beads:

# After updating REFACTOR_IR.md manually
python scripts/holes_to_beads.py  # Sync changes to beads

# After resolving holes
python scripts/holes_to_beads.py  # Update bead statuses

Cross-session continuity:

# Session start
bd import -i .beads/issues.jsonl
bd ready --json  # Shows ready holes
python scripts/check_completeness.py  # Shows overall progress

# Session end
bd export -o .beads/issues.jsonl
git add .beads/issues.jsonl
git commit -m "Session checkpoint: 3 holes resolved"

Bead advantages:
- Tracks work across days/weeks
- Shows dependency graph: bd deps bd-5
- Prevents context loss
- Integrates with overall project management

Scripts Reference

All scripts are in scripts/:

  • discover_holes.py - Analyze codebase and generate REFACTOR_IR.md
  • next_hole.py - Show next resolvable holes based on dependencies
  • validate_resolution.py - Check if hole resolution satisfies constraints
  • propagate.py - Update dependent holes after resolution
  • generate_report.py - Create comprehensive delta report
  • check_discovery.py - Validate Phase 0 completeness (Gate 1)
  • check_foundation.py - Validate Phase 1 completeness (Gate 2)
  • check_implementation.py - Validate Phase 2 completeness (Gate 3)
  • check_production.py - Validate Phase 3 readiness (Gate 4)
  • check_completeness.py - Overall progress dashboard
  • visualize_graph.py - Generate hole dependency visualization
  • holes_to_beads.py - Sync holes with beads issues

Run any script with --help for detailed usage.

Meta-Consistency

This skill uses its own principles:

| Typed Holes Principle | Application to Refactoring |
|-----------------------|----------------------------|
| Typed Holes | Architectural unknowns cataloged with types |
| Constraint Propagation | Design constraints flow through the dependency graph |
| Iterative Refinement | Hole-by-hole resolution cycles |
| Test-Driven Validation | Tests define correctness |
| Formal Completeness | Gates verify design completeness |

We use the system to refactor the system.

Advanced Topics

For complex scenarios, see the companion reference documents bundled with this skill, such as HOLE_TYPES.md and CONSTRAINT_RULES.md.

Quick Start Example

# 1. Setup
git checkout -b refactor/typed-holes-v1
python scripts/discover_holes.py

# 2. Write baseline tests
# Create tests/characterization/test_*.py

# 3. Resolve first hole
python scripts/next_hole.py  # Shows H1 is ready
# Write tests/refactor/test_h1_*.py (fails initially)
# Refactor code until tests pass
python scripts/validate_resolution.py H1
python scripts/propagate.py H1
git commit -m "Resolve H1: ..."

# 4. Repeat for each hole
# ...

# 5. Generate report
python scripts/generate_report.py > REFACTOR_REPORT.md

Troubleshooting

Characterization tests fail

Symptom: Tests that captured baseline behavior now fail

Resolution:
1. Review the changes: run git diff to see exactly what changed
2. Investigate: What behavior changed, and why?
3. Decision tree:
   - Intentional change: Update the baseline and document the reason:

     # Update baseline with a reason
     save_baseline("v2_api", new_behavior, reason="Switched to async implementation")

   - Unintentional regression: Fix the code; the characterization tests must pass

Prevention: Run characterization tests before AND after each hole resolution.
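One lightweight way to enforce that habit is a guard that refuses to proceed unless the characterization suite is green; a hedged sketch, assuming pytest and the tests/characterization/ layout above:

```python
# Hedged sketch: run the characterization suite as a guard before and after a resolution.
import subprocess
import sys

def characterization_green() -> bool:
    """True when the full characterization suite passes."""
    return subprocess.run(["pytest", "tests/characterization", "-q"]).returncode == 0

if __name__ == "__main__":
    if not characterization_green():
        sys.exit("Characterization tests are red - fix behavior before continuing.")
```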

Hole can't be resolved

Symptom: Stuck on a hole for >3 days, unclear how to proceed

Resolution:

1. Check dependencies: Are they actually resolved?

   python scripts/visualize_graph.py --analyze
   # Look for unresolved dependencies

2. Review constraints: Are they contradictory?
   - Example: C1 "preserve all behavior" + C5 "change API contract" → contradictory
   - Fix: Renegotiate constraints with stakeholders

3. Split the hole if it is too large:

   # Original: R4_consolidate_all (Epic, 10+ days)
   # Split into:
   #   R4a_consolidate_parsers (Medium, 2 days)
   #   R4b_consolidate_validators (Small, 1 day)
   #   R4c_consolidate_handlers (Medium, 2 days)

4. Check for circular dependencies:

   python scripts/visualize_graph.py
   # Look for cycles: R4 → R5 → R6 → R4

   Fix: Break the cycle by introducing an intermediate hole or redefining dependencies.

Escalation: If still stuck after 5 days, consider an alternative refactoring approach.

Contradictory Constraints

Symptom: Cannot satisfy all constraints simultaneously

Example:
- C1: "Preserve exact current behavior" (backward compatibility)
- C5: "Reduce response time by 50%" (performance improvement)
- Current behavior includes slow, synchronous operations

Resolution Framework:

  1. Identify the conflict:

     C1 requires: Keep synchronous operations
     C5 requires: Switch to async operations
     → Contradiction: Can't be both sync and async

  2. Negotiate priorities:

     | Option | C1 | C5 | Tradeoff |
     |--------|----|----|----------|
     | A: Keep sync | ✓ | ✗ | No performance gain |
     | B: Switch to async | ✗ | ✓ | Breaking change |
     | C: Add async, deprecate sync | ⚠️ | ✓ | Migration burden |

  3. Choose a resolution strategy:
     - Relax a constraint: Change C1 to "Preserve behavior where possible"
     - Add a migration period: Option C implemented over 2 releases
     - Split into phases: Phase 1 (C1), Phase 2 (C5)

  4. Document the decision:

     ```markdown
     ## Constraint Resolution: C1 vs C5

     Decision: Relax C1 to allow async migration
     Rationale: Performance critical for user experience
     Migration: 3-month deprecation period for sync API
     Approved by: [Stakeholder], [Date]
     ```

Circular Dependencies

Symptom: visualize_graph.py shows cycles

Example:

R4 (consolidate parsers) → depends on R6 (define interface)
R6 (define interface) → depends on R4 (needs parser examples)

Resolution strategies:

  1. Introduce an intermediate hole:

     H0_parser_analysis: Analyze existing parsers (no dependencies)
     R6_interface: Define interface using H0 analysis
     R4_consolidate: Implement using R6 interface

  2. Redefine dependencies:
     - Maybe R4 doesn't actually need R6
     - Or R6 only needs partial R4 (split R4)

  3. Accept iterative refinement:

     R6_interface_v1: Initial interface (simple)
     R4_consolidate: Implement with v1 interface
     R6_interface_v2: Refine based on R4 learnings

Prevention: Define architecture holes before implementation holes.
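Cycles can also be caught programmatically before they block work. A minimal depth-first-search sketch over the hole dependency map (the input shape is illustrative; visualize_graph.py may detect cycles differently):

```python
# Hedged sketch: detect a dependency cycle in the hole graph.
def find_cycle(deps: dict[str, list[str]]) -> list[str] | None:
    """Return one cycle as a list of hole IDs, or None if the graph is acyclic.
    Example: {"R4": ["R6"], "R6": ["R4"]} -> ["R4", "R6", "R4"]."""
    visiting, done = set(), set()

    def dfs(node, path):
        if node in visiting:                      # back-edge: we found a cycle
            return path[path.index(node):] + [node]
        if node in done:
            return None
        visiting.add(node)
        for dep in deps.get(node, []):
            cycle = dfs(dep, path + [node])
            if cycle:
                return cycle
        visiting.discard(node)
        done.add(node)
        return None

    for start in deps:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None
```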

No Progress for 3+ Days

Symptom: Feeling stuck, no commits, uncertain how to proceed

Resolution checklist:

  • [ ] Review REFACTOR_IR.md: Are holes well-defined (SMART criteria)?
    • If not: Rewrite holes to be more specific

  • [ ] Check hole size: Is the current hole estimated at >7 days?
    • If yes: Split it into smaller holes

  • [ ] Run the dashboard: python scripts/check_completeness.py
    • Are you working on a blocked hole? Switch to a ready hole instead

  • [ ] Visualize dependencies: python scripts/visualize_graph.py --analyze
    • Identify bottlenecks
    • Look for parallel work opportunities

  • [ ] Review constraints: Are they achievable?
    • Renegotiate if necessary

  • [ ] Seek external review:
    • Share REFACTOR_IR.md with a colleague
    • Get feedback on the approach

  • [ ] Consider an alternative: Maybe this refactor isn't feasible
    • Document why
    • Propose a different approach

Reset protocol: If still stuck, revert to the last working state and try a different approach.

Estimation Failures

Symptom: Hole taking 3x longer than estimated

Analysis:

1. Why did the estimate fail?
   - Underestimated complexity
   - Unforeseen dependencies
   - Unclear requirements
   - Technical issues (tool problems, infrastructure)

2. Immediate actions:
   - Update REFACTOR_IR.md with the revised estimate
   - If the revised estimate exceeds 7 days, split the hole
   - Update beads: bd update bd-5 --note "Revised estimate: 5 days (was 2)"

3. Future improvements:
   - Use actual times to calibrate future estimates
   - Add a buffer for discovery (20% overhead)
   - Note uncertainty in the IR: "Estimate: 2-4 days (high uncertainty)"

Learning: Track actual vs estimated time in REFACTOR_REPORT.md for future reference.
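A tiny calibration helper can turn that history into a multiplier for future estimates; purely illustrative, with the 20% discovery buffer from above baked in as a default:

```python
# Hedged sketch: calibrate future estimates from actual-vs-estimated history.
def calibration_factor(history: list[tuple[float, float]], buffer: float = 0.2) -> float:
    """history holds (estimated_days, actual_days) pairs; returns the average
    overrun multiplied by a discovery buffer (20% by default)."""
    overruns = [actual / estimated for estimated, actual in history if estimated > 0]
    if not overruns:
        return 1.0 + buffer
    return (sum(overruns) / len(overruns)) * (1.0 + buffer)

# Example: calibration_factor([(2, 5), (1, 1.5)]) ≈ 2.4, so a "2 day" hole ≈ 5 days.
```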


Begin with Phase 0: Discovery. Always work in a branch. Test first, refactor second.

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.