Use when you have a written implementation plan to execute in a separate session with review checkpoints.

Install this specific skill from the multi-skill repository:

`npx skills add acardozzo/rx-suite --skill "rx-execute"`
# SKILL.md

name: rx-execute
description: >
  Executes rx improvement plans step by step with verification. Reads versioned plans from
  docs/rx-plans/{domain}/{dimension}/, implements each step, verifies acceptance criteria,
  then re-runs the rx skill to confirm score improvement. Auto-generates next version plan
  if target not reached. Use when the user says "execute rx plan", "implement improvements",
  "rx execute", "fix dimension", "improve score", or references a specific plan file.
## Prerequisites

None beyond git (used for worktrees).

Check all dependencies with `bash scripts/rx-deps.sh`, or run `bash scripts/rx-deps.sh --install` to install any that are missing.
# rx-execute -- Step-by-Step Plan Executor with Verification
Reads rx improvement plans generated by rx-plan, implements each step in order, verifies
acceptance criteria with fresh evidence, then re-runs the rx skill to confirm the score
improved. If the target score (97+ / A+) is not reached, auto-generates a v{N+1} plan
for remaining gaps.
Announce at start: "I'm using rx-execute to implement the {domain}/{dimension} v{N} plan -- {step_count} steps, targeting {current} -> {target}."
## Inputs
Accepts one of:
- A domain + dimension (e.g., arch-rx d02) to load the latest plan for that dimension
- A domain + dimension + version (e.g., arch-rx d02 v2) to load a specific version
- A file path (e.g., docs/rx-plans/arch-rx/d02-async/v1-2026-03-15-plan.md) to load a specific plan
- No argument -- show active plans from dashboard, ask which to execute
```
/rx-execute arch-rx d02                                           # Execute latest plan for arch-rx D2
/rx-execute arch-rx d02 v2                                        # Execute specific version
/rx-execute docs/rx-plans/arch-rx/d02-async/v1-2026-03-15-plan.md # Execute specific file
/rx-execute                                                       # Show active plans, ask which
```
## Phase 1: Load Plan

- Locate the plan file:
  - If domain + dimension given: find the highest `v{N}` in `docs/rx-plans/{domain}/{dimension-slug}/`
  - If a file path is given: read that file directly
  - If no args: read `docs/rx-plans/dashboard.md`, display the Active Plans table, and ask the user which plan to execute
- Parse the plan:
  - Extract: version, date, domain, dimension, current score, target score, gap
  - Extract: all steps with their details (what, where, how, acceptance criteria, effort, dependencies)
  - Extract: gap analysis table (sub-metrics and their individual gaps)
  - Extract: framework references (POSA, EIP, OWASP, etc.)
- Create a TodoWrite task list with all steps as individual tasks:
  - Format: `Step {N}: {action} -> +{points} on M{X}.{Y} [{effort}]`
  - Mark all as `todo` initially
- Validate dependencies: ensure no circular dependencies exist between steps
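The latest-version lookup above can be sketched as a small shell helper. The directory layout and the `v{N}-{date}-plan.md` filename pattern come from the examples in this document; the function name is illustrative:

```shell
#!/usr/bin/env bash
# latest_plan DIR -- print the plan file with the highest v{N} prefix in DIR.
# Version sort (-V) ensures v10 ranks after v9, unlike plain lexical sort.
latest_plan() {
  local dir="$1"
  ls "$dir"/v*-plan.md 2>/dev/null | sort -V | tail -n 1
}
```

For example, given `v1-2026-03-15-plan.md` and `v2-2026-04-20-plan.md` in the dimension directory, the helper returns the v2 file.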
## Phase 2: Setup

- Worktree decision:
  - If step count > 3: create a git worktree for isolation (`git worktree add`)
  - If step count <= 3: work in the current branch
  - Name the worktree branch: `rx/{domain}/{dimension-slug}/v{N}`
- Baseline measurement:
  - Run the dimension's discovery script to capture current evidence: `bash scripts/discover.sh {target} {dimension-code}` (e.g., `bash scripts/discover.sh src d02` for Async & Event Architecture)
  - Save the baseline output for later comparison
- Announce execution start:

```
Executing rx plan: {domain}/{dimension} v{N}
Steps: {step_count} | Current: {score} ({grade}) | Target: 97+ (A+)
Baseline evidence captured.
```
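Under the naming convention above, the setup decision might look like this sketch. Only the branch-name format and the step-count threshold come from this document; the helper names and worktree path are illustrative:

```shell
#!/usr/bin/env bash
# rx_branch DOMAIN DIMENSION_SLUG VERSION -- build the worktree branch name
# following the rx/{domain}/{dimension-slug}/v{N} convention.
rx_branch() {
  printf 'rx/%s/%s/v%s\n' "$1" "$2" "$3"
}

# setup_workspace STEP_COUNT BRANCH -- isolate larger plans in a worktree.
setup_workspace() {
  local steps="$1" branch="$2"
  if [ "$steps" -gt 3 ]; then
    git worktree add "../$branch" -b "$branch"   # > 3 steps: isolated worktree
  else
    echo "working in current branch"             # small plan: stay in place
  fi
}
```

For example, `rx_branch arch-rx d02-async 2` yields `rx/arch-rx/d02-async/v2`.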
## Phase 3: Execute Steps (one at a time, in order)

For each step in the plan:

### 3a. Prepare

- Mark the step as `in_progress` in TodoWrite
- Read the step's full details from the plan:
  - What: the concrete action to take
  - Where: specific file paths to modify
  - How: code example or configuration snippet
  - Why: framework reference justifying the change
  - Acceptance Criteria: measurable conditions that must be true after implementation
  - Depends on: prerequisite steps (must be completed first)
- Check that all dependencies are marked `completed`
- If a dependency is not completed, STOP and report the blocker
### 3b. Implement

- Dispatch the implementer subagent (see `implementer-prompt.md`) with:
  - The step's full details
  - The framework reference for quality guidance
  - The specific files to modify
  - TDD instruction: if the step adds new functionality, write the test FIRST
- The implementer:
  - Reads the target files
  - Implements the change following the step's instructions
  - Runs any relevant tests
  - Self-reviews against the acceptance criteria
### 3c. Verify (Verification Gate)

- Dispatch the verifier subagent (see `verifier-prompt.md`) with:
  - The step's acceptance criteria (every single one)
  - The files that were modified
  - The step's context (what was supposed to change)
- The verifier checks EACH criterion independently:
  - Code existence checks: grep/read the file, confirm the pattern exists
  - Test checks: run the test, confirm it passes
  - Config checks: verify the configuration value is set
  - Measurable checks: run the measurement, compare to the threshold
- Verification rules (from superpowers):
  - NO completion claims without FRESH verification evidence
  - Run the verification command, read the output, THEN claim
  - "Should work" is NOT verification -- evidence is required
  - Every acceptance criterion checked = evidence cited in the output
- If ALL criteria pass: mark the step as `completed` with an evidence summary
- If ANY criterion fails:
  - Report which criteria failed and why
  - Attempt a fix (one retry)
  - Re-verify after the fix
  - If still failing: STOP and ask the user for guidance
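A code-existence check from this gate can be sketched as follows. The point is the verification rule: the PASS claim cites the fresh grep output as evidence. The helper name is illustrative, and the file/pattern in the usage note are hypothetical examples:

```shell
#!/usr/bin/env bash
# verify_pattern FILE PATTERN -- pass only with fresh evidence that PATTERN
# exists in FILE; the matching line (with line number) is cited as evidence.
verify_pattern() {
  local file="$1" pattern="$2" evidence
  evidence=$(grep -n "$pattern" "$file" 2>/dev/null) || {
    echo "FAIL: '$pattern' not found in $file"
    return 1
  }
  echo "PASS: $file -- $evidence"
}
```

Usage: `verify_pattern src/lib/cache.ts 'ttl'` either prints `PASS` with the matching line as evidence, or `FAIL` and a non-zero exit so the retry path triggers.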
### 3d. Record

- Log the step completion:

```
Step {N}/{total}: {action}
Status: COMPLETED
Evidence:
- [criterion 1]: verified -- {evidence}
- [criterion 2]: verified -- {evidence}
Files modified: {list}
```
## Phase 4: Batch Review

After every 3 completed steps (or when all steps are done):

- Summary of implemented changes:
  - List each completed step with its action summary
  - Show which files were modified
- Verification evidence digest:
  - For each step, show the acceptance criteria results
  - Highlight any criteria that required a retry
- Progress update: `Progress: {completed}/{total} steps complete | Estimated score impact so far: +{N} points`
- Checkpoint prompt:
  - "Steps 1-3 complete with all acceptance criteria verified. Ready for feedback, or should I continue with steps 4-6?"
  - Wait for the user's response before continuing (unless the user previously said "continue all")
## Phase 5: Re-Evaluate

After ALL steps are completed:

- Run dimension discovery again: `bash scripts/discover.sh {target} {dimension-code}`
- Dispatch the re-evaluator subagent (see `re-evaluator-prompt.md`) with:
  - The baseline evidence (from Phase 2)
  - The new evidence (from this run)
  - The grading framework thresholds for this dimension
- Compare scores:
  - Before: {baseline score} ({baseline grade})
  - After: {new score} ({new grade})
  - Delta: +{improvement} points
  - Per sub-metric: show each M{X}.{Y} before/after
- Decision based on the new score:

**If score >= 97 (A+):**

- Mark the plan as COMPLETED in `docs/rx-plans/{domain}/{dimension-slug}/`
- Update `docs/rx-plans/{domain}/summary.md`: set the dimension status to "Complete"
- Update `docs/rx-plans/dashboard.md`: move the dimension to the "Completed Dimensions" table
- Announce:

```
{Dimension} is now A+! ({before} -> {after})
Plan v{N} completed successfully.
Dashboard updated.
```
**If score < 97 but improved:**

- Auto-generate a v{N+1} plan by invoking the rx-plan skill for this dimension
- Update `docs/rx-plans/{domain}/summary.md` with the new score and progress
- Update `docs/rx-plans/dashboard.md` with the updated state
- Announce:

```
{Dimension} improved from {before} to {after} (+{delta}).
Not yet A+ -- v{N+1} plan generated for remaining gaps.
Run `/rx-execute {domain} {dimension}` to continue.
```
**If score did not improve:**

- Report the issue -- the steps may not have addressed the right gaps
- Do NOT auto-generate a new plan
- Ask the user for guidance:

```
{Dimension} score unchanged at {score} despite completing {N} steps.
Possible causes: steps targeted the wrong sub-metrics, or the discovery script
cannot detect the changes. Manual review recommended.
```
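The three-way decision above reduces to a small branch on the re-evaluated score. The 97 threshold is the A+ target from this document; the function name and outcome labels are illustrative:

```shell
#!/usr/bin/env bash
# rx_decision BEFORE AFTER -- classify the re-evaluation outcome.
rx_decision() {
  local before="$1" after="$2"
  if [ "$after" -ge 97 ]; then
    echo "complete"        # A+ reached: mark plan COMPLETED, update dashboard
  elif [ "$after" -gt "$before" ]; then
    echo "next-version"    # improved but below target: generate v{N+1} plan
  else
    echo "stalled"         # no improvement: ask the user, do NOT auto-plan
  fi
}
```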
## Phase 6: Cleanup

- Git commit:
  - Stage all modified files
  - Commit with the message: `fix({domain}): improve {dimension} from {before} to {after} [rx-execute v{N}]`
- If a worktree was used, present merge options to the user:
  - Merge the worktree branch into the main branch
  - Keep the worktree open for further work
  - Discard the worktree (if the score did not improve)
- Final summary:

```
rx-execute complete
Plan: {domain}/{dimension} v{N}
Steps: {completed}/{total}
Score: {before} ({grade}) -> {after} ({grade})
Commit: {commit hash}
Next: {action -- either "dimension complete" or "run v{N+1}"}
```
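The commit step above can be sketched as a small helper. The message format is the convention stated in this document; the helper name and the example values in the usage comment are illustrative:

```shell
#!/usr/bin/env bash
# rx_commit_msg DOMAIN DIMENSION BEFORE AFTER VERSION -- compose the
# conventional commit message for an rx-execute run.
rx_commit_msg() {
  printf 'fix(%s): improve %s from %s to %s [rx-execute v%s]\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Usage in Phase 6 (example values):
#   git add -A
#   git commit -m "$(rx_commit_msg arch-rx d02 82 97 1)"
```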
## Rules

- **Never skip acceptance criteria verification.** Every criterion in every step must be checked with fresh evidence before the step is marked complete.
- **Evidence before claims.** Adapted from the superpowers verification skill: run the verification command, read its output, then and only then claim the criterion is met. "Should work" and "I believe this is correct" are not acceptable.
- **Steps are executed IN ORDER.** Dependencies between steps matter. Never execute step N before all its declared dependencies are completed.
- **If a step fails or is unclear, STOP and ask.** Do not guess at what a step means. Do not skip a step. Do not implement a different interpretation. Ask the user.
- **After all steps: ALWAYS re-run the rx skill to verify.** The plan is not complete until the dimension's discovery script confirms the score improved. This is non-negotiable.
- **Auto-generate v{N+1} if the target is not reached.** When the score improves but stays below 97, invoke rx-plan to create the next version plan targeting the remaining gaps.
- **Commit messages follow the convention.** Format: `fix({domain}): description [rx-execute v{N}]`. Always include the domain, dimension context, score delta, and plan version.
- **Update dashboard.md after every execution.** The dashboard must reflect the current state after plan execution, whether the dimension reached A+ or not.
- **Each step modifies specific files -- verify THOSE files changed.** If a step says to modify `src/lib/cache.ts`, verify that file was actually modified and contains the expected changes. Do not verify different files.
- **The plan's framework reference guides implementation quality.** If a step references EIP Competing Consumers, the implementation must follow that pattern correctly -- not just any queue implementation.
- **Use TDD when the step involves adding functionality.** Write the test first, see it fail, implement the change, see it pass. This applies to steps that add new code, not steps that only modify configuration.
- **Use git worktree for isolation when step count > 3.** Larger plans benefit from branch isolation, allowing easy rollback if the plan does not improve the score.
- **Batch reviews every 3 steps.** Do not run all steps without pausing for user feedback. Exception: if the user explicitly says "continue all" or "run to completion", skip the batch review pauses.
- **One plan at a time.** Do not execute multiple plans in parallel. Complete one plan (including re-evaluation) before starting another.
- **Preserve plan files.** Never modify the plan file itself. Plans are immutable records. Mark completion status in the summary and dashboard files, not in the plan.
## Execution Philosophy

Don't stop until every single task is Done.

Knock out each task one by one -- close every gap in the PLAN until everything is Done and ready to ship (100% tested and green). No tech debt, no stubs, and no mocks in production code -- none of that will fly with our code reviewers. We cannot go to production with bugs or errors, so deal in facts, not guesses.
Plan Status Flow:

- When you start working on a PLAN, rename its prefix from BACKLOG to IN_PROGRESS -- that's the signal that work is underway.
- Once every single task in the PLAN is implemented, tested, and all green, rename the prefix from IN_PROGRESS to DONE. Not before. All green means all done.
TDD -- No Exceptions: Follow a strict Test-Driven Development approach. Write the test first, watch it fail, then make it pass. Every test must have strong assertions -- no lazy `toBeTruthy()` or `expect(result).toBeDefined()`. Assert on exact values, exact shapes, exact edge cases. If your tests wouldn't catch a real bug, they're useless.
Use the Right Skill for the Job:

- UI work, frontend refinements, component styling, layout fixes -- use the Frontend skill. No exceptions.
- Database, backend, APIs, infra, edge functions, migrations -- use Superpowers. Leverage every tool at your disposal.
- Don't guess which skill to use -- read the task, pick the right one, and go.
Parallel Agents -- Move Fast:

Dispatch as many parallel agents as possible. If tasks are independent, run them at the same time -- don't sit there doing one thing at a time like it's 1995. Speed matters: the more you parallelize, the faster we ship. If tasks have dependencies between them, run those sequentially.
E2E Tests -- Use the Browser:

When writing or testing E2E tests, use Chrome and/or the Playwright MCP. Don't fake browser interactions -- actually open the browser, click through flows, and validate real behavior. If you're writing E2E tests without a real browser, you're doing it wrong.
## Integration

- rx-plan -- Creates the plans this skill executes. Called automatically for v{N+1} generation when the score is < 97.
- rx-dashboard -- Updated after execution to reflect the new state. Shows active plans and completed dimensions.
- superpowers:test-driven-development -- Used for implementation steps that add new functionality. Write the test first, then implement.
- superpowers:verification-before-completion -- Verification gate applied to every step's acceptance criteria. No claims without evidence.
- superpowers:using-git-worktrees -- Workspace isolation for plans with > 3 steps. Branch naming: `rx/{domain}/{dimension-slug}/v{N}`.
# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents. Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.