Use when you have a written implementation plan to execute in a separate session with review checkpoints.

Install this specific skill from the multi-skill repository:

`npx skills add acardozzo/rx-suite --skill "rx-execute"`
# SKILL.md

name: rx-execute
description: >
  Executes rx improvement plans step by step with verification. Reads versioned plans from
  docs/rx-plans/{domain}/{dimension}/, implements each step, verifies acceptance criteria,
  then re-runs the rx skill to confirm score improvement. Auto-generates next version plan
  if target not reached. Use when the user says "execute rx plan", "implement improvements",
  "rx execute", "fix dimension", "improve score", or references a specific plan file.
## Prerequisites

None beyond git (used for worktrees).

Check all dependencies with `bash scripts/rx-deps.sh`, or run `bash scripts/rx-deps.sh --install` to install any that are missing.
# rx-execute -- Step-by-Step Plan Executor with Verification
Reads rx improvement plans generated by rx-plan, implements each step in order, verifies
acceptance criteria with fresh evidence, then re-runs the rx skill to confirm the score
improved. If the target score (97+ / A+) is not reached, auto-generates a v{N+1} plan
for remaining gaps.
Announce at start: "I'm using rx-execute to implement the {domain}/{dimension} v{N} plan -- {step_count} steps, targeting {current} -> {target}."
## Inputs
Accepts one of:
- A domain + dimension (e.g., arch-rx d02) to load the latest plan for that dimension
- A domain + dimension + version (e.g., arch-rx d02 v2) to load a specific version
- A file path (e.g., docs/rx-plans/arch-rx/d02-async/v1-2026-03-15-plan.md) to load a specific plan
- No argument -- show active plans from dashboard, ask which to execute
```
/rx-execute arch-rx d02                                           # Execute latest plan for arch-rx D2
/rx-execute arch-rx d02 v2                                        # Execute specific version
/rx-execute docs/rx-plans/arch-rx/d02-async/v1-2026-03-15-plan.md # Execute specific file
/rx-execute                                                       # Show active plans, ask which
```
## Phase 1: Load Plan

- Locate the plan file:
  - If domain + dimension given: find the highest `v{N}` in `docs/rx-plans/{domain}/{dimension-slug}/`
  - If a file path is given: read that file directly
  - If no args: read `docs/rx-plans/dashboard.md`, display the Active Plans table, and ask the user which plan to execute
- Parse the plan:
  - Extract: version, date, domain, dimension, current score, target score, gap
  - Extract: all steps with their details (what, where, how, acceptance criteria, effort, dependencies)
  - Extract: gap analysis table (sub-metrics and their individual gaps)
  - Extract: framework references (POSA, EIP, OWASP, etc.)
- Create a TodoWrite task list with all steps as individual tasks:
  - Format: `Step {N}: {action} -> +{points} on M{X}.{Y} [{effort}]`
  - Mark all as `todo` initially
- Validate dependencies: ensure no circular dependencies exist between steps
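The latest-version lookup above can be sketched as a small shell helper. The directory layout and the `v{N}-{date}-plan.md` filename pattern come from the examples in this document; the function name is illustrative:

```shell
#!/usr/bin/env bash
# latest_plan DIR -- print the plan file with the highest v{N} prefix in DIR.
# Version sort (-V) ensures v10 ranks after v9, unlike plain lexical sort.
latest_plan() {
  local dir="$1"
  ls "$dir"/v*-plan.md 2>/dev/null | sort -V | tail -n 1
}
```

For example, given `v1-2026-03-15-plan.md` and `v2-2026-04-20-plan.md` in the dimension directory, the helper returns the v2 file.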
## Phase 2: Setup

- Worktree decision:
  - If step count > 3: create a git worktree for isolation (`git worktree add`)
  - If step count <= 3: work in the current branch
  - Name the worktree branch: `rx/{domain}/{dimension-slug}/v{N}`
- Baseline measurement:
  - Run the dimension's discovery script to capture current evidence: `bash scripts/discover.sh {target} {dimension-code}` (e.g., `bash scripts/discover.sh src d02` for Async & Event Architecture)
  - Save the baseline output for later comparison
- Announce execution start:

```
Executing rx plan: {domain}/{dimension} v{N}
Steps: {step_count} | Current: {score} ({grade}) | Target: 97+ (A+)
Baseline evidence captured.
```
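Under the naming convention above, the setup decision might look like this sketch. Only the branch-name format and the step-count threshold come from this document; the helper names and worktree path are illustrative:

```shell
#!/usr/bin/env bash
# rx_branch DOMAIN DIMENSION_SLUG VERSION -- build the worktree branch name
# following the rx/{domain}/{dimension-slug}/v{N} convention.
rx_branch() {
  printf 'rx/%s/%s/v%s\n' "$1" "$2" "$3"
}

# setup_workspace STEP_COUNT BRANCH -- isolate larger plans in a worktree.
setup_workspace() {
  local steps="$1" branch="$2"
  if [ "$steps" -gt 3 ]; then
    git worktree add "../$branch" -b "$branch"   # > 3 steps: isolated worktree
  else
    echo "working in current branch"             # small plan: stay in place
  fi
}
```

For example, `rx_branch arch-rx d02-async 2` yields `rx/arch-rx/d02-async/v2`.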
## Phase 3: Execute Steps (one at a time, in order)

For each step in the plan:

### 3a. Prepare

- Mark the step as `in_progress` in TodoWrite
- Read the step's full details from the plan:
  - What: the concrete action to take
  - Where: specific file paths to modify
  - How: code example or configuration snippet
  - Why: framework reference justifying the change
  - Acceptance Criteria: measurable conditions that must be true after implementation
  - Depends on: prerequisite steps (must be completed first)
- Check that all dependencies are marked `completed`
- If a dependency is not completed, STOP and report the blocker
### 3b. Implement

- Dispatch the implementer subagent (see `implementer-prompt.md`) with:
  - The step's full details
  - The framework reference for quality guidance
  - The specific files to modify
  - TDD instruction: if the step adds new functionality, write the test FIRST
- The implementer:
  - Reads the target files
  - Implements the change following the step's instructions
  - Runs any relevant tests
  - Self-reviews against the acceptance criteria
### 3c. Verify (Verification Gate)

- Dispatch the verifier subagent (see `verifier-prompt.md`) with:
  - The step's acceptance criteria (every single one)
  - The files that were modified
  - The step's context (what was supposed to change)
- The verifier checks EACH criterion independently:
  - Code existence checks: grep/read the file, confirm the pattern exists
  - Test checks: run the test, confirm it passes
  - Config checks: verify the configuration value is set
  - Measurable checks: run the measurement, compare to the threshold
- Verification rules (from superpowers):
  - NO completion claims without FRESH verification evidence
  - Run the verification command, read the output, THEN claim
  - "Should work" is NOT verification -- evidence is required
  - Every acceptance criterion checked = evidence cited in the output
- If ALL criteria pass: mark the step as `completed` with an evidence summary
- If ANY criterion fails:
  - Report which criteria failed and why
  - Attempt a fix (one retry)
  - Re-verify after the fix
  - If still failing: STOP and ask the user for guidance
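A code-existence check from this gate can be sketched as follows. The point is the verification rule: the PASS claim cites the fresh grep output as evidence. The helper name is illustrative, and the file/pattern in the usage note are hypothetical examples:

```shell
#!/usr/bin/env bash
# verify_pattern FILE PATTERN -- pass only with fresh evidence that PATTERN
# exists in FILE; the matching line (with line number) is cited as evidence.
verify_pattern() {
  local file="$1" pattern="$2" evidence
  evidence=$(grep -n "$pattern" "$file" 2>/dev/null) || {
    echo "FAIL: '$pattern' not found in $file"
    return 1
  }
  echo "PASS: $file -- $evidence"
}
```

Usage: `verify_pattern src/lib/cache.ts 'ttl'` either prints `PASS` with the matching line as evidence, or `FAIL` and a non-zero exit so the retry path triggers.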
### 3d. Record

- Log the step completion:

```
Step {N}/{total}: {action}
Status: COMPLETED
Evidence:
- [criterion 1]: verified -- {evidence}
- [criterion 2]: verified -- {evidence}
Files modified: {list}
```
## Phase 4: Batch Review

After every 3 completed steps (or when all steps are done):

- Summary of implemented changes:
  - List each completed step with its action summary
  - Show which files were modified
- Verification evidence digest:
  - For each step, show the acceptance criteria results
  - Highlight any criteria that required a retry
- Progress update: `Progress: {completed}/{total} steps complete | Estimated score impact so far: +{N} points`
- Checkpoint prompt:
  - "Steps 1-3 complete with all acceptance criteria verified. Ready for feedback, or should I continue with steps 4-6?"
  - Wait for the user's response before continuing (unless the user previously said "continue all")
## Phase 5: Re-Evaluate

After ALL steps are completed:

- Run dimension discovery again: `bash scripts/discover.sh {target} {dimension-code}`
- Dispatch the re-evaluator subagent (see `re-evaluator-prompt.md`) with:
  - The baseline evidence (from Phase 2)
  - The new evidence (from this run)
  - The grading framework thresholds for this dimension
- Compare scores:
  - Before: {baseline score} ({baseline grade})
  - After: {new score} ({new grade})
  - Delta: +{improvement} points
  - Per sub-metric: show each M{X}.{Y} before/after
- Decision based on the new score:

**If score >= 97 (A+):**

- Mark the plan as COMPLETED in `docs/rx-plans/{domain}/{dimension-slug}/`
- Update `docs/rx-plans/{domain}/summary.md`: set the dimension status to "Complete"
- Update `docs/rx-plans/dashboard.md`: move the dimension to the "Completed Dimensions" table
- Announce:

```
{Dimension} is now A+! ({before} -> {after})
Plan v{N} completed successfully.
Dashboard updated.
```
**If score < 97 but improved:**

- Auto-generate a v{N+1} plan by invoking the rx-plan skill for this dimension
- Update `docs/rx-plans/{domain}/summary.md` with the new score and progress
- Update `docs/rx-plans/dashboard.md` with the updated state
- Announce:

```
{Dimension} improved from {before} to {after} (+{delta}).
Not yet A+ -- v{N+1} plan generated for remaining gaps.
Run `/rx-execute {domain} {dimension}` to continue.
```
**If score did not improve:**

- Report the issue -- the steps may not have addressed the right gaps
- Do NOT auto-generate a new plan
- Ask the user for guidance:

```
{Dimension} score unchanged at {score} despite completing {N} steps.
Possible causes: steps targeted the wrong sub-metrics, or the discovery script
cannot detect the changes. Manual review recommended.
```
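The three-way decision above reduces to a small branch on the re-evaluated score. The 97 threshold is the A+ target from this document; the function name and outcome labels are illustrative:

```shell
#!/usr/bin/env bash
# rx_decision BEFORE AFTER -- classify the re-evaluation outcome.
rx_decision() {
  local before="$1" after="$2"
  if [ "$after" -ge 97 ]; then
    echo "complete"        # A+ reached: mark plan COMPLETED, update dashboard
  elif [ "$after" -gt "$before" ]; then
    echo "next-version"    # improved but below target: generate v{N+1} plan
  else
    echo "stalled"         # no improvement: ask the user, do NOT auto-plan
  fi
}
```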
## Phase 6: Cleanup

- Git commit:
  - Stage all modified files
  - Commit with the message: `fix({domain}): improve {dimension} from {before} to {after} [rx-execute v{N}]`
- If a worktree was used, present merge options to the user:
  - Merge the worktree branch into the main branch
  - Keep the worktree open for further work
  - Discard the worktree (if the score did not improve)
- Final summary:

```
rx-execute complete
Plan: {domain}/{dimension} v{N}
Steps: {completed}/{total}
Score: {before} ({grade}) -> {after} ({grade})
Commit: {commit hash}
Next: {action -- either "dimension complete" or "run v{N+1}"}
```
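The commit step above can be sketched as a small helper. The message format is the convention stated in this document; the helper name and the example values in the usage comment are illustrative:

```shell
#!/usr/bin/env bash
# rx_commit_msg DOMAIN DIMENSION BEFORE AFTER VERSION -- compose the
# conventional commit message for an rx-execute run.
rx_commit_msg() {
  printf 'fix(%s): improve %s from %s to %s [rx-execute v%s]\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Usage in Phase 6 (example values):
#   git add -A
#   git commit -m "$(rx_commit_msg arch-rx d02 82 97 1)"
```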
## Rules

- **Never skip acceptance criteria verification.** Every criterion in every step must be checked with fresh evidence before the step is marked complete.
- **Evidence before claims.** Adapted from the superpowers verification skill: run the verification command, read its output, then and only then claim the criterion is met. "Should work" and "I believe this is correct" are not acceptable.
- **Steps are executed IN ORDER.** Dependencies between steps matter. Never execute step N before all its declared dependencies are completed.
- **If a step fails or is unclear, STOP and ask.** Do not guess at what a step means. Do not skip a step. Do not implement a different interpretation. Ask the user.
- **After all steps: ALWAYS re-run the rx skill to verify.** The plan is not complete until the dimension's discovery script confirms the score improved. This is non-negotiable.
- **Auto-generate v{N+1} if the target is not reached.** When the score improves but stays below 97, invoke rx-plan to create the next version plan targeting the remaining gaps.
- **Commit messages follow the convention.** Format: `fix({domain}): description [rx-execute v{N}]`. Always include the domain, dimension context, score delta, and plan version.
- **Update dashboard.md after every execution.** The dashboard must reflect the current state after plan execution, whether the dimension reached A+ or not.
- **Each step modifies specific files -- verify THOSE files changed.** If a step says to modify `src/lib/cache.ts`, verify that file was actually modified and contains the expected changes. Do not verify different files.
- **The plan's framework reference guides implementation quality.** If a step references EIP Competing Consumers, the implementation must follow that pattern correctly -- not just any queue implementation.
- **Use TDD when the step involves adding functionality.** Write the test first, see it fail, implement the change, see it pass. This applies to steps that add new code, not steps that only modify configuration.
- **Use git worktree for isolation when step count > 3.** Larger plans benefit from branch isolation, allowing easy rollback if the plan does not improve the score.
- **Batch reviews every 3 steps.** Do not run all steps without pausing for user feedback. Exception: if the user explicitly says "continue all" or "run to completion", skip the batch review pauses.
- **One plan at a time.** Do not execute multiple plans in parallel. Complete one plan (including re-evaluation) before starting another.
- **Preserve plan files.** Never modify the plan file itself. Plans are immutable records. Mark completion status in the summary and dashboard files, not in the plan.
## Execution Philosophy

Don't stop until every single task is Done.

Knock out each task one by one -- close every gap in the PLAN until everything is Done and ready to ship (100% tested and green). No tech debt, no stubs, and no mocks in production code -- none of that will fly with our code reviewers. We cannot go to production with bugs or errors, so deal in facts, not guesses.
Plan Status Flow:

- When you start working on a PLAN, rename its prefix from BACKLOG to IN_PROGRESS -- that's the signal that work is underway.
- Once every single task in the PLAN is implemented, tested, and all green, rename the prefix from IN_PROGRESS to DONE. Not before. All green means all done.
TDD -- No Exceptions: Follow a strict Test-Driven Development approach. Write the test first, watch it fail, then make it pass. Every test must have strong assertions -- no lazy `toBeTruthy()` or `expect(result).toBeDefined()`. Assert on exact values, exact shapes, exact edge cases. If your tests wouldn't catch a real bug, they're useless.
Use the Right Skill for the Job:

- UI work, frontend refinements, component styling, layout fixes -- use the Frontend skill. No exceptions.
- Database, backend, APIs, infra, edge functions, migrations -- use Superpowers. Leverage every tool at your disposal.
- Don't guess which skill to use -- read the task, pick the right one, and go.
Parallel Agents -- Move Fast:

Dispatch as many parallel agents as possible. If tasks are independent, run them at the same time -- don't sit there doing one thing at a time like it's 1995. Speed matters: the more you parallelize, the faster we ship. If tasks have dependencies between them, run those sequentially.
E2E Tests -- Use the Browser:

When writing or testing E2E tests, use Chrome and/or the Playwright MCP. Don't fake browser interactions -- actually open the browser, click through flows, and validate real behavior. If you're writing E2E tests without a real browser, you're doing it wrong.
## Integration

- rx-plan -- Creates the plans this skill executes. Called automatically for v{N+1} generation when the score is < 97.
- rx-dashboard -- Updated after execution to reflect the new state. Shows active plans and completed dimensions.
- superpowers:test-driven-development -- Used for implementation steps that add new functionality. Write the test first, then implement.
- superpowers:verification-before-completion -- Verification gate applied to every step's acceptance criteria. No claims without evidence.
- superpowers:using-git-worktrees -- Workspace isolation for plans with > 3 steps. Branch naming: `rx/{domain}/{dimension-slug}/v{N}`.
# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents. Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.