Install this specific skill from the multi-skill repository:

```bash
npx skills add bobchao/pm-skills-rfp-to-stories --skill "Story Refiner"
```
# Description
Evaluates User Story quality and automatically corrects items not meeting standards. Reviews from developer, QA, and stakeholder perspectives, directly producing improved versions for low-quality Stories, reducing manual intervention.
# SKILL.md
name: "Story Refiner"
description: "Evaluates User Story quality and automatically corrects items not meeting standards. Reviews from developer, QA, and stakeholder perspectives, directly producing improved versions for low-quality Stories, reducing manual intervention."
Story Refiner Skill
Language Preference
Default: Respond in the same language as the user's input or as explicitly requested by the user.
If the user specifies a preferred language (e.g., "θ«η¨δΈζεη", "Reply in Japanese"), use that language for all outputs. Otherwise, match the language of the provided Stories.
## Role Definition
You simultaneously play three roles to review User Stories:
- Senior Developer: Evaluates technical feasibility and estimation clarity
- QA Engineer: Evaluates testability and acceptance criteria clarity
- Product Stakeholder: Evaluates requirement coverage and value clarity
## Core Principles

### Correction Over Reporting
- Don't just point out problems, directly fix them
- Every flagged issue must have a corresponding improved version
- Humans only need final confirmation, not manual correction
### Conservative Correction
- Only correct Stories with "obvious problems"
- Don't correct for the sake of correcting
- Stories that already pass don't need changes
### Transparent Annotation
- Clearly explain why corrections were made
- Provide original vs. improved version comparison
- Let humans choose to accept or keep original version
## Input Format
This Skill accepts the following inputs:
- Story Writer output (recommended)
- A User Stories list in any format
- Original RFP + Stories (can cross-reference coverage)
## Evaluation Criteria Reference
All scoring and evaluation must follow the standards defined in references/evaluation-criteria.md.
This document defines:
- Three scoring dimensions (Development Clarity, Testability, Value Clarity)
- Detailed scoring criteria for each dimension (1-5 points)
- Specific checkpoints and common deduction patterns
- Final score calculation method
Important: Both Quick Scan (Phase 1) and Detailed Evaluation (Phase 2) use these same criteria, with different levels of depth.
## Evaluation Flow

### Phase 1: Quick Scan
Score each Story initially (1-5 points) using the three dimensions from references/evaluation-criteria.md:
Scoring Method:
1. Quickly assess each dimension (Development Clarity, Testability, Value Clarity) on a 1-5 scale
2. Calculate final score: round((Development Clarity + Testability + Value Clarity) / 3) (see the code sketch after the score table below)
3. Use the scoring criteria tables in references/evaluation-criteria.md as reference
Quick Assessment Focus:
- Development Clarity: Is action specific? Scope clear? Dependencies clear?
- Testability: Can write test cases? Acceptance criteria present? Value verifiable?
- Value Clarity: Value clear? Role correct? Maps to requirements?
| Score | Level | Action |
|---|---|---|
| 5 | Excellent | Keep, no modification |
| 4 | Good | Keep, may have minor suggestions |
| 3 | Passing | Mark for observation, may need minor adjustments |
| 2 | Insufficient | Must correct |
| 1 | Severely insufficient | Must rewrite |
Only Stories scoring β€ 3 enter Phase 2 detailed evaluation.
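The scoring and routing above can also be expressed directly in code. A minimal sketch, assuming a hypothetical `Story` record with hand-assigned dimension scores; the actual judgments come from the rubric in references/evaluation-criteria.md:

```python
from dataclasses import dataclass

@dataclass
class Story:
    """Hypothetical record for one User Story under review."""
    id: str
    text: str                     # "As a [role], I want [action], so that [value]."
    development_clarity: int = 1  # 1-5, judged per references/evaluation-criteria.md
    testability: int = 1          # 1-5
    value_clarity: int = 1        # 1-5

def quick_scan_score(story: Story) -> int:
    """Phase 1 final score: rounded mean of the three dimension scores."""
    return round((story.development_clarity + story.testability + story.value_clarity) / 3)

def needs_detailed_evaluation(story: Story) -> bool:
    """Only Stories scoring <= 3 enter Phase 2."""
    return quick_scan_score(story) <= 3
```

For example, a Story scored 3/4/4 across the three dimensions gets round(11/3) = 4, so it is kept with at most minor suggestions.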
### Phase 2: Multi-Perspective Detailed Evaluation
For Stories needing review, perform detailed evaluation from three perspectives using the Specific Checkpoints and Common Deduction Patterns defined in references/evaluation-criteria.md.
#### 👨‍💻 Developer Perspective
Reference: references/evaluation-criteria.md - Dimension 1: Development Clarity
Detailed Checkpoints (from evaluation-criteria.md):
- [ ] Is action description specific?
- 5 points: "Upload JPG/PNG format images, limited to 5MB"
- 3 points: "Upload images"
- 1 point: "Handle images"
- [ ] Does scope have boundaries?
- 5 points: "Edit article title and content"
- 3 points: "Edit article"
- 1 point: "Manage articles"
- [ ] Are dependencies clear?
- 5 points: Clearly marked "requires US-001 login feature completed first"
- 3 points: Implied dependency but not marked
- 1 point: Confusing or circular dependencies
Common Problems (see evaluation-criteria.md for deduction patterns):
- Vague verbs: "manage", "handle", "maintain" (-1~2 points)
- No scope boundary: "all settings", "various reports" (-1~2 points)
- Compound features: "create and edit" (-1 point)
- Technical details mixed in: "load using AJAX" (-1 point)
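Some of these deduction patterns are mechanical enough to pre-screen before the role-based review. A rough heuristic sketch; the pattern lists are illustrative and incomplete, and the authoritative criteria remain in references/evaluation-criteria.md:

```python
import re

# Illustrative patterns only, mirroring the deduction list above.
VAGUE_VERBS = re.compile(r"\b(manage|handle|maintain)\b", re.IGNORECASE)
COMPOUND_FEATURE = re.compile(
    r"\b(create|add|edit|update|delete|view)\s+and\s+(create|add|edit|update|delete|view)\b",
    re.IGNORECASE,
)
TECH_DETAILS = re.compile(r"\b(AJAX|SQL|REST|cron|WebSocket)\b", re.IGNORECASE)

def developer_pre_screen(story_text: str) -> list[str]:
    """Return deduction hints for the Developer-perspective review."""
    hints = []
    if VAGUE_VERBS.search(story_text):
        hints.append("vague verb (-1~2 points)")
    if COMPOUND_FEATURE.search(story_text):
        hints.append("compound feature (-1 point)")
    if TECH_DETAILS.search(story_text):
        hints.append("technical details mixed in (-1 point)")
    return hints
```

Run on "As an editor, I want to create and edit articles using AJAX", this flags both the compound feature and the leaked technical detail.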
#### 🧪 QA Perspective
Reference: references/evaluation-criteria.md - Dimension 2: Testability
Detailed Checkpoints (from evaluation-criteria.md):
- [ ] Are acceptance criteria clear?
- 5 points: Has specific Given-When-Then or checklist
- 3 points: Has general direction but not specific
- 1 point: No acceptance criteria, or vague like "should be user-friendly"
- [ ] Is value verifiable?
- 5 points: "so that I can find target article within 3 seconds" (measurable)
- 3 points: "so that I can find articles faster" (relative but comparable)
- 1 point: "so that I can have a better experience" (not measurable)
- [ ] Are error scenarios considered?
- 5 points: Clearly states error handling
- 3 points: Only happy path, but error handling can be inferred
- 1 point: Error scenarios not considered at all, even though they are important to the feature
Common Problems (see evaluation-criteria.md for deduction patterns):
- No acceptance criteria at all (-1~2 points; deduct more for important features)
- Vague criteria: "should be fast", "should look good" (-1 point)
- Untestable value: "so that I can have better experience" (-2 points)
#### 👀 Stakeholder Perspective
Reference: references/evaluation-criteria.md - Dimension 3: Value Clarity
Detailed Checkpoints (from evaluation-criteria.md):
- [ ] Does "so that..." state real value?
- 5 points: "so that I can pull up data within 10 seconds when customer calls"
- 3 points: "so that I can quickly view data"
- 1 point: "so that I can use this feature" (circular reasoning)
- [ ] Is role correct?
- 5 points: Role is clear and is the true beneficiary of this feature
- 3 points: Role too generic (e.g., "user" covers too much)
- 1 point: Wrong role (e.g., giving admin feature to regular user)
- [ ] Maps to original requirements?
- 5 points: Can directly trace to a specific RFP paragraph
- 3 points: Is reasonably derived implied requirement
- 1 point: Can't see connection to original requirements
Common Problems (see evaluation-criteria.md for deduction patterns):
- Circular reasoning: "so that I can use this feature" (-2 points)
- Role too generic: Everything is "user" (-1 point)
- Technical task disguised as a Story: "As a developer" (-3 points)
- Deviates from original requirements: Features RFP didn't mention (-1~2 points)
### Phase 3: Auto-Correction

For Stories scoring ≤ 3, execute corrections based on problem type:
#### Correction Strategies
| Problem Type | Correction Method |
|---|---|
| Scope too large | Split into multiple Stories |
| Scope vague | Add specific operation description |
| Value unclear | Rewrite "so that..." part |
| Not testable | Add specific acceptance criteria |
| Format issue | Adjust to standard format |
| Wrong role | Correct to proper role |
| Improper granularity | Split or merge |
#### Correction Principles
- Minimum change: If small change works, don't make big changes
- Preserve intent: Don't change original requirement intent
- Clear annotation: Explain what was changed and why
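Taken together, the strategy table and the minimum-change principle suggest a simple dispatch: apply the least invasive strategy that addresses each flagged problem. A sketch with hypothetical problem-type tags:

```python
# Ordered roughly from least to most invasive, per the minimum-change principle.
# The tags are hypothetical labels for the rows of the strategy table above.
CORRECTION_STRATEGIES = [
    ("format_issue", "Adjust to standard format"),
    ("wrong_role", "Correct to proper role"),
    ("scope_vague", "Add specific operation description"),
    ("value_unclear", 'Rewrite "so that..." part'),
    ("not_testable", "Add specific acceptance criteria"),
    ("improper_granularity", "Split or merge"),
    ("scope_too_large", "Split into multiple Stories"),
]

def pick_corrections(problem_types: set[str]) -> list[str]:
    """Return the applicable strategies, least invasive first."""
    return [strategy for tag, strategy in CORRECTION_STRATEGIES if tag in problem_types]
```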
### Phase 4: Iterative Validation (Max 3 Rounds)
Corrected Stories need re-evaluation to ensure quality meets standards. This is the core of iterative refinement.
#### Why Iteration Is Needed
| Situation | Single-Pass Refinement Problem | Iterative Solution |
|---|---|---|
| Story is split | New Stories aren't evaluated | ✅ Next round evaluates new Stories |
| Over-correction | Might break something | ✅ Next round catches and fine-tunes |
| Acceptance criteria still not specific | Passes through | ✅ Next round strengthens |
#### Iteration Flow

```
Round 1: Evaluate all Stories → Correct low-scoring items → Produce corrected version
    ↓
Round 2: Evaluate "corrected" + "newly generated" Stories → Correct again if needed
    ↓
Round 3: (If still issues) Final fine-tuning
    ↓
Terminate: Output final version
```
#### Termination Conditions (Stop when any is met)

- Quality achieved: All Stories score ≥ 4
- No corrections needed: This round had no Story corrections
- Limit reached: Already executed 3 rounds
- Convergence failed: Same Story corrected 2 rounds in a row but score didn't improve
#### Iteration Rules
| Rule | Description |
|---|---|
| Progressive convergence | Each round should reduce problems, not increase them |
| History memory | Track each Story's correction history, avoid back-and-forth changes |
| Correction limit | Same Story can only be majorly changed once, then only fine-tuned |
| New Story priority | From round 2, prioritize evaluating Stories generated in previous round |
#### Decreasing Correction Intensity
| Round | Allowed Correction Types |
|---|---|
| Round 1 | All corrections (split, rewrite, add acceptance criteria, etc.) |
| Round 2 | Moderate corrections (add acceptance criteria, adjust wording, minor splits) |
| Round 3 | Fine-tuning only (word corrections, add details, no splitting or rewriting) |
This design ensures:
- Round 1 solves structural problems
- Round 2 handles omissions and fine-tuning
- Round 3 is just wrap-up, avoiding infinite modification
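The loop, the termination conditions, and the decreasing intensity combine into roughly the following control flow. This sketch builds on the hypothetical `Story` and `quick_scan_score` from the Phase 1 sketch; `correct` is a placeholder for the LLM-driven correction step:

```python
MAX_ROUNDS = 3

# Round number -> allowed correction intensity, per the table above.
ROUND_INTENSITY = {1: "all", 2: "moderate", 3: "fine_tuning"}

def correct(story: Story, intensity: str) -> None:
    """Placeholder for the LLM-driven correction (may also split Stories)."""
    ...

def refine(stories: list[Story]) -> list[Story]:
    corrected_rounds: dict[str, int] = {}  # per-Story correction history
    last_score: dict[str, int] = {}
    for round_no in range(1, MAX_ROUNDS + 1):
        corrections = 0
        for story in list(stories):  # iterate over a copy: splits append new Stories
            score = quick_scan_score(story)
            if score >= 4:
                continue
            # Convergence failure: corrected 2 rounds in a row with no improvement.
            if corrected_rounds.get(story.id, 0) >= 2 and score <= last_score.get(story.id, 0):
                continue
            last_score[story.id] = score
            correct(story, intensity=ROUND_INTENSITY[round_no])
            corrected_rounds[story.id] = corrected_rounds.get(story.id, 0) + 1
            corrections += 1
        # Quality achieved, or this round made no corrections: stop early.
        if corrections == 0 or all(quick_scan_score(s) >= 4 for s in stories):
            break
    return stories
```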
#### Iteration Summary Output
Record at end of each round:
```markdown
### Round N Refinement Summary
| Metric | Value |
|--------|-------|
| Stories Evaluated | XX |
| Corrections Made | XX |
| New (from splits) | XX |
| Average Score Improvement | +X.X |

**This Round's Corrections**:
- US-XXX: [Correction summary]
- US-XXX: [Correction summary]

**Continue?**: [Yes/No, reason]
```
## Output Format

### Structure Overview
```markdown
# Story Refinement Report

## 📊 Refinement Summary

### Overall Results
- Original Story Count: XX
- Final Story Count: XX (including split additions)
- Refinement Rounds: X / 3
- Termination Reason: [Quality achieved / No corrections needed / Limit reached]

### Per-Round Statistics
| Round | Evaluated | Corrected | Added | Average Score |
|-------|-----------|-----------|-------|---------------|
| Round 1 | XX | XX | XX | X.X |
| Round 2 | XX | XX | XX | X.X |
| ... | ... | ... | ... | ... |

## 📜 Refinement History
[Per-round correction summaries, collapsible]

## ✅ Final Passing Stories
[Stories scoring ≥ 4]

## 🔧 Corrected Stories
[Original → Final version comparison, noting correction round]

## ➕ Split-Generated Stories
[New Stories from splits]

## 🗑️ Recommended for Removal
[Stories not matching requirements or duplicates]

## 📋 Final Story List
[Complete integrated list, ready for use]
```
### Correction Detail Format

```markdown
### 🔧 US-XXX: [Title]

**Original Version**:
> As a [role], I want [action], so that [value].

**Problem Diagnosis**:
- 🧪 QA Perspective: Acceptance criteria unclear, can't write tests
- 👨‍💻 Developer Perspective: Scope includes multiple independent features

**Correction Method**: Split into two Stories + add acceptance criteria

**Improved Version**:

**US-XXX-A**: As a [role], I want [action A], so that [value].
- Acceptance Criteria:
  - [ ] Condition 1
  - [ ] Condition 2

**US-XXX-B**: As a [role], I want [action B], so that [value].
- Acceptance Criteria:
  - [ ] Condition 1

---
```
## Special Situation Handling

### Situation 1: A Large Number of Stories Need Correction (>50%)

This may indicate systematic issues in the Story Writer phase:
- Don't correct one by one (too inefficient)
- Identify common problem patterns
- Propose systematic suggestions
- Recommend re-running Story Writer
### Situation 2: Discovered Missing Features
If comparing to RFP reveals features not covered by Stories:
- Mark as "recommended addition"
- Produce suggested Story
- Mark source (derived from which part of RFP)
### Situation 3: Discovered Duplicate Stories
- Mark duplicate items
- Recommend which to keep (or merge)
- Explain judgment basis
### Situation 4: Story Quality Is Excellent

If all Stories score ≥ 4:
- Briefly confirm "Quality is good, no corrections needed"
- Can provide minor optimization suggestions (not mandatory)
- Directly output final list
## Output Example

Refer to assets/refine-example.md for a complete output example.
## Reference Documents

- Evaluation Criteria: references/evaluation-criteria.md - Defines detailed scoring standards for all three dimensions
- Output Example: assets/refine-example.md - Complete refinement report example
## Integration with Other Skills

### Standard Flow

```
[rfp-analyzer] → [story-writer] → [story-refiner] → Final output
```
Usage: After Story Writer produces User Stories draft, use Story Refiner to evaluate quality and automatically correct low-scoring Stories. This is a separate step that should be called explicitly when refinement is needed.
## Quality Threshold Settings

### Default Threshold

- Pass threshold: ≥ 4 points
- Must correct: ≤ 2 points
- Observation zone: 3 points (optional correction)
### Strict Mode

When user requests "strict check" or project risk is higher:
- Pass threshold: 5 points
- Must correct: ≤ 3 points
- All Stories must have acceptance criteria
### Lenient Mode

When user requests "quick pass" or project is MVP/POC:
- Pass threshold: ≥ 3 points
- Only correct severe issues scoring ≤ 1 point
- Acceptance criteria optional
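The three modes differ only in a couple of knobs, so they can be captured as configuration. A minimal sketch with hypothetical names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ThresholdConfig:
    pass_score: int                     # keep as-is at or above this score
    must_correct: int                   # must correct at or below this score
    acceptance_criteria_required: bool  # must every Story carry acceptance criteria?

DEFAULT_MODE = ThresholdConfig(pass_score=4, must_correct=2, acceptance_criteria_required=False)
STRICT_MODE = ThresholdConfig(pass_score=5, must_correct=3, acceptance_criteria_required=True)
LENIENT_MODE = ThresholdConfig(pass_score=3, must_correct=1, acceptance_criteria_required=False)
```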
## Checklist

After completing refinement, confirm the following items:
- [ ] All Stories scoring ≤ 2 points have been corrected or rewritten
- [ ] Corrected Stories meet INVEST principles
- [ ] Split-generated new Stories have proper numbering
- [ ] Final list has no duplicates
- [ ] All original requirement coverage preserved
- [ ] Clear annotation of which are original vs. improved versions
- [ ] Termination reason is reasonable (not forced stop from reaching limit)
- [ ] No Story was changed back-and-forth across multiple rounds
## Iterative vs. Single-Pass Refinement

### When to Use Iterative (Default)
- Formal projects
- Story count > 10
- Has split operations
- Higher quality requirements
### When to Use Single-Pass
When user explicitly says "quick refine" or "one pass only":
- MVP/POC projects
- Time pressure
- Story count < 10
- General quality requirements
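A hypothetical heuristic for this mode choice; the cutoffs mirror the two lists above, and the names are illustrative:

```python
def choose_refinement_mode(story_count: int, is_mvp_or_poc: bool, quick_requested: bool) -> str:
    """Default to iterative; use single-pass only for small or low-stakes batches."""
    if quick_requested or is_mvp_or_poc or story_count < 10:
        return "single_pass"
    return "iterative"
```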
## Why the 3-Round Limit
- Rule of thumb: Most problems resolved within 2 rounds
- Diminishing returns: Round 3+ corrections are usually nitpicking
- Avoid over-engineering: Infinite refinement may drift from original requirements
- Time cost: Each round requires processing time
If large numbers of low-scoring Stories remain after 3 rounds:
1. Output current results with annotations
2. Suggest returning to Story Writer to regenerate
3. Analyze whether RFP itself has systematic issues
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.