Install a specific skill from the multi-skill repository:

`npx skills add bobchao/pm-skills-rfp-to-stories --skill "Story Refiner"`
# Description
Evaluates User Story quality and automatically corrects items not meeting standards. Reviews from developer, QA, and stakeholder perspectives, directly producing improved versions for low-quality Stories, reducing manual intervention.
# SKILL.md
name: "Story Refiner"
description: "Evaluates User Story quality and automatically corrects items not meeting standards. Reviews from developer, QA, and stakeholder perspectives, directly producing improved versions for low-quality Stories, reducing manual intervention."
Story Refiner Skill
## Language Preference
Default: Respond in the same language as the user's input or as explicitly requested by the user.
If the user specifies a preferred language (e.g., "請用中文回答", "Reply in Japanese"), use that language for all outputs. Otherwise, match the language of the provided Stories.
## Role Definition
You simultaneously play three roles to review User Stories:
- Senior Developer: Evaluates technical feasibility and estimation clarity
- QA Engineer: Evaluates testability and acceptance criteria clarity
- Product Stakeholder: Evaluates requirement coverage and value clarity
## Core Principles
### Correction Over Reporting
- Don't just point out problems; fix them directly
- Every flagged issue must have a corresponding improved version
- Humans only need to give final confirmation, not correct things manually
### Conservative Correction
- Only correct Stories with "obvious problems"
- Don't correct for the sake of correcting
- Stories that already pass don't need changes
### Transparent Annotation
- Clearly explain why corrections were made
- Provide an original vs. improved version comparison
- Let humans choose to accept the change or keep the original version
## Input Format
This Skill accepts the following inputs:
- Story Writer output (recommended)
- A list of User Stories in any format
- Original RFP + Stories (enables cross-referencing coverage)
## Evaluation Criteria Reference
All scoring and evaluation must follow the standards defined in references/evaluation-criteria.md.
This document defines:
- Three scoring dimensions (Development Clarity, Testability, Value Clarity)
- Detailed scoring criteria for each dimension (1-5 points)
- Specific checkpoints and common deduction patterns
- Final score calculation method
Important: Both Quick Scan (Phase 1) and Detailed Evaluation (Phase 2) use these same criteria, with different levels of depth.
## Evaluation Flow
### Phase 1: Quick Scan
Score each Story initially (1-5 points) using the three dimensions from references/evaluation-criteria.md:
Scoring Method:
1. Quickly assess each dimension (Development Clarity, Testability, Value Clarity) on a 1-5 scale
2. Calculate final score: round((Development Clarity + Testability + Value Clarity) / 3)
3. Use the scoring criteria tables in references/evaluation-criteria.md as reference
Quick Assessment Focus:
- Development Clarity: Is action specific? Scope clear? Dependencies clear?
- Testability: Can write test cases? Acceptance criteria present? Value verifiable?
- Value Clarity: Value clear? Role correct? Maps to requirements?
| Score | Level | Action |
|---|---|---|
| 5 | Excellent | Keep, no modification |
| 4 | Good | Keep, may have minor suggestions |
| 3 | Passing | Mark for observation, may need minor adjustments |
| 2 | Insufficient | Must correct |
| 1 | Severely insufficient | Must rewrite |
Only Stories scoring ≤ 3 enter Phase 2 detailed evaluation.
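As a minimal sketch of this Phase 1 triage, assuming a hypothetical `StoryScore` record (the skill defines scoring criteria, not data types), the final-score formula and the Phase 2 filter could look like this:

```python
from dataclasses import dataclass

@dataclass
class StoryScore:
    """Hypothetical record; dimension values come from references/evaluation-criteria.md."""
    story_id: str
    development_clarity: int  # 1-5
    testability: int          # 1-5
    value_clarity: int        # 1-5

    @property
    def final_score(self) -> int:
        # Final score: round((Development Clarity + Testability + Value Clarity) / 3)
        return round((self.development_clarity + self.testability + self.value_clarity) / 3)

def needs_phase_2(score: StoryScore) -> bool:
    # Only Stories scoring <= 3 enter detailed evaluation.
    return score.final_score <= 3
```

Note that a sum of three integers divided by 3 never lands exactly on .5, so Python's banker's rounding and ordinary rounding agree here.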
### Phase 2: Multi-Perspective Detailed Evaluation
For Stories needing review, perform detailed evaluation from three perspectives using the Specific Checkpoints and Common Deduction Patterns defined in references/evaluation-criteria.md.
#### 👨‍💻 Developer Perspective
Reference: references/evaluation-criteria.md - Dimension 1: Development Clarity
Detailed Checkpoints (from evaluation-criteria.md):
- [ ] Is action description specific?
- 5 points: "Upload JPG/PNG format images, limited to 5MB"
- 3 points: "Upload images"
- 1 point: "Handle images"
- [ ] Does scope have boundaries?
- 5 points: "Edit article title and content"
- 3 points: "Edit article"
- 1 point: "Manage articles"
- [ ] Are dependencies clear?
- 5 points: Clearly marked "requires US-001 login feature completed first"
- 3 points: Implied dependency but not marked
- 1 point: Confusing or circular dependencies
Common Problems (see evaluation-criteria.md for deduction patterns):
- Vague verbs: "manage", "handle", "maintain" (-1~2 points)
- No scope boundary: "all settings", "various reports" (-1~2 points)
- Compound features: "create and edit" (-1 point)
- Technical details mixed in: "load using AJAX" (-1 point)
#### 🧪 QA Perspective
Reference: references/evaluation-criteria.md - Dimension 2: Testability
Detailed Checkpoints (from evaluation-criteria.md):
- [ ] Are acceptance criteria clear?
- 5 points: Has specific Given-When-Then or checklist
- 3 points: Has general direction but not specific
- 1 point: No acceptance criteria, or vague like "should be user-friendly"
- [ ] Is value verifiable?
- 5 points: "so that I can find target article within 3 seconds" (measurable)
- 3 points: "so that I can find articles faster" (relative but comparable)
- 1 point: "so that I can have a better experience" (not measurable)
- [ ] Are error scenarios considered?
- 5 points: Clearly states error handling
- 3 points: Only happy path, but error handling can be inferred
- 1 point: Error scenarios not considered at all, even though they are important to the feature
Common Problems (see evaluation-criteria.md for deduction patterns):
- No acceptance criteria: none present at all (-1~2 points; important features lose more)
- Vague criteria: "should be fast", "should look good" (-1 point)
- Untestable value: "so that I can have a better experience" (-2 points)
#### 👤 Stakeholder Perspective
Reference: references/evaluation-criteria.md - Dimension 3: Value Clarity
Detailed Checkpoints (from evaluation-criteria.md):
- [ ] Does "so that..." state real value?
- 5 points: "so that I can pull up data within 10 seconds when customer calls"
- 3 points: "so that I can quickly view data"
- 1 point: "so that I can use this feature" (circular reasoning)
- [ ] Is role correct?
- 5 points: Role is clear and is the true beneficiary of this feature
- 3 points: Role too generic (e.g., "user" covers too much)
- 1 point: Wrong role (e.g., giving an admin feature to a regular user)
- [ ] Maps to original requirements?
- 5 points: Can directly trace to a specific RFP paragraph
- 3 points: Is a reasonably derived implied requirement
- 1 point: No visible connection to the original requirements
Common Problems (see evaluation-criteria.md for deduction patterns):
- Circular reasoning: "so that I can use this feature" (-2 points)
- Role too generic: Everything is "user" (-1 point)
- Technical task disguised as a Story: "As a developer..." (-3 points)
- Deviates from original requirements: features the RFP didn't mention (-1~2 points)
### Phase 3: Auto-Correction
For Stories scoring ≤ 3, execute corrections based on problem type:
#### Correction Strategies
| Problem Type | Correction Method |
|---|---|
| Scope too large | Split into multiple Stories |
| Scope vague | Add specific operation description |
| Value unclear | Rewrite "so that..." part |
| Not testable | Add specific acceptance criteria |
| Format issue | Adjust to standard format |
| Wrong role | Correct to proper role |
| Improper granularity | Split or merge |
#### Correction Principles
- Minimum change: if a small change works, don't make a big one
- Preserve intent: don't change the original requirement's intent
- Clear annotation: explain what was changed and why
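A sketch of how the strategy table might be applied, using hypothetical snake_case problem tags (the skill itself prescribes only the table above):

```python
# Hypothetical problem tags mapped to the strategies from the table above.
CORRECTION_STRATEGY = {
    "scope_too_large": "split into multiple Stories",
    "scope_vague": "add specific operation description",
    "value_unclear": 'rewrite the "so that..." part',
    "not_testable": "add specific acceptance criteria",
    "format_issue": "adjust to standard format",
    "wrong_role": "correct to proper role",
    "improper_granularity": "split or merge",
}

def plan_corrections(diagnosed_problems: list[str]) -> list[str]:
    # Minimum change: plan strategies only for the problems actually flagged.
    return [CORRECTION_STRATEGY[p] for p in diagnosed_problems]
```

For example, `plan_corrections(["scope_vague", "not_testable"])` yields only the two matching strategies, leaving the rest of the Story untouched.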
### Phase 4: Iterative Validation (Max 3 Rounds)
Corrected Stories need re-evaluation to ensure quality meets standards. This is the core of iterative refinement.
#### Why Iteration Is Needed
| Situation | Single-Pass Refinement Problem | Iterative Solution |
|---|---|---|
| Story is split | New Stories aren't evaluated | ✅ Next round evaluates new Stories |
| Over-correction | Might break something | ✅ Next round catches and fine-tunes |
| Acceptance criteria still not specific | Passes through | ✅ Next round strengthens |
#### Iteration Flow
Round 1: Evaluate all Stories → Correct low-scoring items → Produce corrected version
↓
Round 2: Evaluate "corrected" + "newly generated" Stories → Correct again if needed
↓
Round 3: (If still issues) Final fine-tuning
↓
Terminate: Output final version
#### Termination Conditions (Stop when any is met)
- Quality achieved: All Stories score ≥ 4
- No corrections needed: This round had no Story corrections
- Limit reached: Already executed 3 rounds
- Convergence failed: Same Story corrected 2 rounds in a row but score didn't improve
#### Iteration Rules
| Rule | Description |
|---|---|
| Progressive convergence | Each round should reduce problems, not increase them |
| History memory | Track each Story's correction history, avoid back-and-forth changes |
| Correction limit | A Story may receive only one major change; after that, only fine-tuning |
| New Story priority | From round 2, prioritize evaluating Stories generated in previous round |
#### Decreasing Correction Intensity
| Round | Allowed Correction Types |
|---|---|
| Round 1 | All corrections (split, rewrite, add acceptance criteria, etc.) |
| Round 2 | Moderate corrections (add acceptance criteria, adjust wording, minor splits) |
| Round 3 | Fine-tuning only (word corrections, add details, no splitting or rewriting) |
This design ensures:
- Round 1 solves structural problems
- Round 2 handles omissions and fine-tuning
- Round 3 is just wrap-up, avoiding infinite modification
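Putting these rules together, here is a minimal sketch of the loop, assuming hypothetical `evaluate(story) -> int` and `correct(story, intensity) -> list` helpers (`correct` returns a list because a split can yield several Stories):

```python
MAX_ROUNDS = 3
INTENSITY = {1: "full", 2: "moderate", 3: "fine-tune"}  # decreasing intensity per round

def refine(stories, evaluate, correct):
    """Sketch only: Stories are assumed to be dicts with an 'id' key."""
    last_scores = {}
    for round_no in range(1, MAX_ROUNDS + 1):
        scores = {s["id"]: evaluate(s) for s in stories}
        if all(v >= 4 for v in scores.values()):
            # Also covers "no corrections needed", since this sketch corrects every <= 3.
            return stories, "quality achieved"
        # Simplified convergence check: every Story corrected last round that kept
        # its id scored no higher this round.
        repeats = [i for i, v in last_scores.items() if v <= 3 and i in scores]
        if repeats and all(scores[i] <= last_scores[i] for i in repeats):
            return stories, "convergence failed"
        next_round = []
        for story in stories:
            if scores[story["id"]] <= 3:
                next_round.extend(correct(story, INTENSITY[round_no]))
            else:
                next_round.append(story)
        stories, last_scores = next_round, scores
    return stories, "limit reached"
```

Because corrected and newly split Stories re-enter `stories`, the next round naturally evaluates them first, matching the "new Story priority" rule.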
#### Iteration Summary Output
Record at end of each round:
### Round N Refinement Summary
| Metric | Value |
|--------|-------|
| Stories Evaluated | XX |
| Corrections Made | XX |
| New (from splits) | XX |
| Average Score Improvement | +X.X |
**This Round's Corrections**:
- US-XXX: [Correction summary]
- US-XXX: [Correction summary]
**Continue?**: [Yes/No, reason]
## Output Format
### Structure Overview
# Story Refinement Report
## 📊 Refinement Summary
### Overall Results
- Original Story Count: XX
- Final Story Count: XX (including split additions)
- Refinement Rounds: X / 3
- Termination Reason: [Quality achieved / No corrections needed / Limit reached / Convergence failed]
### Per-Round Statistics
| Round | Evaluated | Corrected | Added | Average Score |
|-------|-----------|-----------|-------|---------------|
| Round 1 | XX | XX | XX | X.X |
| Round 2 | XX | XX | XX | X.X |
| ... | ... | ... | ... | ... |
## 🔄 Refinement History
[Per-round correction summaries, collapsible]
## ✅ Final Passing Stories
[Stories scoring ≥ 4]
## 🔧 Corrected Stories
[Original → Final version comparison, noting correction round]
## ➕ Split-Generated Stories
[New Stories from splits]
## 🗑️ Recommended for Removal
[Stories not matching requirements or duplicates]
## 📋 Final Story List
[Complete integrated list, ready for use]
### Correction Detail Format
### 🔧 US-XXX: [Title]
**Original Version**:
> As a [role], I want [action], so that [value].
**Problem Diagnosis**:
- 🧪 QA Perspective: Acceptance criteria unclear, can't write tests
- 👨‍💻 Developer Perspective: Scope includes multiple independent features
**Correction Method**: Split into two Stories + add acceptance criteria
**Improved Version**:
**US-XXX-A**: As a [role], I want [action A], so that [value].
- Acceptance Criteria:
- [ ] Condition 1
- [ ] Condition 2
**US-XXX-B**: As a [role], I want [action B], so that [value].
- Acceptance Criteria:
- [ ] Condition 1
---
## Special Situation Handling
### Situation 1: Large Number of Stories Need Correction (>50%)
This may indicate systematic issues in the Story Writer phase:
- Don't correct one by one (too inefficient)
- Identify common problem patterns
- Propose systematic suggestions
- Recommend re-running Story Writer
### Situation 2: Discovered Missing Features
If comparing against the RFP reveals features not covered by the Stories:
- Mark as "recommended addition"
- Produce a suggested Story
- Mark the source (which part of the RFP it derives from)
### Situation 3: Discovered Duplicate Stories
- Mark the duplicate items
- Recommend which to keep (or merge them)
- Explain the basis for the judgment
### Situation 4: Story Quality Is Excellent
If all Stories score ≥ 4:
- Briefly confirm "Quality is good, no corrections needed"
- Can provide minor optimization suggestions (not mandatory)
- Directly output final list
## Output Example
Refer to assets/refine-example.md for complete output example.
## Reference Documents
- Evaluation Criteria: references/evaluation-criteria.md (defines detailed scoring standards for all three dimensions)
- Output Example: assets/refine-example.md (complete refinement report example)
## Integration with Other Skills
### Standard Flow
[rfp-analyzer] → [story-writer] → [story-refiner] → Final output
Usage: After Story Writer produces User Stories draft, use Story Refiner to evaluate quality and automatically correct low-scoring Stories. This is a separate step that should be called explicitly when refinement is needed.
## Quality Threshold Settings
### Default Threshold
- Pass threshold: ≥ 4 points
- Must correct: ≤ 2 points
- Observation zone: 3 points (optional correction)
### Strict Mode
When the user requests a "strict check" or project risk is higher:
- Pass threshold: 5 points
- Must correct: ≤ 3 points
- All Stories must have acceptance criteria
### Lenient Mode
When the user requests a "quick pass" or the project is an MVP/POC:
- Pass threshold: ≥ 3 points
- Only correct ≤ 1 point severe issues
- Acceptance criteria optional
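The three modes as a small lookup, in a sketch with hypothetical mode names; only the numbers come from the settings above:

```python
# (pass threshold, must-correct threshold); mode names are hypothetical.
THRESHOLDS = {
    "default": (4, 2),  # 3 is the observation zone (optional correction)
    "strict":  (5, 3),  # strict mode also requires acceptance criteria on every Story
    "lenient": (3, 1),  # only severe issues (<= 1 point) are corrected
}

def triage(score: int, mode: str = "default") -> str:
    passing, must_correct = THRESHOLDS[mode]
    if score >= passing:
        return "keep"
    if score <= must_correct:
        return "correct"
    return "observe"  # optional-correction zone
```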
## Checklist
After completing refinement, confirm the following items:
- [ ] All Stories ≤ 2 points have been corrected or rewritten
- [ ] Corrected Stories meet INVEST principles
- [ ] Split-generated new Stories have proper numbering
- [ ] Final list has no duplicates
- [ ] All original requirement coverage preserved
- [ ] Clear annotation of which are original vs. improved versions
- [ ] Termination reason is reasonable (not a forced stop from hitting the round limit)
- [ ] No Story was changed back-and-forth across multiple rounds
## Iterative vs. Single-Pass Refinement
### When to Use Iterative (Default)
- Formal projects
- Story count > 10
- Has split operations
- Higher quality requirements
### When to Use Single-Pass
When the user explicitly says "quick refine" or "one pass only":
- MVP/POC projects
- Time pressure
- Story count < 10
- General quality requirements
### Why the 3-Round Limit
- Rule of thumb: Most problems resolved within 2 rounds
- Diminishing returns: Round 3+ corrections are usually nitpicking
- Avoid over-engineering: Infinite refinement may drift from original requirements
- Time cost: Each round requires processing time
If many low-scoring Stories remain after 3 rounds:
1. Output current results with annotations
2. Suggest returning to Story Writer to regenerate
3. Analyze whether RFP itself has systematic issues
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.