Use when starting an audit, mapping backend data flows, or preparing for MECE failure mode analysis.
Install a specific skill from a multi-skill repository:

```bash
npx skills add miles-knowbl/orchestrator --skill "pipeline-discovery"
```
# Description
Identify backend data pipelines (P-series) in the codebase. Discovers server-side data flows triggered by user actions or system events, documenting triggers, steps, and outcomes. Foundation for MECE failure mode analysis.
# SKILL.md
```yaml
name: pipeline-discovery
description: "Identify backend data pipelines (P-series) in the codebase. Discovers server-side data flows triggered by user actions or system events, documenting triggers, steps, and outcomes. Foundation for MECE failure mode analysis."
phase: INIT
category: core
version: "1.0.0"
depends_on: [requirements]
tags: [audit, pipeline, discovery, backend, data-flow]
```
## Pipeline Discovery

Identify backend data pipelines (P-series).

### When to Use

- Starting an audit → Runs in INIT phase to map backend flows
- Understanding data flows → Document how data moves through the system
- Preparing for failure mode analysis → Identify what can break
- When you say: "find the pipelines", "map the backend", "what data flows exist?"
### Reference Requirements

**MUST read before applying this skill:**

| Reference | Why Required |
|---|---|
| `pipeline-identification.md` | How to find pipelines in code |
| `pipeline-template.md` | How to document each pipeline |
**Read if applicable:**

| Reference | When Needed |
|---|---|
| `common-patterns.md` | Recognize typical pipeline patterns |

**Verification:** All major backend data flows are documented with triggers and outcomes.
### Required Deliverables

| Deliverable | Location | Condition |
|---|---|---|
| Pipeline inventory | `AUDIT-SCOPE.md` | Always (P-series section) |
| State update | `audit-state.json` | Always (backend_pipelines array) |
### Core Concept
Pipeline Discovery answers: "What are the major backend data flows?"
A pipeline is:
- Triggered by user action or system event
- Processes data through multiple steps
- Produces a persistent outcome
Examples:
- P1: Source Ingestion (file upload → parsed schema)
- P2: Content Generation (generate button → artifact created)
- P3: Publishing (publish button → post live on platform)
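
To make the trigger → steps → outcome shape concrete, here is a minimal sketch of a P1-style entry point, assuming a Next.js App Router codebase. The file path, helper modules, and table shape are illustrative assumptions, not code from any audited repository:

```ts
// app/api/sources/upload/route.ts — hypothetical P1 entry point (illustrative only)
import { NextResponse } from "next/server";

// Assumed helpers; real projects will differ.
import { validateFileType } from "@/lib/validators";
import { parseByType } from "@/lib/parsers";
import { extractSchema } from "@/lib/schema-extractor";
import { createSourceRecord } from "@/lib/db/sources";

// Trigger: user uploads a file via POST /api/sources/upload
export async function POST(req: Request) {
  const form = await req.formData();
  const file = form.get("file") as File;

  // Step: validate input
  if (!validateFileType(file)) {
    return NextResponse.json({ error: "unsupported file type" }, { status: 400 });
  }

  // Steps: parse content and extract a schema
  const content = await parseByType(file);
  const schema = extractSchema(content);

  // Outcome: a persistent database row — the marker of a pipeline endpoint
  const source = await createSourceRecord({ name: file.name, schema });

  return NextResponse.json({ id: source.id });
}
```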
### Pipeline Identification

```
PIPELINE DISCOVERY PROCESS

1. FIND ENTRY POINTS
   ├── API routes (POST, PUT, DELETE)
   ├── Background job handlers
   ├── Webhook receivers
   └── Event listeners

2. TRACE DATA FLOW
   ├── Input validation/parsing
   ├── Business logic
   ├── External service calls
   └── Database writes

3. DOCUMENT EACH PIPELINE
   ├── Trigger (what starts it)
   ├── Steps (what happens)
   └── Outcome (what it produces)

4. ASSIGN P-SERIES IDS
   └── P1, P2, P3... in order of discovery
```
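
Step 1 can often be partly automated. The sketch below is a rough Node/TypeScript script that assumes the Next.js App Router convention (`route.ts` files exporting HTTP-method handlers) plus a few common job directories; the globs, regex, and directory names are assumptions to adapt to the framework actually in use:

```ts
// find-entry-points.ts — rough sketch for step 1 (assumes Next.js App Router layout)
import { readdirSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";

const WRITE_METHODS = /export\s+(async\s+)?function\s+(POST|PUT|PATCH|DELETE)\b/g;

// Recursively collect .ts/.js files under a directory.
function walk(dir: string, out: string[] = []): string[] {
  for (const entry of readdirSync(dir)) {
    const full = join(dir, entry);
    if (statSync(full).isDirectory()) walk(full, out);
    else if (/\.(ts|js)$/.test(entry)) out.push(full);
  }
  return out;
}

// Routes plus common background-job locations; adjust to the codebase.
for (const root of ["app/api", "jobs", "workers", "queues"]) {
  let files: string[] = [];
  try {
    files = walk(root);
  } catch {
    continue; // directory not present in this repo
  }
  for (const file of files) {
    const src = readFileSync(file, "utf8");
    const methods = [...src.matchAll(WRITE_METHODS)].map((m) => m[2]);
    if (methods.length > 0 || !file.startsWith("app/api")) {
      console.log(`${file}: ${methods.join(", ") || "job/worker candidate"}`);
    }
  }
}
```

The output is only a list of candidates; each one still needs the manual trace in steps 2 and 3.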
### Where to Look

#### API Routes

```
// Next.js API routes
app/api/*/route.ts

// Express routes
routes/*.ts
controllers/*.ts
```
#### Background Jobs

```
// Job processors
jobs/*.ts
workers/*.ts
queues/*.ts
```
#### Database Operations

```ts
// Database writes indicate pipeline endpoints
supabase.from('table').insert()
prisma.model.create()
```
#### External Services

```ts
// External API calls often indicate pipelines
openai.chat.completions.create()
twitter.post()
```
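
A handler that both calls an external service and writes to the database is almost always a pipeline worth documenting. Here is a hypothetical P2-style generation handler combining the indicators above; the route path, model name, table name, and environment variables are assumptions for illustration:

```ts
// app/api/generate/route.ts — hypothetical P2-style pipeline (illustrative only)
import OpenAI from "openai";
import { createClient } from "@supabase/supabase-js";
import { NextResponse } from "next/server";

const openai = new OpenAI();
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

// Trigger: user clicks "Generate" → POST /api/generate
export async function POST(req: Request) {
  const { prompt } = await req.json();

  // External service call — a strong pipeline indicator
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: prompt }],
  });

  // Database write — the persistent outcome that ends the pipeline
  const { data, error } = await supabase
    .from("artifacts")
    .insert({ prompt, content: completion.choices[0].message.content })
    .select()
    .single();

  if (error) return NextResponse.json({ error: error.message }, { status: 500 });
  return NextResponse.json({ artifact: data });
}
```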
### Pipeline Documentation Format

```markdown
### P1: Source Ingestion

**Trigger:** User uploads file via /sources/upload
**Frequency:** ~50/day

**Steps:**
1. File received at `api/sources/upload/route.ts:23`
2. File type validated (`lib/validators.ts:45`)
3. Content parsed by type (`lib/parsers/index.ts:12`)
4. Schema extracted (`lib/schema-extractor.ts:78`)
5. Source record created (`lib/db/sources.ts:34`)
6. Embedding generated (`lib/embeddings.ts:56`)

**Outcome:**
- `sources` table: new row with metadata
- `source_embeddings` table: vector for search
- `source_schema` JSON: extracted structure

**Key Files:**
- `api/sources/upload/route.ts`
- `lib/parsers/*.ts`
- `lib/schema-extractor.ts`
```
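
If pipeline records are kept as structured data (see the `audit-state.json` shape below), this markdown section can be generated rather than hand-written. A sketch of such a renderer, with a minimal assumed `Pipeline` type that is not part of any published schema:

```ts
// render-pipeline.ts — sketch of a renderer for the documentation format above
interface PipelineStep {
  description: string;
  location: string; // e.g. "api/sources/upload/route.ts:23"
}

interface Pipeline {
  id: string;        // "P1"
  name: string;      // "Source Ingestion"
  trigger: string;
  frequency?: string;
  steps: PipelineStep[];
  outcomes: string[];
  keyFiles: string[];
}

export function renderPipeline(p: Pipeline): string {
  return [
    `### ${p.id}: ${p.name}`,
    `**Trigger:** ${p.trigger}`,
    ...(p.frequency ? [`**Frequency:** ${p.frequency}`] : []),
    `**Steps:**`,
    ...p.steps.map((s, i) => `${i + 1}. ${s.description} (\`${s.location}\`)`),
    `**Outcome:**`,
    ...p.outcomes.map((o) => `- ${o}`),
    `**Key Files:**`,
    ...p.keyFiles.map((f) => `- \`${f}\``),
  ].join("\n");
}
```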
### Output Format

#### In AUDIT-SCOPE.md

```markdown
## Backend Pipelines (P-series)

| ID | Name | Trigger | Outcome |
|----|------|---------|---------|
| P1 | Source Ingestion | File upload | source_schema populated |
| P2 | Content Generation | Generate button | Artifact created |
| P3 | Publishing | Publish button | Post live on platform |

### P1: Source Ingestion
[detailed documentation]

### P2: Content Generation
[detailed documentation]
```
#### In audit-state.json

```json
{
  "backend_pipelines": [
    {
      "id": "P1",
      "name": "Source Ingestion",
      "trigger": "File upload via /sources/upload",
      "outcome": "source_schema populated",
      "key_files": ["api/sources/upload/route.ts", "lib/parsers/index.ts"],
      "step_count": 6
    }
  ]
}
```
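
A TypeScript type matching this shape can keep state updates consistent across skills. The interface and update snippet below are assumptions derived from the JSON above, not a published schema; the "Publishing" values come from the P-series table in this document:

```ts
// update-audit-state.ts — assumed shape for the backend_pipelines entries shown above
import { readFileSync, writeFileSync } from "node:fs";

interface BackendPipeline {
  id: string;          // "P1", "P2", ...
  name: string;
  trigger: string;
  outcome: string;
  key_files: string[];
  step_count: number;
}

interface AuditState {
  backend_pipelines: BackendPipeline[];
  // ...other audit state fields omitted
}

// Example: appending a newly discovered pipeline to audit-state.json
const state: AuditState = JSON.parse(readFileSync("audit-state.json", "utf8"));
state.backend_pipelines.push({
  id: `P${state.backend_pipelines.length + 1}`,
  name: "Publishing",
  trigger: "Publish button",
  outcome: "Post live on platform",
  key_files: ["api/publish/route.ts"], // hypothetical path
  step_count: 4,
});
writeFileSync("audit-state.json", JSON.stringify(state, null, 2));
```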
### Discovery Checklist
- [ ] All POST/PUT/DELETE API routes examined
- [ ] Background job handlers identified
- [ ] Database write operations traced
- [ ] External API calls documented
- [ ] Each pipeline has trigger, steps, outcome
- [ ] P-series IDs assigned consistently
### Common Pipeline Patterns

| Pattern | Example | Indicators |
|---|---|---|
| CRUD Create | User registration | POST route → validate → insert |
| File Processing | Document upload | POST multipart → parse → store |
| Generation | AI content | POST → LLM call → store result |
| Publishing | Social post | POST → external API → update status |
| Batch Job | Daily report | Cron → query → aggregate → email |
### Validation
Before completing, verify:
- [ ] All major data flows are documented
- [ ] Each pipeline has a unique P-series ID
- [ ] Triggers are user-observable actions or system events
- [ ] Outcomes are persistent (database writes, external effects)
- [ ] Key files are identified for each pipeline
- [ ] Step counts are accurate
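
Several of these checks are mechanical and can be scripted against the state file. A sketch, assuming the `audit-state.json` shape shown earlier:

```ts
// validate-pipelines.ts — sketch of mechanical checks for the validation list above
import { readFileSync } from "node:fs";

const { backend_pipelines } = JSON.parse(readFileSync("audit-state.json", "utf8"));
const problems: string[] = [];

// Unique, well-formed P-series IDs
const ids = backend_pipelines.map((p: any) => p.id);
if (new Set(ids).size !== ids.length) problems.push("duplicate P-series IDs");

for (const p of backend_pipelines) {
  if (!/^P\d+$/.test(p.id)) problems.push(`${p.id}: not a P-series ID`);
  if (!p.trigger) problems.push(`${p.id}: missing trigger`);
  if (!p.outcome) problems.push(`${p.id}: missing outcome`);
  if (!p.key_files?.length) problems.push(`${p.id}: no key files`);
  if (!(p.step_count > 0)) problems.push(`${p.id}: step_count not set`);
}

if (problems.length > 0) {
  console.error(problems.join("\n"));
  process.exit(1);
}
console.log(`OK: ${backend_pipelines.length} pipelines pass mechanical checks`);
```

Judgment calls (whether all major flows are covered, whether triggers are truly user-observable) still need a human or agent review.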
# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.