Automated code review workflow using OpenAI Codex CLI. Implements iterative fix-and-review cycles until code passes validation or reaches iteration limit. Use when building features requiring...
Guide systematic investigation of production incidents including triage, data gathering, impact assessment, and root cause analysis. Use when investigating outages, service degradation, production...
Stand up a Design Engineering practice (hybrid design+engineering) by producing a Design Engineering Execution Pack: charter, prototype→production workflow, design-to-code contract, component...
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or...
Use when prompts produce inconsistent or unreliable outputs, need explicit structure and constraints, require safety guardrails or quality checks, involve multi-step reasoning that needs...
Translates EPUB ebook files between languages with parallel processing. Supports Japanese, English, Chinese, and other languages. Handles large files by splitting into sections, manages multiple...
Comprehensive security vulnerability scanner for Python projects including Flask, Django, and FastAPI applications. Detects OWASP Top 10 vulnerabilities, injection flaws, insecure deserialization,...
Plan and execute a product launch/release with speed and safety. Produces a Shipping & Launch Pack (release brief, rollout/rollback plan, product quality list, comms + enablement, monitoring plan,...
Use when user needs Active Directory security analysis, privileged group design review, authentication policy assessment, or delegation and attack surface evaluation across enterprise domains.
Scaffold a production-ready full-stack monorepo with working MVP features, tests, and CI/CD. Generates complete CRUD functionality, Clerk authentication, and quality gates that run immediately...
Create an AI Evals Pack (eval PRD, test set, rubric, judge plan, results + iteration loop). Use for LLM evaluation, benchmarks, rubrics, error analysis/open coding, and ship/no-ship quality gates...
Execute a single phase from an implementation plan with all quality gates. This skill is the unit of work for implement-plan, handling implementation, verification, code review, ADR compliance,...
Refactor and clean code using SOLID principles, clean code patterns, and modern best practices. Use when the user asks to refactor code, improve code quality, fix code smells, apply design...
Analyze AI/ML technical content (papers, articles, blog posts) and extract actionable insights filtered through enterprise AI engineering lens. Use when user provides URL/document for AI/ML...
Use when designing organizational structure (team topologies, Conway's Law alignment), mapping stakeholders by power-interest for change initiatives, defining team interface contracts (APIs, SLAs,...
Systematic 7-step methodology for comprehensive patent prior art searches and patentability assessments using BigQuery and CPC classification
Technical leadership guidance for engineering teams, architecture decisions, and technology strategy. Includes tech debt analyzer, team scaling calculator, engineering metrics frameworks,...