Help users build effective AI applications. Use when someone is building with LLMs, writing prompts, designing AI features, implementing RAG, creating agents, running evals, or trying to improve...
Professional UI design and frontend interface guidelines. Use this skill when creating web pages, mini-program interfaces, prototypes, or any frontend UI components that require distinctive,...
Detects common LLM coding agent artifacts in codebases. Identifies test quality issues, dead code, over-abstraction, and verbose LLM style patterns. Use when cleaning up AI-generated code or...
Build, review, debug, and modernize SwiftUI apps for macOS with modern patterns. Use when building SwiftUI UIs, reviewing code quality, debugging view issues, checking anti-patterns, migrating...
Expert in photo content recognition, intelligent curation, and quality filtering. Specializes in face/animal/place recognition, perceptual hashing for de-duplication, screenshot/meme detection,...
Generate a project-specific DESIGN_SYSTEM.md that enforces consistent UI/UX across SPAs, traditional server-rendered sites, and hybrid systems. Includes tokens, component rules, accessibility...
Reimplement the current Git branch on a fresh branch off `main` with a clean, narrative-quality commit history.
Performs thorough code reviews on Android Kotlin/Java code. Use when reviewing pull requests, analyzing code quality, checking architecture patterns, or validating Android best practices. Covers...
Set up Biome (default) or ESLint + Prettier, Vitest testing, and pre-commit hooks for any JavaScript/TypeScript project. Uses Bun as the package manager. Use this skill when initializing code...
Coding standards for readable, maintainable, testable code including SOLID principles, clean code practices, DDD, and TDD. Use when implementing new features, refactoring code, performing code...
Guide for creating Claude Code skills following Anthropic's official best practices. Use when user wants to create a new skill, build a skill, write SKILL.md, or needs skill creation guidelines....
Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking...
This skill should be used when the user asks to "evaluate a DSPy program", "test my DSPy module", "measure performance", "create evaluation metrics", "use answer_exact_match or SemanticF1",...
Senior Code Architect & Quality Assurance Engineer for 2026. Specialized in context-aware AI code reviews, automated PR auditing, and technical debt mitigation. Expert in neutralizing "AI-Smells,"...
AI coding agent that plans, implements, and verifies code changes with human approval gates. Built on FlatAgents.
Ingest data from S3 into bauplan using the Write-Audit-Publish pattern for safe data loading. Use when loading new data from S3, performing safe data ingestion, or when the user mentions WAP, data...
Implement comprehensive observability for LLM applications including tracing (Langfuse/Helicone), cost tracking, token optimization, RAG evaluation metrics (RAGAS), hallucination detection, and...
Guides creation and editing of SKILL.md files following Anthropic best practices and this repo's conventions. Use when creating a new skill, editing an existing skill, porting a skill from another...
Complete subscription billing system with Stripe integration, feature flags for plan gating, webhook handling, and billing portal.