Master LLM-as-a-Judge evaluation techniques including direct scoring, pairwise comparison, rubric generation, and bias mitigation. Use when building evaluation systems, comparing model outputs, or...
Use when clarifying fuzzy boundaries, defining quality criteria, teaching by counterexample, preventing common mistakes, setting design guardrails, disambiguating similar concepts, refining...
Guide users through creating Agent Skills for Claude Code. Use when the user wants to create, write, author, or design a new Skill, or needs help with SKILL.md files, frontmatter, or skill structure.
This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise...
Sync GitHub Copilot agents for FF15-inspired OpenSpec workflow (non-MCP version). Team includes Noctis (orchestrator + OpenSpec creator), Iris (issue management), Gladiolus (implementation),...
Expert JavaScript developer specializing in modern ES2023+ features, Node.js runtime environments, and asynchronous programming patterns. This agent excels at writing clean, performant JavaScript...
Create production-quality Android applications following Google's official Android architecture guidance with Kotlin, Jetpack Compose, MVVM architecture, Hilt dependency injection, Room database,...
Apply documentation standards: comment why, not what; keep comments minimal (prefer clear code); maintain a README with a quick start; update docs with breaking changes. Use when writing comments, creating...
Generate Vitest + React Testing Library tests for Dify frontend components, hooks, and utilities. Triggers on testing, spec files, coverage, Vitest, RTL, unit tests, integration tests, or...
Expert data engineer for ETL/ELT pipelines, streaming, data warehousing. Activate on: data pipeline, ETL, ELT, data warehouse, Spark, Kafka, Airflow, dbt, data modeling, star schema, streaming...
Use @effect/platform abstractions for cross-platform file I/O, process spawning, HTTP clients, and terminal operations. Apply this skill when writing code that interacts with the filesystem,...
Comprehensive code review for diffs. Analyzes changed code for security vulnerabilities, anti-patterns, and quality issues. Auto-detects domain (frontend/backend) from file paths.
Use when writing or polishing professional scientific emails, journal cover letters, or responses to reviewers. Invoke when user mentions email to collaborator, cover letter to editor, reviewer...
Generate comprehensive, publication-quality technical manuals with thematic storytelling using multi-agent orchestration. Use when user asks for themed documentation, narrative technical guides,...
Research any topic from the last 30 days on Reddit + X + Web, synthesize findings, and write copy-paste-ready prompts. Use when the user wants recent social/web research on a topic, asks "what are...
Expert PR and media relations guidance for earned media, press coverage, and reputation building. Use when writing press releases, crafting media pitches, developing journalist relationships,...
Create SEO-optimized marketing content with consistent brand voice. Includes brand voice analyzer, SEO optimizer, content frameworks, and social media templates. Use when writing blog posts,...